
ETD

Digital archive of the theses defended at the Università di Pisa

Thesis etd-01232018-121811


Thesis type
Master's degree thesis
Author
MESSINA, NICOLA
URN
etd-01232018-121811
Title
Design and Testing of a Neural Network for Relational Content-Based Image Retrieval
Department
INGEGNERIA DELL'INFORMAZIONE
Degree programme
COMPUTER ENGINEERING
Supervisors
supervisor Dr. Falchi, Fabrizio
supervisor Dr. Gennaro, Claudio
supervisor Dr. Amato, Giuseppe
Keywords
  • Content Based Image Retrieval
  • Deep Learning
  • Relational Reasoning
Defense session start date
23/02/2018
Availability
Full
Abstract
Humans often perceive the physical world as sets of relations between objects, whatever their nature (visual, tactile, auditory).

Spatial relationships are particularly important, since they relate objects to one another in the three-dimensional world in which we are immersed.

In this work we will study deep learning architectures that are able to mimic this spatial awareness. We will investigate visual deep learning models that can understand how the objects in an image are arranged.

In the literature, this problem is often addressed through VQA (Visual Question Answering): a question about the arrangement of objects in a particular image is posed, and the network should be able to answer it correctly.

This thesis aims to employ one of the latest proposals in the VQA field, the Relational Networks proposed by the DeepMind team, to introduce Relational Content-Based Image Retrieval (R-CBIR).

Current CBIR systems do not take into account the relations between objects inside images. We will analyze and modify the Relational Network architecture in order to extract visual relational features, so that relational indexing becomes possible. Statistics will be collected and analyzed in order to measure the accuracy gain that relational features achieve over standard ones.
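
To make the idea of relational features more concrete, the following is a minimal sketch and not the architecture developed in the thesis. It assumes that each image has already been reduced to a list of object vectors, that a pairwise function g is applied to every ordered pair of objects, and that the outputs are summed into a single image descriptor, which is the aggregation scheme commonly associated with Relational Networks. The question conditioning used in VQA is left out, g is a random untrained layer, and all function names (make_random_g, relational_feature, cosine_similarity, rank_by_similarity) are hypothetical.

```python
import math
import random


def make_random_g(pair_dim, out_dim, seed=0):
    """Build an untrained dense layer with ReLU; a stand-in for a learned g."""
    rng = random.Random(seed)
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(pair_dim)]
               for _ in range(out_dim)]

    def g(pair):
        return [max(0.0, sum(w * v for w, v in zip(row, pair)))
                for row in weights]

    return g


def relational_feature(objects, g):
    """Sum g(o_i, o_j) over all ordered pairs of objects in one image."""
    if not objects:
        raise ValueError("at least one object vector is required")
    total = None
    for o_i in objects:
        for o_j in objects:
            # Each pair is represented by concatenating the two object vectors.
            out = g(list(o_i) + list(o_j))
            total = out if total is None else [a + b for a, b in zip(total, out)]
    return total


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if one is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_by_similarity(query, database):
    """Return database indices ordered from most to least similar to the query."""
    return sorted(range(len(database)),
                  key=lambda i: cosine_similarity(query, database[i]),
                  reverse=True)


if __name__ == "__main__":
    # Toy "images": each is a list of 2-dimensional object vectors (made-up data).
    images = [
        [[1.0, 0.0], [0.0, 2.0]],
        [[0.5, 0.5], [1.0, 1.0], [2.0, 0.0]],
        [[0.0, 1.0], [0.0, 3.0]],
    ]
    g = make_random_g(pair_dim=4, out_dim=8)
    index = [relational_feature(objs, g) for objs in images]
    print(rank_by_similarity(index[0], index))
```

Because the sum runs over all ordered pairs, the descriptor does not depend on the order in which the objects are listed, which is what makes it usable as a fixed-length index entry for similarity search.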

We will rely on a state-of-the-art synthetic dataset, CLEVR, and we will also consider one of its variants, Sort-of-CLEVR, which was built to be simpler and to make training and debugging easier.
The code will be written in Python using the PyTorch framework with CUDA acceleration, and it will be optimized to run on a multi-GPU system.
File