
ETD

Digital archive of theses discussed at the University of Pisa

 

Thesis etd-01232018-121811


Thesis type
Master's thesis
Author
MESSINA, NICOLA
URN
etd-01232018-121811
Thesis title
Design and Testing of a Neural Network for Relational Content-Based Image Retrieval
Department
INFORMATION ENGINEERING
Course of study
COMPUTER ENGINEERING
Supervisors
supervisor Dr. Falchi, Fabrizio
supervisor Dr. Gennaro, Claudio
supervisor Dr. Amato, Giuseppe
Keywords
  • Content Based Image Retrieval
  • Deep Learning
  • Relational Reasoning
Graduation session start date
23/02/2018
Availability
Full
Summary
Humans often perceive the physical world as sets of relations between objects, whatever their nature (visual, tactile, auditory).

Spatial relationships are particularly important, since they relate objects to one another in the three-dimensional world in which we are immersed.

In this work we will study deep learning architectures that can mimic this spatial awareness. In particular, we will investigate visual deep learning models able to understand how the objects in an image are arranged.

In the literature, this problem is often addressed through Visual Question Answering (VQA): a question about the arrangement of objects in a particular image is asked, and the network should answer it correctly.
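
As an illustration of the kind of question VQA poses, here is a minimal Python sketch. The scene format, the function name `shape_of_closest`, and the question itself are assumptions made for this example and are not taken from the thesis. The answer is computed from ground-truth object positions, whereas a VQA network would have to produce it from the pixels and the question text alone.

```python
import math

# Hypothetical toy scene: each object has a colour, a shape and a 2D centre.
scene = [
    {"color": "red", "shape": "circle", "pos": (10, 10)},
    {"color": "green", "shape": "square", "pos": (12, 14)},
    {"color": "blue", "shape": "circle", "pos": (60, 50)},
]


def shape_of_closest(scene, color):
    """Answer 'What is the shape of the object closest to the <color> one?'."""
    # Locate the reference object named in the question.
    ref = next(o for o in scene if o["color"] == color)
    # Compare it against every other object in the scene.
    others = [o for o in scene if o is not ref]
    nearest = min(others, key=lambda o: math.dist(o["pos"], ref["pos"]))
    return nearest["shape"]


answer = shape_of_closest(scene, "red")
```

The question is relational because it cannot be answered by looking at any single object: the reference object must be compared with all the others.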

This thesis aims to employ one of the latest proposals in the VQA field, the Relation Networks (RN) proposed by the DeepMind team, to introduce Relational Content-Based Image Retrieval (R-CBIR).

Current CBIR systems do not take into account the relations between objects inside images. We will analyze and modify the Relation Network architecture in order to extract visual relational features, so that relational indexing becomes possible. Statistics will be collected and analyzed to measure the accuracy gain that relational features achieve over standard ones.
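
The idea of indexing images by relational features can be sketched as follows. This is a toy illustration under stated assumptions, not the architecture developed in the thesis. The pair function `g` is a fixed hand-written stand-in for the learned network, objects are plain 2D points, and the names `relational_feature`, `cosine` and `rank` are hypothetical. The summary also does not say which layer the features are taken from; here the sum of the pair outputs is used as the descriptor.

```python
import math
from itertools import permutations


def g(o_i, o_j):
    """Toy pair function (assumption): stands in for a learned network."""
    diffs = [abs(a - b) for a, b in zip(o_i, o_j)]
    prods = [a * b for a, b in zip(o_i, o_j)]
    return diffs + prods


def relational_feature(objects):
    """Sum g over all ordered pairs of distinct objects."""
    feat = [0.0] * (2 * len(objects[0]))
    for o_i, o_j in permutations(objects, 2):
        for k, v in enumerate(g(o_i, o_j)):
            feat[k] += v
    return feat


def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def rank(query_objects, database):
    """Return database image names, most similar relational feature first."""
    q = relational_feature(query_objects)
    scored = [(cosine(q, relational_feature(objs)), name)
              for name, objs in database.items()]
    return [name for _, name in sorted(scored, reverse=True)]


# Hypothetical database: image name -> list of object positions.
database = {
    "img_a": [(0.1, 0.1), (0.2, 0.1), (0.9, 0.9)],
    "img_b": [(0.1, 0.9), (0.9, 0.1), (0.5, 0.5)],
}
# Same arrangement as img_a, with the objects listed in a different order.
query = [(0.9, 0.9), (0.1, 0.1), (0.2, 0.1)]
ranking = rank(query, database)
```

Because the descriptor is a sum over pairs, it does not depend on the order in which objects are listed, so the query above ranks `img_a` first.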

We will use CLEVR, a state-of-the-art synthetic dataset, and we will also consider one of its variants, Sort-of-CLEVR, built to be simpler and easier to train on and debug.
The code will be written in Python using the PyTorch framework with CUDA acceleration, and it will be optimized to run on a multi-GPU system.