ETD

Digital archive of theses defended at the Università di Pisa

Thesis etd-02012016-201609


Thesis type
Master's thesis
Author
RAZZANO, ALESSIA
URN
etd-02012016-201609
Title
Design and Implementation of a Multimedia Information Retrieval Engine for the MSR-Bing Image Retrieval Challenge
Department
INFORMATION ENGINEERING
Degree programme
COMPUTER ENGINEERING
Supervisors
supervisor Prof. Amato, Giuseppe
supervisor Prof. Gennaro, Claudio
supervisor Prof. Marcelloni, Francesco
Keywords
  • content-based image retrieval
  • text-based image retrieval
  • feature extraction
  • Caffe framework
  • Apache Lucene
  • similarity function
Defence session start date
26/02/2016
Availability
Full
Abstract
The aim of this work is to design and implement a multimedia information retrieval engine for the MSR-Bing Image Retrieval Challenge organized by Microsoft. The challenge is based on the Clickture dataset, generated from the click logs of Bing image search. The system has to predict the relevance of images with respect to text queries by assigning to each (image, text query) pair a score that indicates how well the text query describes the image content. We attempt to combine textual and visual information by performing both text-based and content-based image retrieval. The framework used to extract visual features is Caffe, an efficient implementation of deep Convolutional Neural Networks (CNNs).
The decision is made using a knowledge base of triplets, each consisting of a text query, an image, and the number of times users clicked on that image in response to the text query. Two strategies were proposed. In the first, we analyse the intersection of the triplets retrieved using the textual query and those retrieved using the image itself; in the second, we analyse their union. To address efficiency issues, we proposed an approach that indexes visual features using Apache Lucene, a text search engine library written entirely in Java and suitable for nearly any application requiring full-text search. To this end, we converted image features into a textual form so that Lucene can store them in an inverted index. In this way we were able to set up a robust retrieval system that combines full-text search with content-based image retrieval.
To show that our search for textually and visually similar images works in practice, we implemented a small web-based prototype. We evaluated different versions of the system on the development set in order to compare the similarity measures used to match images and to identify the best sorting strategy. Finally, our approaches were compared with those implemented by the winners of previous editions of the challenge.
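The abstract does not say how image features were converted into text, so the following is only a minimal Python sketch of one possible encoding, not the thesis's actual method. It assumes non-negative feature values (as produced after a ReLU layer) and repeats a synthetic term for each feature dimension in proportion to its quantized value, so that a term-frequency-based text index can approximate a dot product between feature vectors. The names `features_to_surrogate_text`, `term_frequency_similarity`, the `quantization` parameter, and the `f<i>` term prefix are all hypothetical.

```python
from collections import Counter


def features_to_surrogate_text(features, quantization=10, prefix="f"):
    """Encode a feature vector as a whitespace-separated string of terms.

    Assumption: dimension i with value v contributes the term "<prefix><i>"
    repeated floor(v * quantization) times; negative values contribute nothing.
    """
    terms = []
    for i, value in enumerate(features):
        repeats = max(0, int(value * quantization))
        terms.extend([f"{prefix}{i}"] * repeats)
    return " ".join(terms)


def term_frequency_similarity(text_a, text_b):
    """Dot product of the term-frequency vectors of two surrogate texts."""
    tf_a = Counter(text_a.split())
    tf_b = Counter(text_b.split())
    return sum(count * tf_b[term] for term, count in tf_a.items())


# Two toy feature vectors (hypothetical values, not taken from the thesis).
doc_text = features_to_surrogate_text([0.5, 0.0, 0.25], quantization=4)
query_text = features_to_surrogate_text([0.25, 0.5, 0.5], quantization=4)
score = term_frequency_similarity(doc_text, query_text)
```

In a full-text engine the surrogate string would be indexed like any other text field, and the engine's own scoring function would take the place of `term_frequency_similarity`; the quantization factor trades index size against how closely the term counts follow the original feature values.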
File