ETD system

Electronic theses and dissertations repository

 

Thesis etd-02072017-161336


Thesis type
Master's thesis
Author
DE ROSA, PIETRO
URN
etd-02072017-161336
Title
Design and implementation of a distributed system for content-based image retrieval
Department
INGEGNERIA DELL'INFORMAZIONE
Degree programme
COMPUTER ENGINEERING
Supervisors
supervisor Prof. Gennaro, Claudio
supervisor Prof. Amato, Giuseppe
supervisor Prof. Falchi, Fabrizio
Keywords
  • deep features
  • elasticsearch
  • cbir system
Session start date
24/02/2017
Availability
Full
Abstract
The aim of this work is to design and implement a distributed system for content-based image retrieval on very large image databases. The system is built on a standard full-text search engine: the open-source software Elasticsearch, which is in turn built on top of Apache Lucene, a widely used full-text search library written in Java.
To allow the full-text search engine to perform similarity search, we extracted deep convolutional neural network features from the images of the dataset and encoded them as standard text.
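The abstract does not spell out the exact text encoding, so the following is only a minimal sketch of one common way to do it, under stated assumptions: each feature component becomes a token that is repeated in proportion to its quantized value, so that the term frequencies seen by the search engine approximate the component magnitudes. The function name `encode_features_as_text`, the `scale` parameter, and the `f<i>` token format are hypothetical and are not taken from the thesis.

```python
def encode_features_as_text(features, scale=10, prefix="f"):
    """Turn a feature vector into a space-separated surrogate text.

    Assumption (not from the thesis): component i with value v is emitted
    as the token "<prefix><i>" repeated floor(v * scale) times.
    """
    tokens = []
    for index, value in enumerate(features):
        # Negative values are clipped to zero, since a token cannot be
        # repeated a negative number of times.
        repeats = int(max(value, 0.0) * scale)
        tokens.extend([f"{prefix}{index}"] * repeats)
    return " ".join(tokens)


if __name__ == "__main__":
    # A toy 3-dimensional "deep feature" vector.
    print(encode_features_as_text([0.5, 0.0, 0.25], scale=4))
```

With an encoding of this kind, two images whose features share large components also share many repeated tokens, which a term-frequency-based text ranking can exploit as a rough proxy for vector similarity.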
Given the distributed nature of Elasticsearch, the index can be split and spread among several nodes. This makes it easy to parallelize the search, thus leading to a significant performance enhancement.
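The idea of splitting the index and searching the parts in parallel can be illustrated with a toy scatter-gather sketch. This is not Elasticsearch's API, which performs these steps internally; the names `search_shard` and `search_all`, the in-memory dictionaries standing in for shards, and the dot-product score are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
import heapq


def search_shard(shard, query, k):
    """Score every document in one shard; return its local top-k (score, doc_id)."""
    scored = (
        (sum(q * x for q, x in zip(query, vector)), doc_id)
        for doc_id, vector in shard.items()
    )
    return heapq.nlargest(k, scored)


def search_all(shards, query, k):
    """Query all shards concurrently, then merge the partial results."""
    with ThreadPoolExecutor(max_workers=max(1, len(shards))) as pool:
        partial_results = pool.map(
            lambda shard: search_shard(shard, query, k), shards
        )
        candidates = [hit for hits in partial_results for hit in hits]
    # The global top-k is always contained in the union of the local top-k lists.
    return heapq.nlargest(k, candidates)


if __name__ == "__main__":
    shards = [
        {"a": [1.0, 0.0], "b": [0.0, 1.0]},  # first shard (hypothetical data)
        {"c": [1.0, 1.0]},                   # second shard
    ]
    print(search_all(shards, [1.0, 0.5], k=2))
```

Each shard only scans its own slice of the collection, so the per-node work shrinks as shards are added, while the final merge touches at most k results per shard.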
All the experiments were conducted on the Yahoo Flickr Creative Commons 100M dataset, which is publicly available and contains about 100 million tagged images. A web-based GUI was designed to let the user perform both textual and visual similarity search on the image dataset.