Thesis etd-07052024-150551

Thesis type

Master's thesis

Author

PACINI, GIACOMO

Email address

g.pacini14@studenti.unipi.it, giacomopacini98@gmail.com

URN

etd-07052024-150551

Title

Advanced Query Suggestion for Interactive Text-to-Image Retrieval: a novel task and benchmark

Department

INGEGNERIA DELL'INFORMAZIONE

Degree program

ARTIFICIAL INTELLIGENCE AND DATA ENGINEERING

Supervisors

supervisor Prof. Tonellotto, Nicola
supervisor Dr. Falchi, Fabrizio
supervisor Dr. Carrara, Fabio
supervisor Dr. Messina, Nicola

Keywords

cross modal retrieval
information retrieval
interactive image search
query expansion
query suggestion
retrieval benchmark
text to image retrieval

Graduation session start date

26/07/2024

Availability

Full

Abstract

The increasing amount of multimedia elements in visual collections fosters the development of novel and customizable interactive image retrieval systems. While current literature focuses on improving and measuring the ability of a system to retrieve the most relevant items given a natural language query, few works have tried to make these systems more interactive to enhance the browsing experience.
The objective of this master's thesis is to introduce and define a novel task in the field of cross-modal retrieval, termed "Visual Guided Query Suggestion" (VGQS). This task aims to enhance user experience in cross-modal retrieval systems by generating expanded query suggestions based on the initial search results.
Specifically, VGQS systems automatically suggest the smallest textual modifications needed to explore visually consistent subsets of the collection.
To facilitate the evaluation and development of methods addressing VGQS, we present a comprehensive benchmark dataset. This dataset consists of initial queries, grouped result sets, and human-defined expanded queries for each group.
We establish dedicated metrics to rigorously evaluate the performance of various methods on this task. These metrics are designed to measure the representativeness and specificity of the suggested expanded queries, as well as their similarity to the original query. Baseline methods, adapted from related fields such as image captioning and query expansion, are applied to this task to provide reference performance scores.
The thesis details the creation of the benchmark dataset, the definition of the evaluation metrics, and the adaptation of baseline methods. Experimental results are presented, showcasing the performance of these baseline methods and highlighting the potential and challenges of the VGQS task.
This work lays the foundation for future research in enhancing cross-modal retrieval systems through intelligent query expansion strategies.
When integrated into interactive multimedia browsing software, VGQS aims to increase the software's interactivity and improve the overall user experience.

File

File name	Size
MasterTh...acomo.pdf	37.09 MB

ETD

Digital archive of theses defended at the Università di Pisa
