Thesis etd-07052024-150551
Thesis type
Master's thesis
Author
PACINI, GIACOMO
Email address
g.pacini14@studenti.unipi.it, giacomopacini98@gmail.com
URN
etd-07052024-150551
Title
Advanced Query Suggestion for Interactive Text-to-Image Retrieval: a novel task and benchmark
Department
INGEGNERIA DELL'INFORMAZIONE
Degree programme
ARTIFICIAL INTELLIGENCE AND DATA ENGINEERING
Supervisors
Supervisor Prof. Tonellotto, Nicola
Supervisor Dott. Falchi, Fabrizio
Supervisor Dott. Carrara, Fabio
Supervisor Dott. Messina, Nicola
Keywords
- cross modal retrieval
- information retrieval
- interactive image search
- query expansion
- query suggestion
- retrieval benchmark
- text to image retrieval
Defence session date
26/07/2024
Availability
Full
Abstract
The growing volume of multimedia items in visual collections fosters the development of novel, customizable interactive image retrieval systems. While the current literature focuses on improving and measuring a system's ability to retrieve the most relevant items for a natural-language query, few works have tried to make these systems more interactive to enhance the browsing experience.
The objective of this master's thesis is to introduce and define a novel task in the field of cross-modal retrieval, termed "Visual Guided Query Suggestion" (VGQS). This task aims to enhance the user experience in cross-modal retrieval systems by generating expanded query suggestions based on the initial search results.
Specifically, VGQS systems automatically suggest the smallest textual modifications needed to explore visually consistent subsets of the collection.
To facilitate the evaluation and development of methods addressing VGQS, we present a comprehensive benchmark dataset. This dataset consists of initial queries, grouped result sets, and human-defined expanded queries for each group.
We establish dedicated metrics to rigorously evaluate the performance of various methods on this task. These metrics are designed to measure the representativeness and specificity of the suggested expanded queries, as well as their similarity to the original query. Baseline methods, adapted from related fields such as image captioning and query expansion, are applied to this task to provide reference performance scores.
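As a purely illustrative sketch (not the thesis's actual metric definitions), the three evaluation dimensions above can be pictured with cosine similarities in a shared text-image embedding space, such as the one produced by a CLIP-like encoder. All function names and the embedding setup here are hypothetical assumptions.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def representativeness(suggestion_emb, group_embs):
    """Mean similarity between the suggested query and the images of its result group."""
    return float(np.mean([cosine(suggestion_emb, g) for g in group_embs]))

def specificity(suggestion_emb, group_embs, other_embs):
    """How much closer the suggestion is to its own group than to the other groups' images."""
    return representativeness(suggestion_emb, group_embs) - \
           float(np.mean([cosine(suggestion_emb, o) for o in other_embs]))

def query_consistency(suggestion_emb, original_query_emb):
    """Similarity between the expanded suggestion and the original query."""
    return cosine(suggestion_emb, original_query_emb)

# Toy example with synthetic embeddings standing in for encoder outputs.
rng = np.random.default_rng(0)
q = rng.normal(size=64)                                    # original query embedding
s = q + 0.1 * rng.normal(size=64)                          # suggested expanded query
group = [s + 0.2 * rng.normal(size=64) for _ in range(5)]  # images of the target group
others = [rng.normal(size=64) for _ in range(5)]           # images of other groups

rep = representativeness(s, group)
spec = specificity(s, group, others)
cons = query_consistency(s, q)
```

A good suggestion would score high on all three: close to its own group, far from the other groups, and a small modification of the original query.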
The thesis details the creation of the benchmark dataset, the definition of the evaluation metrics, and the adaptation of baseline methods. Experimental results are presented, showcasing the performance of these baseline methods and highlighting the potential and challenges of the VGQS task.
This work lays the foundation for future research in enhancing cross-modal retrieval systems through intelligent query expansion strategies.
VGQS, integrated into interactive multimedia browsing software, aims to increase its interactivity and improve the overall user experience.
File

| File name | Size |
|---|---|
| MasterTh...acomo.pdf | 37.09 Mb |