Thesis etd-11072024-114738
Thesis type
Master's thesis
Author
PASQUALETTI, MATTEO
URN
etd-11072024-114738
Title
Investigating the surprising benefits of noise in Retrieval-Augmented Generation systems
Department
INGEGNERIA DELL'INFORMAZIONE
Degree programme
ARTIFICIAL INTELLIGENCE AND DATA ENGINEERING
Supervisors
supervisor Tonellotto, Nicola
Keywords
- attention modification
- attention weights
- large language model
- power of noise
- retrieval augmented generation
Graduation session start date
26/11/2024
Availability
Full
Abstract
This thesis aims to understand how introducing noise influences text generation in Retrieval-Augmented Generation (RAG) systems. By examining the effects of adding unrelated documents or random text alongside relevant sources, it seeks to uncover the mechanisms behind the resulting improvement in question-answering accuracy, a phenomenon known as "The Power of Noise".
The analysis focuses on how noise interacts with the generation component of the RAG system. The findings are expected to inform the design of ad-hoc prompts and to deepen the understanding of how Large Language Models (LLMs) reason when generating answers. A central anticipated result is the identification of the "Lost in the Middle" effect, in which non-distracting, unrelated information surprisingly enhances the system's ability to focus on relevant data. However, a definitive explanation for this phenomenon has yet to be determined.
The results of this research are expected to have significant implications for applications that rely on up-to-date, accurate information retrieval and generation.
File
File name | Size |
---|---|
tesi_pasqualetti.pdf | 2.82 MB |