ETD

Digital archive of theses defended at the Università di Pisa

Thesis etd-09122025-214441


Thesis type
Master's thesis
Author
DI VITO, TOMMASO
URN
etd-09122025-214441
Title
Exploiting Learned Sparse Representations in Retrieval-augmented Generation with FlashRAG
Department
COMPUTER SCIENCE
Degree program
COMPUTER SCIENCE
Supervisors
supervisor Dr. Nardini, Franco Maria
supervisor Dr. Rulli, Cosimo
supervisor Prof. Venturini, Rossano
Keywords
  • FlashRAG
  • Information Retrieval
  • RAG
  • Retrieval systems
  • Retrieval-augmented Generation
  • SEISMIC
  • Sparse Representations
Defense session start date
17/10/2025
Availability
Not available for consultation
Release date
17/10/2028
Abstract
Large Language Models (LLMs) have achieved remarkable progress in natural language processing but remain prone to factual inaccuracies and hallucinations (false answers), particularly when dealing with domain-specific knowledge or rapidly changing information. Retrieval-Augmented Generation (RAG) addresses these issues by combining information retrieval with generative modeling, grounding outputs in external evidence. Examples of RAG systems include ChatGPT, Gemini, and Claude. This thesis investigates the integration of sparse neural retrieval into RAG pipelines, focusing on the SPLADE retriever and the SEISMIC inverted-index framework for efficient and scalable approximate retrieval. Compared to dense and lexical methods, this approach improves retrieval efficiency and scalability while maintaining accuracy. The study also analyzes different pipeline configurations, addressing challenges such as context length limitations and irrelevant document injection. Furthermore, two RAG libraries, Bergen and FlashRAG, are extended to support SPLADE with SEISMIC; Bergen reveals architectural limitations, while FlashRAG is further enhanced with new evaluation metrics, modularized components, and distributed execution features. Experiments on MS-MARCO and Natural Questions demonstrate that sparse retrieval achieves competitive or superior performance with lower latency under specific conditions and that mitigation techniques such as reranking, noise injection, and sliding-window context handling significantly influence end-to-end RAG quality. Overall, this work provides scalable solutions, comprehensive evaluation methodologies, and novel insights to guide future research and applications of RAG systems.
File