logo SBA

ETD

Archivio digitale delle tesi discusse presso l’Università di Pisa

Tesi etd-11202023-200029


Tipo di tesi
Tesi di laurea magistrale
Autore
GALIMZHANOVA, ELNARA
URN
etd-11202023-200029
Titolo
Leveraging Large Language Models for Conversational Search
Dipartimento
INFORMATICA
Corso di studi
DATA SCIENCE AND BUSINESS INFORMATICS
Relatori
relatore Passaro, Lucia C.
Parole chiave
  • Conversational Search
  • gpt
  • Large Language Model
Data inizio appello
01/12/2023
Consultabilità
Non consultabile
Data di rilascio
01/12/2093
Riassunto
In this study, we investigate the capabilities of instructed Large Language Models (LLMs) to improve conversational search. Our main focus is on the practice of rephrasing user utterances within a conversational context, by integrating conversational settings into the input. We aim to identify the prompts that generate the most informative rewritten queries, ultimately leading to improved retrieval performance. To achieve this, we rewrite user prompts using the gpt-3.5-turbo model, employing various prompt formulations. To assess various utterance rewritings and build our information retrieval pipeline, we employ a two-stage process. Initially, we utilize the Dirichlet Prior Hierarchical (DPH) model to index the document collection. Subsequently, we utilize the MonoT5 model to re-rank the retrieved documents. Our experimental methodology is applied to publicly available TREC CAST datasets that comprise two distinct versions: one featuring automatically rewritten utterances and the other with manually rewritten utterances. It is important to highlight that, despite our efforts to incorporate conversational context into the input, we did not observe a significant increase in performance results when compared to manually rewritten utterances. Nevertheless, our experiments consistently demonstrate that, in most instances, our proposed prompting techniques outperform the baseline approaches.
File