
ETD

Digital archive of theses defended at the University of Pisa

Thesis etd-01312022-174930


Thesis type
Master's thesis
Author
ACQUAVIA, ANTONIO
URN
etd-01312022-174930
Title
Efficiency-effectiveness Trade-offs in Neural IR Models
Department
INGEGNERIA DELL'INFORMAZIONE
Degree programme
ARTIFICIAL INTELLIGENCE AND DATA ENGINEERING
Supervisors
supervisor Dr. Tonellotto, Nicola
supervisor Dr. Macdonald, Craig
Keywords
  • static pruning
  • information retrieval
  • dense retrieval
Defence session start date
18/02/2022
Availability
Not available for consultation
Release date
18/02/2025
Abstract
Neural ranking models use shallow or deep neural networks to rank search results in response to a query. Adopting neural networks in information retrieval makes it possible to build representations of queries and documents that capture the semantics of their terms. This yields better results than classical information retrieval: each term is no longer represented by a single value but by a dense vector whose values are computed by a neural network. Despite these good results, the approach requires a huge amount of training data and resources to work properly. The challenge is to improve the efficiency of neural information retrieval models by addressing these drawbacks, and to reach a satisfactory trade-off between efficiency and effectiveness. We adapt existing static pruning techniques to work with dense indices, and we propose new approaches that exploit features of the embedding space. Experiments on the MS MARCO passage ranking corpus show that approaches based on inverse document frequency offer the most acceptable trade-off, e.g., a 2-5% loss in nDCG@10 for a 2x reduction in index size.
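As a rough illustration of what IDF-based static pruning of a dense index could look like, the sketch below keeps, for each document, only the embeddings of its highest-IDF tokens. The full text of the thesis is not available, so this is not the author's method. It assumes a multi-vector index with one embedding per token, and the names `compute_idf`, `prune_index` and `keep_fraction` are hypothetical.

```python
import math
from collections import Counter


def compute_idf(docs):
    """Return idf(t) = log(N / df(t)) for every token in the corpus."""
    n_docs = len(docs)
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))  # count each token once per document
    return {t: math.log(n_docs / c) for t, c in df.items()}


def prune_index(docs, embeddings, keep_fraction=0.5):
    """Keep, per document, the embeddings of the highest-IDF token positions.

    Assumption: embeddings[d][i] is the vector stored for token i of
    document d (one vector per token).
    """
    idf = compute_idf(docs)
    pruned = []
    for tokens, vecs in zip(docs, embeddings):
        n_keep = max(1, math.ceil(len(tokens) * keep_fraction))
        # Rank positions by decreasing IDF; ties are broken by position.
        ranked = sorted(range(len(tokens)), key=lambda i: (-idf[tokens[i]], i))
        kept = sorted(ranked[:n_keep])  # restore the original token order
        pruned.append([(tokens[i], vecs[i]) for i in kept])
    return pruned


# Toy corpus with made-up 2-dimensional vectors (illustrative values only).
docs = [
    ["the", "neural", "ranking", "model"],
    ["the", "inverted", "index", "the"],
]
embeddings = [
    [[float(d), float(i)] for i in range(len(tokens))]
    for d, tokens in enumerate(docs)
]

pruned = prune_index(docs, embeddings, keep_fraction=0.5)
vectors_before = sum(len(tokens) for tokens in docs)
vectors_after = sum(len(entries) for entries in pruned)
print(vectors_before, vectors_after)
```

With `keep_fraction=0.5` the toy index stores half as many vectors, and the token "the", which occurs in every document and so has an IDF of zero, is the first to be dropped. Measuring the effect on nDCG@10 would require the actual retrieval pipeline and relevance judgments, which this sketch does not model.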