
ETD

Digital archive of theses defended at the University of Pisa

Thesis etd-02012023-182023


Thesis type
Master's thesis
Author
ANNISA, DININTA
URN
etd-02012023-182023
Title
Cross-Task Generalization via In-Context Tuning
Department
INFORMATICA
Degree programme
INFORMATICA
Supervisors
supervisor Attardi, Giuseppe
Keywords
  • cross-task generalization
  • in-context tuning
  • language models
  • natural language processing
Date of defense
24/02/2023
Availability
Thesis not available for consultation
Abstract
Cross-task generalization is the ability of an intelligent system to generalize to unseen tasks, allowing it to solve tasks it was not trained for. In Natural Language Processing, there is a growing interest in achieving this level of generalization using pre-trained language models.

In this thesis, we study various techniques for training cross-task models and approaches to using language models through an extensive literature review. Specifically, we investigate a new training objective in which the model predicts an instance given a short instruction and a few examples, a method known in the literature as in-context tuning. We fine-tune T5-base and T5-large with this technique and evaluate on CrossFit, a repository of few-shot NLP tasks spanning classification, question answering, content generation, and other task types. Our experiments show that in-context tuning enables cross-task generalization and outperforms standard fine-tuned language models in a few-shot setting. Our model also outperforms raw GPT-J on 7 of 8 classification tasks and 2 of 3 question answering tasks with lower variance, suggesting that it is less sensitive to the choice of in-context examples.
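To make the training objective concrete, the sketch below shows how an in-context tuning source/target pair could be assembled: a task instruction is concatenated with a few demonstration pairs and the query input, and the model is trained to generate the query's answer. The template, separator, and example task are illustrative assumptions, not the exact format used in the thesis.

```python
def build_in_context_example(instruction, support_examples, query_input,
                             separator=" [SEP] "):
    """Concatenate an instruction, a few demonstration (input, output)
    pairs, and the query input into one source string; the model's
    target is the query's gold answer (not built here)."""
    parts = [instruction]
    for x, y in support_examples:
        parts.append(f"input: {x} output: {y}")
    # The query ends with an open "output:" slot for the model to fill.
    parts.append(f"input: {query_input} output:")
    return separator.join(parts)

# Hypothetical few-shot sentiment task for illustration.
instruction = "Classify the sentiment of the review as positive or negative."
support = [("Great film, loved it.", "positive"),
           ("A dull, tedious plot.", "negative")]
source = build_in_context_example(instruction, support, "An instant classic.")
# A seq2seq model such as T5 would then be fine-tuned to generate the
# target "positive" conditioned on `source`.
```

At evaluation time the same template is filled with examples from an unseen task, which is what allows a single fine-tuned model to generalize across tasks.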