
ETD

Digital archive of theses defended at the University of Pisa

Thesis etd-02012023-182023


Thesis type
Master's thesis
Author
ANNISA, DININTA
URN
etd-02012023-182023
Title
Cross-Task Generalization via In-Context Tuning
Department
INFORMATICA
Degree programme
INFORMATICA
Supervisors
supervisor Attardi, Giuseppe
Keywords
  • cross-task generalization
  • in-context tuning
  • language models
  • natural language processing
Date of defense
24/02/2023
Availability
Thesis not available for consultation
Abstract
Cross-task generalization is the ability of an intelligent system to generalize to unseen tasks, allowing it to solve tasks it was not trained for. In Natural Language Processing, there is a growing interest in achieving this level of generalization using pre-trained language models.

In this thesis, we study various techniques for training cross-task models and approaches to using language models through an extensive literature review. Specifically, we investigate a new training objective in which the model predicts an instance given a short instruction and a few examples, a method known in the literature as in-context tuning. We fine-tune T5-base and T5-large with this technique and evaluate on CrossFit, a repository of few-shot NLP tasks spanning classification, question answering, content generation, and other task types. Our experiments show that in-context tuning enables cross-task generalization and outperforms standard fine-tuned language models in a few-shot setting. Our model also outperforms raw GPT-J on 7 of 8 classification tasks and 2 of 3 question answering tasks with lower variance, suggesting that it is less sensitive to the choice of in-context examples.
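To make the training objective concrete, the sketch below shows how an in-context tuning source/target pair could be assembled: a task instruction is concatenated with a few demonstration pairs and the query input, and the model is trained to generate the query's answer. The template, separator, and example task are illustrative assumptions, not the exact format used in the thesis.

```python
def build_in_context_example(instruction, support_examples, query_input,
                             separator=" [SEP] "):
    """Concatenate an instruction, a few demonstration (input, output)
    pairs, and the query input into one source string; the model's
    target is the query's gold answer (not built here)."""
    parts = [instruction]
    for x, y in support_examples:
        parts.append(f"input: {x} output: {y}")
    # The query ends with an open "output:" slot for the model to fill.
    parts.append(f"input: {query_input} output:")
    return separator.join(parts)

# Hypothetical few-shot sentiment task for illustration.
instruction = "Classify the sentiment of the review as positive or negative."
support = [("Great film, loved it.", "positive"),
           ("A dull, tedious plot.", "negative")]
source = build_in_context_example(instruction, support, "An instant classic.")
# A seq2seq model such as T5 would then be fine-tuned to generate the
# target "positive" conditioned on `source`.
```

At evaluation time the same template is filled with examples from an unseen task, which is what allows a single fine-tuned model to generalize across tasks.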