Thesis etd-02052024-125042
Thesis type
Master's thesis
Author
PAPUCCI, MICHELE
URN
etd-02052024-125042
Title
Label Selection in Text-to-Text Neural Language Models for Classification
Department
COMPUTER SCIENCE
Degree programme
DATA SCIENCE AND BUSINESS INFORMATICS
Supervisors
supervisor Prof. Monreale, Anna
supervisor Prof. Dell'Orletta, Felice
Keywords
- ai
- classification
- artificial intelligence
- natural language processing
- nlp
- text-to-text
- transformers
Graduation session start date
23/02/2024
Availability
Thesis not available for consultation
Abstract
This work presents a set of preliminary experiments aimed at exploring and optimizing the use of reasonably small text-to-text Transformers for classification tasks. The broader objective is to determine whether text-to-text language models that are less costly to train and deploy than the currently popular ones (e.g. ChatGPT or LLaMA) can serve as a unifying framework for solving any Natural Language Processing task. Unlike with larger models, with reasonably sized Transformers we need to find an optimal way of casting each task into a text-to-text form, i.e. providing a textual input and expecting a textual output from the model. This thesis focuses on classification tasks, and in particular on the problem of how to represent class names as the strings that maximize model performance. First, we evaluated whether these smaller models can achieve reasonable performance on classification tasks. Then, we tested the importance of label representation in this setting, finding that it is indeed important for maximizing model performance.
Finally, we presented and evaluated a novel technique, based on attention-attribution explainability methods, for extracting label representations from the training set of a classification task.
File
File name | Size |
---|---|
Thesis not available for consultation. |