ETD

Digital archive of theses defended at the University of Pisa

Thesis etd-05062022-162420


Thesis type
PhD thesis
Author
MIASCHI, ALESSIO
URN
etd-05062022-162420
Title
Tracking Linguistic Abilities in Neural Language Models
Scientific disciplinary sector
INF/01
Course of study
COMPUTER SCIENCE
Supervisors
tutor Dr. Dell'Orletta, Felice
supervisor Prof. Monreale, Anna
Keywords
  • NLP
  • machine learning
  • neural language models
  • interpretability
  • probing tasks
Defense session start date
24/05/2022
Availability
Full
Abstract
In the last few years, the analysis of the inner workings of state-of-the-art Neural Language Models (NLMs) has become one of the most active lines of research in Natural Language Processing (NLP). Several techniques have been devised to obtain meaningful explanations and to understand how these models are able to capture semantic and linguistic knowledge. The goal of this thesis is to investigate whether, by exploiting NLP methods for studying human linguistic competence and, specifically, the process of written language evolution, it is possible to understand the behaviour of state-of-the-art NLMs.
First, we present an NLP-based stylometric approach for tracking the evolution of written language competence in L1 and L2 learners, using a wide set of linguistically motivated features that capture stylistic aspects of a text. Then, relying on the same set of linguistic features, we propose different approaches aimed at investigating the linguistic knowledge implicitly learned by NLMs. Finally, we present a study investigating the robustness of one of the most prominent NLMs, i.e. BERT, when dealing with different types of errors extracted from authentic texts written by L1 Italian learners.