ETD

Digital archive of theses discussed at the University of Pisa

 

Thesis etd-05062022-162420


Thesis type
PhD thesis
Author
MIASCHI, ALESSIO
URN
etd-05062022-162420
Thesis title
Tracking Linguistic Abilities in Neural Language Models
Academic discipline
INF/01
Course of study
Computer Science
Supervisors
tutor Dr. Dell'Orletta, Felice
supervisor Prof. Monreale, Anna
Keywords
  • NLP
  • machine learning
  • neural language models
  • interpretability
  • probing tasks
Graduation session start date
24/05/2022
Availability
Full
Summary
In the last few years, the analysis of the inner workings of state-of-the-art Neural Language Models (NLMs) has become one of the most actively pursued lines of research in Natural Language Processing (NLP). Several techniques have been devised to obtain meaningful explanations and to understand how these models are able to capture semantic and linguistic knowledge. The goal of this thesis is to investigate whether it is possible to understand the behaviour of state-of-the-art NLMs by exploiting NLP methods developed for studying human linguistic competence and, specifically, the process of written language evolution.
First, we present an NLP-based stylometric approach for tracking the evolution of written language competence in L1 and L2 learners using a wide set of linguistically motivated features capturing stylistic aspects of a text. Then, relying on the same set of linguistic features, we propose different approaches aimed at investigating the linguistic knowledge implicitly learned by NLMs. Finally, we present a study investigating the robustness of one of the most prominent NLMs, i.e. BERT, when dealing with different types of errors extracted from authentic texts written by L1 Italian learners.