ETD

Digital archive of theses defended at the Università di Pisa

Thesis etd-05062022-162420

Thesis type

PhD thesis

URN

etd-05062022-162420

Title

Tracking Linguistic Abilities in Neural Language Models

Scientific disciplinary sector

INF/01 - COMPUTER SCIENCE

Degree programme

COMPUTER SCIENCE

Keywords

interpretability
machine learning
neural language models
NLP
probing tasks

Defence session start date

24/05/2022

Availability

Full

Abstract (English)

In the last few years, the analysis of the inner workings of state-of-the-art Neural Language Models (NLMs) has become one of the most actively pursued lines of research in Natural Language Processing (NLP). Several techniques have been devised to obtain meaningful explanations and to understand how these models are able to capture semantic and linguistic knowledge. The goal of this thesis is to investigate whether NLP methods for studying human linguistic competence, and specifically the process of written language evolution, can be exploited to understand the behaviour of state-of-the-art NLMs.
First, we present an NLP-based stylometric approach for tracking the evolution of written language competence in L1 and L2 learners, using a wide set of linguistically motivated features that capture stylistic aspects of a text. Then, relying on the same set of linguistic features, we propose different approaches aimed at investigating the linguistic knowledge implicitly learned by NLMs. Finally, we present a study investigating the robustness of one of the most prominent NLMs, BERT, when dealing with different types of errors extracted from authentic texts written by L1 Italian learners.

Files

File name	Size
PhD_Thes...aschi.pdf	19.52 Mb
Relazion...aschi.pdf	55.11 Kb