Thesis etd-03302025-231847

Thesis type

Master's thesis

Author

BERGAMI, GIOVANNI

URN

etd-03302025-231847

Title

Design of method to explain transformers based on similarity-differences and uniqueness for long text classification

Department

INFORMATION ENGINEERING

Degree programme

ARTIFICIAL INTELLIGENCE AND DATA ENGINEERING

Supervisors

supervisor Prof. Cimino, Mario Giovanni Cosimo Antonio
supervisor Dr. Parola, Marco
supervisor Prof. Sabet Jahromi, Mohammad Naser

Keywords

ai
artificial intelligence
bert
deep learning
explainability
interpretability
legal dataset
long text
sidu
text classification
transformers
xai

Defence session start date

14/04/2025

Availability

Not available for consultation

Release date

14/04/2095

Abstract

Explainable Artificial Intelligence (XAI) plays a crucial role in making deep learning models more interpretable, particularly in high-stakes domains such as the legal field. This thesis develops a novel XAI method, inspired by SIDU (similarity-differences and uniqueness), for long-text classification on legal and sentiment analysis datasets. Specifically, it examines how different classification techniques (First-512 tokens, Last-512 tokens, Random-512 tokens, and Random-512 with rationale) affect model interpretability. The study focuses on two transformer-based architectures, BERT and RoBERTa, and introduces novel XAI approaches, including Cosine Similarity Masking (thresholded and ranged) and Persistent Homology Masking (based on angular and Euclidean distances), with SHAP as the baseline for comparison. A key aspect of this work is the use of datasets with annotated rationales, which highlight the text segments most relevant for classification. Both qualitative and quantitative analyses show that the proposed XAI methods produce explanations that align with human-annotated rationales, indicating that the models' internal representations capture meaningful semantic information.
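
The four input-selection strategies named in the abstract can be illustrated with a minimal Python sketch. The thesis itself is not available for consultation, so everything below is an assumption rather than the author's implementation: the function names are hypothetical, "Random-512" is read as a contiguous window at a random offset, and "Random-512 with rationale" is read as a random window constrained to contain the annotated rationale span. In practice a BERT-style model also reserves positions for special tokens, so the usable window may be slightly shorter than 512.

```python
import random
from typing import List, Optional, Sequence, Tuple

MAX_LEN = 512  # window size named in the abstract


def first_k(tokens: Sequence[int], k: int = MAX_LEN) -> List[int]:
    """Keep the first k tokens of the document."""
    return list(tokens[:k])


def last_k(tokens: Sequence[int], k: int = MAX_LEN) -> List[int]:
    """Keep the last k tokens of the document (k must be positive)."""
    return list(tokens[-k:])


def random_k(tokens: Sequence[int], k: int = MAX_LEN,
             rng: Optional[random.Random] = None) -> List[int]:
    """Keep a contiguous window of k tokens at a random offset (assumed reading)."""
    rng = rng or random.Random()
    if len(tokens) <= k:
        return list(tokens)
    start = rng.randrange(len(tokens) - k + 1)
    return list(tokens[start:start + k])


def random_k_with_rationale(tokens: Sequence[int],
                            rationale_span: Tuple[int, int],
                            k: int = MAX_LEN,
                            rng: Optional[random.Random] = None) -> List[int]:
    """Random window that fully contains the rationale (assumed reading).

    rationale_span is a half-open (start, end) pair of token indices.
    """
    rng = rng or random.Random()
    n = len(tokens)
    if n <= k:
        return list(tokens)
    r_start, r_end = rationale_span
    if r_end - r_start >= k:
        # Rationale does not fit in the window: keep its first k tokens.
        return list(tokens[r_start:r_start + k])
    lowest = max(0, r_end - k)      # window must still reach the rationale end
    highest = min(r_start, n - k)   # window must start at or before the rationale
    start = rng.randint(lowest, highest)
    return list(tokens[start:start + k])
```

Passing a seeded `random.Random` instance as `rng` makes the random variants reproducible across runs, which matters when explanations computed on the selected window are compared against annotated rationales.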

File

File name	Size
The thesis is not available for consultation. Contact the author

ETD

Digital archive of theses defended at the Università di Pisa
