
ETD

Digital archive of theses defended at the University of Pisa

Thesis etd-01222024-232801


Thesis type
Master's thesis
Author
CASCIONE, ALESSIO
URN
etd-01222024-232801
Title
Pivotal Instances Identification for Learning Interpretable Decision-Making Models
Department
FILOLOGIA, LETTERATURA E LINGUISTICA
Degree programme
INFORMATICA UMANISTICA
Supervisors
supervisor Guidotti, Riccardo
supervisor Setzu, Mattia
Keywords
  • decision tree
  • machine learning
  • prototype selection
  • XAI
Examination session start date
09/02/2024
Availability
Full
Abstract
Machine Learning tools have become an essential aid to decision-making, improving the resources we use to tackle business problems and social issues. At the same time, many Machine Learning methods rely on complex architectures, which makes it difficult for experts to explain the choices made by a particular model and for ordinary users to understand why those choices were made. Building interpretable models is therefore a crucial challenge, and it is the main goal of Explainable AI (XAI).

We present PivotTree, a distance-based interpretable classification approach inspired by the structure of Decision Tree classifiers. PivotTree learns a tree-like structure of rules that classify unseen instances according to their similarity to training instances considered prototypical for the node they are associated with. The model can be used both as a prototype selection method and as a standalone classification model, since it returns both the candidate prototypes identified for each non-terminal node and a tree of if-then decision rules used for data partitioning.
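To make the idea of a tree of distance-based rules concrete, here is a minimal Python sketch. It is not the thesis implementation: the abstract does not specify the distance function, the split criterion, or the search strategy, so the Euclidean distance, the Gini impurity, the exhaustive search over pivots and thresholds, and every name used (`Node`, `Leaf`, `fit`, `predict`) are assumptions chosen for illustration.

```python
import math
from collections import Counter
from dataclasses import dataclass
from typing import List, Sequence, Union

Point = Sequence[float]


@dataclass
class Leaf:
    label: str  # class predicted for instances that reach this leaf


@dataclass
class Node:
    pivot: Point               # training instance acting as the node's prototype
    threshold: float           # distance cut-off of the if-then rule
    near: Union["Node", Leaf]  # branch taken when dist(x, pivot) <= threshold
    far: Union["Node", Leaf]   # branch taken when dist(x, pivot) > threshold


def gini(labels: List[str]) -> float:
    """Gini impurity of a non-empty list of labels (assumed criterion)."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def fit(X: List[Point], y: List[str], max_depth: int = 3) -> Union[Node, Leaf]:
    """Greedily grow a tree whose rules compare distances to a pivot."""
    if max_depth == 0 or len(set(y)) == 1:
        return Leaf(Counter(y).most_common(1)[0][0])
    best = None
    for pivot in X:  # every training instance is a candidate pivot
        dists = [math.dist(pivot, x) for x in X]
        cuts = sorted(set(dists))
        for lo, hi in zip(cuts, cuts[1:]):
            t = (lo + hi) / 2  # midpoint between consecutive distances
            near = [i for i, d in enumerate(dists) if d <= t]
            far = [i for i, d in enumerate(dists) if d > t]
            if not near or not far:
                continue
            score = (len(near) * gini([y[i] for i in near])
                     + len(far) * gini([y[i] for i in far])) / len(y)
            if best is None or score < best[0]:
                best = (score, pivot, t, near, far)
    if best is None:  # no usable split, e.g. all instances coincide
        return Leaf(Counter(y).most_common(1)[0][0])
    _, pivot, t, near, far = best
    return Node(
        pivot, t,
        fit([X[i] for i in near], [y[i] for i in near], max_depth - 1),
        fit([X[i] for i in far], [y[i] for i in far], max_depth - 1),
    )


def predict(tree: Union[Node, Leaf], x: Point) -> str:
    """Follow the distance rules from the root down to a leaf."""
    while isinstance(tree, Node):
        within = math.dist(x, tree.pivot) <= tree.threshold
        tree = tree.near if within else tree.far
    return tree.label


# Toy data: two well-separated groups of points.
X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
y = ["a", "a", "a", "b", "b", "b"]
tree = fit(X, y)
print(predict(tree, (0.5, 0.5)))  # a
print(predict(tree, (5.5, 5.5)))  # b
```

In this sketch each internal node reads as a rule of the form "if the instance lies within `threshold` of `pivot`, go to `near`, otherwise go to `far`", so the pivots stored in the internal nodes double as the selected prototypes.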

The number of prototypes is not fixed in advance; it depends on the training process over the data. Moreover, the selected prototypes can be organized hierarchically according to the depth of the node whose splitting-function parameters they help to learn.
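The hierarchical organization by node depth can be pictured with a short Python sketch. The nested-dictionary tree layout, the key names (`pivot`, `near`, `far`, `label`), and the helper `pivots_by_depth` are hypothetical and serve only to illustrate the grouping; the thesis may represent the tree differently.

```python
from collections import defaultdict


def pivots_by_depth(node, depth=0, out=None):
    """Group the pivots of a tree by the depth of the node that uses them."""
    if out is None:
        out = defaultdict(list)
    if "pivot" in node:  # internal node; leaves only carry a label
        out[depth].append(node["pivot"])
        pivots_by_depth(node["near"], depth + 1, out)
        pivots_by_depth(node["far"], depth + 1, out)
    return out


# Hypothetical fitted tree with three internal nodes.
example_tree = {
    "pivot": (0, 0),
    "near": {"pivot": (0, 1), "near": {"label": "a"}, "far": {"label": "b"}},
    "far": {"pivot": (5, 5), "near": {"label": "b"}, "far": {"label": "c"}},
}

levels = pivots_by_depth(example_tree)
for depth in sorted(levels):
    print(depth, levels[depth])
```

The number of collected pivots follows from how many internal nodes the training process created, which matches the point that it is not fixed in advance.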

Results on the selected experimental datasets indicate that PivotTree, when employed as a classifier, tends to perform worse than state-of-the-art rule-based or instance-based classifiers on tabular datasets but is competitive on textual ones. As a prototype selection method, on the other hand, PivotTree shows better overall performance than existing competitors.