ETD

Digital archive of theses defended at the Università di Pisa

Thesis etd-11072022-161039


Thesis type
Master's degree thesis
Author
ABBATTISTA, MARIANNA
Email address
m.abbattista1@studenti.unipi.it, mariabba97@gmail.com
URN
etd-11072022-161039
Title
Interpretable Data Partitioning through Tree-based Clustering Methods
Department
INFORMATICA
Degree programme
DATA SCIENCE AND BUSINESS INFORMATICS
Supervisors
supervisor Prof. Guidotti, Riccardo
Keywords
  • Clustering
  • Data Partitioning
  • Tree
  • Tree-based Clustering Methods
  • Unsupervised Methods
Graduation session start date
02/12/2022
Availability
Not accessible
Release date
02/12/2025
Abstract
The expanding field of eXplainable Artificial Intelligence (XAI) research is primarily concerned with developing methods for interpreting supervised learning approaches.
Unsupervised approaches could also benefit from interpretability, but they are currently not as interpretable as supervised ones.
Existing clustering methods produce results that are not easy to understand: they typically return the assignment of each record to a cluster without providing the reason for that partitioning.

Unlike previous works, we define several tree-based clustering methods that explain the data partitioning by means of a shallow decision tree.
They make it possible to explain each cluster assignment through a concise and simple set of split conditions.
Extensive experiments on both synthetic and real datasets show that, in terms of standard evaluation metrics and run-time, our proposals are in line with both the traditional and the interpretable clustering approaches currently in use.
Finally, a case study involving human participants demonstrates the value of the interpretable clustering trees returned by our proposals.
File