Thesis etd-09302025-130414
Thesis type
Master's thesis (tesi di laurea magistrale)
Author
CORALLO, MARCO ANTONIO
URN
etd-09302025-130414
Title
Towards Trustworthy Machine Learning: An Abstract Interpretation Framework for Classification Preservation
Department
INFORMATICA
Degree programme
INFORMATICA
Supervisors
Supervisor Prof. Bruni, Roberto
Supervisor Prof. Gori, Roberta
Keywords
- abstract interpretation
- black-box verification
- model interpretability
- model verification
- trustworthy machine learning
Defense session date
17/10/2025
Availability
Full
Abstract
The rapid adoption of machine learning (ML) models has led to transformative advances across complex domains and tasks.
However, their increasing deployment in safety- and security-critical contexts has raised pressing concerns about their trustworthiness.
Challenges related to interpretability, fairness, robustness, and security underscore the gap between high performance on benchmark datasets and reliable behavior in real-world applications.
Traditional certification methods attempt to address these issues but typically rely on white-box access to model internals, which is infeasible in many practical settings such as proprietary services and cloud-based deployments.
This thesis explores an alternative approach based on black-box verification.
The proposed framework reconstructs the behavior of a reference classifier through a collection of fine-grained, interpretable abstractions.
We introduce a novel method to define, synthesize, and evaluate non-relational abstractions capable of discriminating between selected classes while preserving the behavior of the reference model.
Finally, we show that this compositional abstraction framework can faithfully approximate complex classifiers on real-world datasets, while enhancing interpretability and transparency, thus providing a foundation for sound-by-composition verification.
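The abstract describes non-relational abstractions that capture a classifier's behavior feature by feature. As a purely illustrative sketch (not the thesis's actual construction), one common non-relational abstraction is the interval (box) domain: each class region is over-approximated by independent per-feature bounds, and a box "preserves" the reference classifier on a set of samples if every sample falling inside it receives the target label. All function names below are hypothetical.

```python
# Illustrative sketch of a non-relational (interval/box) abstraction for a
# black-box classifier. Names and API are assumptions for exposition only.
import numpy as np

def box_abstraction(points):
    """Abstract a finite set of points by per-feature intervals (a box)."""
    pts = np.asarray(points, dtype=float)
    return pts.min(axis=0), pts.max(axis=0)

def contains(box, x):
    """Membership test: is x inside the box, feature by feature?"""
    lo, hi = box
    return bool(np.all(lo <= x) and np.all(x <= hi))

def preserves_class(box, classifier, samples, target):
    """Check (on the given samples) that the box only covers inputs the
    black-box classifier maps to the target class."""
    inside = [x for x in samples if contains(box, x)]
    return all(classifier(x) == target for x in inside)
```

For example, a box built from two class-1 samples of a threshold classifier `lambda x: int(x[0] > 0.5)` covers only points that the classifier also labels 1, so `preserves_class` holds on any sample set drawn near those points. In a compositional framework of the kind the abstract mentions, many such small boxes (one family per class pair) would be combined, each individually interpretable.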
Files

| File name | Size |
|---|---|
| MSc_Thes...rallo.pdf | 984.84 KB |