Thesis etd-10312025-183822
Thesis type
PhD thesis
Author
BONSIGNORI, VALERIO
URN
etd-10312025-183822
Title
Explainability in Federated Learning and Selective Classification
Academic discipline
INF/01 - INFORMATICA
Degree programme
NATIONAL PhD IN ARTIFICIAL INTELLIGENCE
Supervisors
tutor Prof. Monreale, Anna
supervisor Prof. Guidotti, Riccardo
supervisor Dr. Naretto, Francesca
Keywords
- ethical artificial intelligence
- explainable artificial intelligence
- federated learning
- selective classification
Defence date
03/11/2025
Availability
Full
Abstract
Modern Machine Learning has achieved remarkable predictive accuracy, yet its wide deployment demands interpretability alongside performance. While the field of Explainable Artificial Intelligence has matured considerably, offering sophisticated methods from SHAP to counterfactual explanations, these techniques often assume centralised data access and deterministic predictions. This thesis addresses the gap in extending explainability methodologies to complex deployment contexts where such assumptions do not hold.
This work explores two paradigmatic challenges in modern Artificial Intelligence deployment. First, Federated Learning environments, where privacy constraints do not allow data centralisation, rendering traditional explanation methods, which require direct data access or IID feature distributions, incompatible with distributed architectures. Second, selective classification systems, where models can abstain from predictions, requiring explanations not only for the outcome but also for the abstention. These contexts are increasingly the norm, as Artificial Intelligence operates under regulatory constraints, privacy requirements, and safety and ethical considerations.
The thesis presents three main contributions addressing these challenges. iFLASH (Interpretable Federated Learning Aggregation of SHAP values) enables high-quality SHAP explanations in federated settings by having each client generate local explanations using its private data, then aggregating these at the server using faithfulness-based weighting strategies. Experiments across multiple datasets demonstrate that federated explanations can match or exceed centralised quality, with the aggregation strategy consistently outperforming naive averaging, particularly in cross-silo scenarios.
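The faithfulness-based weighting described above can be sketched as follows. This is an illustrative assumption of how such an aggregation might look, not the thesis's exact algorithm: each client ships its local SHAP matrix and a faithfulness score, and the server combines them with normalised weights instead of a plain mean.

```python
import numpy as np

def aggregate_shap(client_shap, faithfulness):
    """Weight each client's SHAP matrix by its normalised faithfulness
    score and combine, instead of naively averaging."""
    w = np.asarray(faithfulness, dtype=float)
    w = w / w.sum()                          # weights sum to 1
    stacked = np.stack(client_shap)          # (n_clients, n_instances, n_features)
    return np.tensordot(w, stacked, axes=1)  # faithfulness-weighted combination

# Toy example: two clients, 3 instances, 2 features; the second client's
# explanations are judged more faithful and so dominate the aggregate.
c1 = np.ones((3, 2))
c2 = 3 * np.ones((3, 2))
agg = aggregate_shap([c1, c2], faithfulness=[0.25, 0.75])
# agg is uniformly 0.25*1 + 0.75*3 = 2.5
```

With equal faithfulness scores this reduces exactly to the naive average, which makes the weighting an easy drop-in replacement.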
Fastshap++ extends this work by training neural explainers directly in federated settings with differential privacy guarantees. Rather than aggregating explanations, clients jointly train surrogate and explainer networks, sharing only model weights. The integration of differential privacy throughout the pipeline ensures formal privacy protection while maintaining explanation faithfulness comparable to centralised approaches.
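Since clients share only model weights, the server-side step is a federated averaging round hardened with differential privacy. A minimal sketch, assuming Gaussian-mechanism DP-FedAvg (the function name, clipping rule, and noise scale are illustrative assumptions, not the thesis's pipeline):

```python
import numpy as np

def dp_fedavg(client_weights, clip=1.0, sigma=0.5, rng=None):
    """One aggregation round: clip each client's weight vector to bound
    its L2 norm, average, then add Gaussian noise for differential privacy."""
    rng = rng or np.random.default_rng(0)
    clipped = []
    for w in client_weights:
        norm = np.linalg.norm(w)
        clipped.append(w * min(1.0, clip / max(norm, 1e-12)))  # L2 clipping
    avg = np.mean(clipped, axis=0)
    # Noise scale shrinks with the number of clients contributing to the mean.
    noise = rng.normal(0.0, sigma * clip / len(client_weights), size=avg.shape)
    return avg + noise

# Two clients' (flattened) explainer weights; the first exceeds the clip norm.
updates = [np.array([3.0, 4.0]), np.array([0.1, 0.2])]
new_weights = dp_fedavg(updates, clip=1.0, sigma=0.5)
```

Setting `sigma=0` recovers plain clipped FedAvg, which is useful for isolating the utility cost of the privacy noise.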
SC-CE (Selective Classification via Counterfactual Explanations) addresses the challenge of explainable abstention. By using the distance between instances and their counterfactuals as a confidence proxy, the method creates an interpretable-by-design rejection policy where the counterfactual explains why the model abstained. Experimental validation shows SC-CE matches state-of-the-art selective classifiers in predictive performance while uniquely providing human-interpretable explanations for rejection decisions. The experimental validations, spanning five datasets and various model architectures, establish that explanation quality can be maintained or even enhanced when working within real-world constraints.
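The rejection policy above can be sketched in a few lines. This is a hedged illustration of the idea, under assumptions: the toy 1-D classifier, the threshold name `tau`, and the use of plain Euclidean distance are mine, not the thesis's exact formulation.

```python
import numpy as np

def selective_predict(x, predict, counterfactual, tau):
    """Abstain when the counterfactual lies within distance tau: a nearby
    counterfactual means a small change flips the label, i.e. low confidence.
    The counterfactual itself is returned as the explanation."""
    cf = counterfactual(x)
    if np.linalg.norm(x - cf) < tau:
        return None, cf          # abstain; cf shows how little change flips the label
    return predict(x), cf

# Toy 1-D classifier with decision boundary at 0: the nearest
# counterfactual is simply the projection onto the boundary.
predict = lambda x: int(x[0] > 0)
counterfactual = lambda x: np.zeros_like(x)

label, cf = selective_predict(np.array([0.1]), predict, counterfactual, tau=0.5)
# label is None: the instance sits within tau of the boundary, so the model abstains.
```

The abstention comes with its own explanation for free: the returned counterfactual shows the user exactly which small perturbation would have changed the prediction.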
This thesis enriches XAI theory and practical deployment, enabling trustworthy Artificial Intelligence systems that can justify their decisions even when data cannot be centralised or when abstention is the most responsible choice.
File
| File name | Size |
|---|---|
| PhD_Thes...i_6_1.pdf | 9.67 MB |