
ETD

Digital archive of the theses discussed at the University of Pisa

Thesis etd-01162023-001732


Thesis type
Master's degree thesis
Author
RIZZO, SIMONE
URN
etd-01162023-001732
Title
Attacking machine learning models and their global explainers
Department
INFORMATICA
Degree programme
INFORMATICA
Supervisors
supervisor Prof. Monreale, Anna
co-supervisor Prof. Naretto, Francesca
Keywords
  • adversarial attack
  • machine learning
  • xai
  • explainable ai
  • privacy
  • data privacy
  • global explainer
Defense date
24/02/2023
Availability
Not available for consultation
Release date
24/02/2093
Abstract
We designed a new model-agnostic attack for membership inference. The attack targets black-box models without access to confidence scores for the predicted labels: it estimates the model's confidence in a prediction by perturbing the input and observing the model's robustness. Given an input point, the attack generates a batch of perturbations of that point and assigns it a robustness score, which represents how confident the model is about it. A higher robustness score indicates that the point lies farther from the decision boundary and was likely part of the training set, so the model's confidence in its prediction is higher. Conversely, a lower robustness score indicates that the point lies closer to the decision boundary and the model's confidence in its prediction is lower. We tested the attack on three datasets (Adult, Bank, and a synthetic one) against three different models (Decision Tree, Random Forest, and Neural Network). The results showed that the attack performed well, and they also revealed a relationship between model overfitting and privacy risk. Additionally, we ran the attack against tree-based global explainers of the models to determine whether the explainers themselves also leak private information. The results highlighted that explainers trained on overfitted models leak more private information, which represents a threat to privacy. To mitigate this threat, we also designed a model selection algorithm for explainers that guarantees at least 85% fidelity while protecting privacy.
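
The perturbation-based robustness score at the core of the attack can be sketched as follows. This is an illustrative reconstruction based only on the abstract, not the thesis's actual code: it assumes a scikit-learn-style predict interface and Gaussian input noise, and the function names, the noise scale sigma, and the decision threshold tau are all hypothetical.

import numpy as np

def robustness_score(model, x, n_perturbations=100, sigma=0.1, seed=None):
    """Fraction of Gaussian perturbations of x whose predicted label matches
    the label of x itself (hypothetical reconstruction of the score)."""
    rng = np.random.default_rng(seed)
    base_label = model.predict(x.reshape(1, -1))[0]   # label of the original point
    perturbed = x + rng.normal(0.0, sigma, size=(n_perturbations, x.size))
    labels = model.predict(perturbed)                 # labels of the perturbed batch
    return float(np.mean(labels == base_label))       # share of label-preserving perturbations

def infer_membership(model, x, tau=0.9, **score_kwargs):
    """Flag x as a likely training-set member when its robustness score,
    i.e. its estimated distance from the decision boundary, exceeds tau."""
    return robustness_score(model, x, **score_kwargs) >= tau

A point whose label survives most perturbations is treated as lying far from the decision boundary, and hence as a likely training-set member; in practice the threshold tau would need to be calibrated, for example on points known to be outside the training set.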