
ETD

Digital archive of theses defended at the Università di Pisa

Thesis etd-03212021-145443


Thesis type
Master's thesis
Author
FIRMANI, CHIARA
URN
etd-03212021-145443
Title
Membership Inference Attack against Copy's framework
Department
COMPUTER SCIENCE
Course of study
DATA SCIENCE AND BUSINESS INFORMATICS
Supervisors
Supervisor: Prof.ssa Monreale, Anna
Co-supervisor: Prof. Pujol, Oriol
Tutor: Naretto, Francesca
Keywords
  • machine learning
  • deep neural network
  • random forest
  • privacy issue
  • inference attack
  • copy
Defence date
07/05/2021
Availability
Not available for consultation
Release date
07/05/2091
Abstract
Machine Learning models are nowadays employed in many domains due to their extremely good performance. However, the use of ML models, and especially of black boxes, may raise serious privacy concerns, since they are often trained on sensitive real-world datasets. One of the worst issues arises when a malicious adversary is able to infer sensitive information about the users or objects in the corporate database used to train the original classification model.

In particular, in Shokri et al., 2017, the authors explore a methodology to attack a black box model with the goal of determining whether a given record was part of the training dataset, hence exposing privacy threats. This is the so-called Membership Inference Attack, which exploits known weak points of black box models, such as their tendency to overfit, i.e., to adapt too closely to the training data.

We meticulously assess the actual privacy risk in 18 different scenarios, each with a different level of risk, by making different assumptions on the amount of information the attacker can access in advance, including the two extreme cases of none and all. We thus assess the potential of the Membership Inference Attack by conducting a substantial investigation of its structure and of the methodology in which it is rooted, and we progressively enhance its effectiveness and impact on black boxes through an ad-hoc tuning of its parameters.

We conclude with an in-depth analysis of the actual risk cases and with a proposal for their mitigation: a copy framework [Unceta, Nin, and Pujol, 2020] able to overcome many limitations of standard machine learning models, including over-sensitivity to inference attacks. We exploit the copy model to obtain a final model with equal (or even better) performance with respect to the original one, with the great advantage of being trained on synthetically generated data. We then attack the copy framework with the Membership Inference Attack, empirically demonstrating that it is no longer effective, or at least that its effectiveness is decreased to the point that a consistent privacy risk can no longer be affirmed.
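The attack summarised above follows the shadow-model methodology of Shokri et al., 2017: shadow models mimic the target's behaviour on data whose membership is known, and a binary attack classifier learns to separate members from non-members by their confidence vectors. Below is a minimal illustrative sketch using scikit-learn; all function and variable names are assumptions for exposition, not taken from the thesis.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

def shadow_attack_dataset(shadow_data, shadow_labels, n_shadows=5):
    """Collect (confidence vector, membership) pairs from shadow models,
    in the spirit of Shokri et al., 2017 (illustrative sketch)."""
    X_att, y_att = [], []
    for i in range(n_shadows):
        # Each shadow model gets its own "in" (training) / "out" (holdout) split.
        X_in, X_out, y_in, y_out = train_test_split(
            shadow_data, shadow_labels, test_size=0.5, random_state=i)
        shadow = RandomForestClassifier(n_estimators=100, random_state=i)
        shadow.fit(X_in, y_in)
        # Overfitting makes members receive more confident predictions;
        # this gap is the signal the attack model learns to exploit.
        X_att.append(shadow.predict_proba(X_in))
        y_att.append(np.ones(len(X_in)))
        X_att.append(shadow.predict_proba(X_out))
        y_att.append(np.zeros(len(X_out)))
    return np.vstack(X_att), np.concatenate(y_att)

def infer_membership(target_model, records, attack_model):
    """1 = the record is predicted to be in the target's training set."""
    return attack_model.predict(target_model.predict_proba(records))
```

An attack model (e.g., another RandomForestClassifier) is then fitted on the pairs returned by shadow_attack_dataset. Note that Shokri et al. actually train one attack model per output class; the sketch collapses this into a single binary classifier for brevity.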
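The mitigation works because a copy [Unceta, Nin, and Pujol, 2020] is built exclusively from synthetic points labelled by the original model, so no real training record ever reaches the copy. A minimal sketch of the copying idea, under the simplifying assumption of numerical features sampled from a standard normal (the sampling strategy and all names here are illustrative; the paper explores more refined sampling):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def copy_model(original_model, n_features, n_synthetic=100_000, seed=0):
    """Build a copy: sample synthetic points, label them with the original
    (black-box) model, and fit a fresh model on the synthetic data only."""
    rng = np.random.default_rng(seed)
    # Synthetic covariates; a standard normal is an assumption made here.
    X_syn = rng.standard_normal((n_synthetic, n_features))
    # Hard labels from the original model are the only information used.
    y_syn = original_model.predict(X_syn)
    copy = DecisionTreeClassifier()
    copy.fit(X_syn, y_syn)
    return copy
```

Because the copy never sees the original training records, the confidence gap between members and non-members that the Membership Inference Attack relies on is largely removed, which is the behaviour the thesis verifies empirically.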