Thesis etd-06012022-053415
Thesis type
Master's thesis
Author
MAZZONI, FEDERICO
URN
etd-06012022-053415
Title
Genetic Fairness-Enhancing Data Generation Framework
Department
FILOLOGIA, LETTERATURA E LINGUISTICA
Degree programme
INFORMATICA UMANISTICA
Supervisors
supervisor Prof. Guidotti, Riccardo
supervisor Marchiori, Marta
supervisor Cinquini, Martina
Keywords
- algorithmic bias
- data awareness
- data balancing
- digital discrimination
- fairness
- framework
- genetic algorithm
- ml evaluation
- oversampling
- preprocessing
- synthetic data
Graduation session start date
11 July 2022
Availability
Full
Abstract
The recent, rapid adoption of machine learning models has exposed an inherent flaw of the paradigm. Because the learning process depends heavily on the data used in the training phase, any bias present in the training data is inherited by the decision models and propagated through their automated processes. Several techniques have been proposed to balance the training dataset with respect to sensitive attributes such as ethnicity, gender, age, or religion, with the aim of developing a discrimination-free model. This thesis presents FairGen, a framework that improves a dataset’s fairness through Genetic Algorithms.
FairGen extends and improves the fairness-enhancing algorithm Preferential Sampling by generating fair and plausible data to be used as input for training machine learning classification models. We compared FairGen against state-of-the-art pre-processing algorithms and data generation approaches customized with the same ideas for fairness and plausibility implemented by FairGen. Results show that FairGen is able to successfully remove the discrimination in the training dataset, resulting in fairer models than those trained on datasets obtained with state-of-the-art approaches.
File
File name | Size |
---|---|
Tesi_Mazzoni.pdf | 1.28 MB |