Thesis type
Master's thesis
Title
Feature Reduction and Outlier Detection for Unbalanced Learning
Degree programme
DATA SCIENCE AND BUSINESS INFORMATICS
Keywords
- classification framework
- features projection
- features selection
- imbalanced data learning
- outlier detection
Examination session start date
22/07/2022
Abstract
In many analysis contexts, training effective ML models is difficult because the data are unbalanced. In applications such as fraud detection, oil spill detection, rare disease detection and many others, few examples of the uncommon events are available. Techniques commonly used in such situations try to rebalance the classes by removing majority instances and generating synthetic or cloned minority ones. Such approaches, however, often achieve unsatisfactory results. This dissertation presents the FROID framework, which addresses unbalanced learning through a change of perspective: instead of rebalancing the available data, it carries out a feature extraction process based on Outlier Detection and Feature Reduction techniques, to enrich the description of the available instances and allow the models to generate more accurate hypotheses. The effectiveness of FROID is demonstrated through a series of experiments on a large set of benchmark datasets and on two real-world case studies.
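The general idea of adding outlier-based and reduced features, rather than resampling, can be illustrated with a minimal sketch. This is not the FROID implementation: the abstract does not say which detectors or reducers the framework uses, so a k-nearest-neighbour distance score and a PCA projection stand in for them here, and every function name, parameter and the synthetic data are assumptions made only for illustration.

```python
import numpy as np


def knn_outlier_score(X, k=5):
    """Mean distance to the k nearest neighbours (higher = more outlying).

    Stand-in outlier detector, chosen only for illustration.
    """
    diffs = X[:, None, :] - X[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    # After sorting, column 0 is each point's zero distance to itself.
    nearest = np.sort(dists, axis=1)[:, 1:k + 1]
    return nearest.mean(axis=1)


def pca_projection(X, n_components=2):
    """Project centred data onto its top principal components (via SVD).

    Stand-in feature reducer, chosen only for illustration.
    """
    centered = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


def augment_features(X, k=5, n_components=2):
    """Append the outlier score and the projection to the original features."""
    score = knn_outlier_score(X, k)[:, None]
    projection = pca_projection(X, n_components)
    return np.hstack([X, score, projection])


# Synthetic unbalanced data (hypothetical): 95 majority and 5 minority instances.
rng = np.random.default_rng(0)
majority = rng.normal(0.0, 1.0, size=(95, 4))
minority = rng.normal(4.0, 1.0, size=(5, 4))
X = np.vstack([majority, minority])

X_aug = augment_features(X)
print(X.shape, "->", X_aug.shape)  # (100, 4) -> (100, 7)
```

The class proportions are left untouched: no instance is removed or synthesised, and a downstream classifier would simply be trained on the augmented matrix instead of the original one.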