Tesi etd-06292022-141125

Tipo di tesi

Tesi di laurea magistrale

Autore

GHOLAMI, SINA

URN

etd-06292022-141125

Titolo

Towards Lexicographic Support Vector Machines using Non-Standard Analysis

Dipartimento

INGEGNERIA DELL'INFORMAZIONE

Corso di studi

ARTIFICIAL INTELLIGENCE AND DATA ENGINEERING

Relatori

relatore Cococcioni, Marco
relatore Lazzerini, Beatrice
relatore Fiaschi, Lorenzo

Parole chiave

lexicographic multi-objective optimization
multi-objective optimization
non-archimedean
support vector machines
weighted feature svm

Data inizio appello

22/07/2022

Consultabilità

Non consultabile

Data di rilascio

22/07/2092

Riassunto

Support vector machines (SVM), known as one of the most famous supervised learning algorithms, have been widely used and studied in a wide range of applications and research fields for both classification and regression problems. In the present study, we explore the possibility of constructing a lexicographic SVM by configuring its objective functions lexicographically, thereby converting a single objective problem (e.g. SVM) into a multi-objective optimization problem (MOP). Lexicographic ordering entails a hierarchy of priorities in which one objective can be infinitely more important than another. This analysis can be performed in a standard or non-standard field with real numbers as well as Euclidean numbers thanks to the introduction of the Non-Archimedean (NA) framework. To conduct the standard analysis, we used the Big-M methodology, and to conduct the non-standard analysis, we used the NA approach. Two distinct binary classification scenarios that closely resemble real-world problems are proposed to test both approaches. In the first scenario, an SVM with weighted features (WFSVM) considers features of varying importance in such a way that some are infinitely more or less important than others. The second scenario suggests that the empirical error for the positive class will be infinitely more significant than the empirical error for the negative class, and it will also be infinitely more significant than maximizing the margin. These problems can be solved mathematically in two distinct reformulations, thus there are two reformulations for each classification scenario; either a set of objectives called A is infinitely smaller than another set of objectives called B (reformulation one), or B is infinitely larger than A (reformulation two). Both reformulations of the problem are semantically equal, but in reformulation one, A is divided by a certain weight value, while in reformulation two, B is multiplied by the reciprocal of the weight value (called M). As well as that, each reformulation has a primal and dual format. In our experiments, reformulation one shows a much greater likelihood of convergent solution than reformulation two using the standard or Big-M approach, but when the non-standard or NA approach is employed, there is no difference between the two reformulations regarding convergence, no modification of input is necessary, and no regularization of weights is required. Furthermore, the standard analysis shows that the primal reformulation of our proposed problems is more stable than the dual, since the dual reformulation may exhibit divergent behavior depending on how big the value of M in both reformulations one and two is. Moreover, study results indicate that in specified benchmarks the dual reformulation used in the NA approach yields meaningful results, whereas the dual reformulation does not converge when M is very large using the Big-M approach. Even though this result is promising, showing that the NA framework is superior to the standard framework in some particular benchmarks when it comes to working with infinite or infinitesimal values, the NA approach also struggles sometimes. According to our investigation, the LU factorization of Euclidean numbers is responsible for the non-converging issue. Further research is required to determine if there are any methods through which the instability of the LU factorization of Euclidean numbers can be resolved, and it is this goal that will be the focus of future works.

File

Nome file	Dimensione
La tesi non è consultabile. Contatta l’autore

ETD

Archivio digitale delle tesi discusse presso l’Università di Pisa

Tesi etd-06292022-141125