
ETD

Digital archive of the theses discussed at the Università di Pisa

Thesis etd-09012022-093843


Thesis type
Master's degree thesis
Author
MANTI, SIMONE
URN
etd-09012022-093843
Title
Fenchel-Young losses and optimization methods for learning
Department
MATEMATICA
Degree course
MATEMATICA
Supervisors
Supervisor: Prof. Bigi, Giancarlo
Keywords
  • Fenchel-Young losses
  • global optimization
  • machine learning
  • neural networks
  • branch and bound
Defense session start date
23/09/2022
Availability
Not available for consultation
Release date
23/09/2025
Abstract
Machine Learning has grown enormously over the last decade and is now a central tool across the sciences. Its basic idea is "learning through examples": one minimizes the regularized empirical risk, which requires choosing an error measure, called the loss function, and a regularization term, introduced to prevent overfitting.
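
In symbols (a standard formulation, with notation introduced here for this summary rather than taken from the thesis), given training pairs (x_1, y_1), ..., (x_n, y_n), a model f_w and a weight regularizer R, the regularized empirical risk minimization problem reads

\[
\min_{w} \; \frac{1}{n}\sum_{i=1}^{n} \ell\bigl(f_w(x_i),\, y_i\bigr) \;+\; \lambda\, R(w), \qquad \lambda > 0,
\]

where \ell is the loss function measuring the error on each example.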

Choosing the right loss for the right problem can make a significant difference. In this work we present a large class of loss functions, the Fenchel-Young losses, introduced by Blondel et al. The name comes from the well-known Fenchel-Young inequality of convex analysis, which relates a function to its conjugate. These losses have several benefits: they recover many well-known losses, such as the mean squared error and the logistic loss, and they inherit the good properties of the convex conjugate.
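
Concretely (a brief sketch in the spirit of Blondel et al., restated here for this summary), for a regularization function \Omega with convex conjugate \Omega^*, the Fenchel-Young inequality and the induced loss are

\[
\Omega(y) + \Omega^*(\theta) \;\ge\; \langle \theta, y \rangle,
\qquad
L_{\Omega}(\theta; y) \;:=\; \Omega^*(\theta) + \Omega(y) - \langle \theta, y \rangle \;\ge\; 0,
\]

so the loss vanishes exactly when the inequality is tight. For instance, \Omega = \tfrac{1}{2}\|\cdot\|^2 gives L_{\Omega}(\theta; y) = \tfrac{1}{2}\|\theta - y\|^2 (the squared error), while the negative Shannon entropy restricted to the probability simplex gives the logistic loss.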

Moreover, a notion of separation margin can be defined for Fenchel-Young losses, both in the unstructured and in the structured case. In the unstructured setting, the losses with a separation margin can be characterized under specific assumptions.
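
As an informal sketch of the unstructured notion (notation introduced here; see Blondel et al. for the precise statement), a loss L has separation margin m > 0 when

\[
\langle \theta, y \rangle \;\ge\; m + \max_{y' \ne y} \langle \theta, y' \rangle
\quad\Longrightarrow\quad
L(\theta; y) = 0,
\]

and the separation margin of L is the smallest such m; the hinge loss, with margin 1, is the classical example.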

The central part of this thesis is the study of learning algorithms with Fenchel-Young losses. We first review some learning algorithms presented in Blondel et al. for linear models, and then study the case where the model is a feed-forward ReLU neural network.
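
A key fact behind these algorithms (stated here as a sketch, under the assumption that \Omega is strictly convex so that the regularized prediction is single-valued) is that the gradient of a Fenchel-Young loss is a residual:

\[
\widehat{y}_{\Omega}(\theta) \;=\; \arg\max_{\mu \in \operatorname{dom}\Omega} \; \langle \theta, \mu \rangle - \Omega(\mu),
\qquad
\nabla_{\theta} L_{\Omega}(\theta; y) \;=\; \widehat{y}_{\Omega}(\theta) - y,
\]

which makes (stochastic) gradient methods straightforward to write down for both linear models and neural networks.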

Inspired by Zhang et al., we reformulate the problem using the fact that the ReLU function can be seen as the projection onto the nonnegative orthant. We conclude with some remarks and directions for future work.
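
A minimal numerical illustration of this observation (illustrative only; the helper names below are not from the thesis) compares the componentwise ReLU with the Euclidean projection onto the nonnegative orthant computed by a generic bound-constrained solver:

import numpy as np
from scipy.optimize import minimize

def relu(x):
    # componentwise ReLU: max(x_i, 0)
    return np.maximum(x, 0.0)

def project_nonneg(x):
    # Euclidean projection onto {z : z >= 0}, obtained by solving
    #   min_z ||z - x||^2   subject to   z >= 0
    # with a generic bound-constrained solver (no closed form used).
    res = minimize(lambda z: np.sum((z - x) ** 2),
                   x0=np.zeros_like(x),
                   bounds=[(0.0, None)] * x.size)
    return res.x

x = np.array([-1.5, 0.0, 2.3, -0.7])
print(relu(x))            # [0.  0.  2.3 0. ]
print(project_nonneg(x))  # numerically the same vector

The two outputs coincide (up to solver tolerance), which is exactly the property exploited in the reformulation.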

The thesis is organized as follows.

In the first chapter we review some basic concepts of conjugacy in convex analysis and we present some results that we will use in the following chapters.

In the second chapter we present the theory of Fenchel-Young losses and regularized prediction functions. We deepen the analysis in the unstructured case and conclude with a short digression on the binary classification case.

In the third chapter we present the concept of separation margin for Fenchel-Young losses. Moreover, we show how to construct the Fenchel-Young loss "closest" to a given loss.

In the fourth chapter we review the learning algorithms with Fenchel-Young losses for linear models.

In the fifth chapter we address the learning problem in the case of ReLU neural networks, and we conclude with some directions for future work.