
ETD

Digital archive of the theses discussed at the Università di Pisa

Thesis etd-09012022-093843


Thesis type
Master's degree thesis
Author
MANTI, SIMONE
URN
etd-09012022-093843
Title
Fenchel-Young losses and optimization methods for learning
Department
MATEMATICA
Degree course
MATEMATICA
Supervisors
Supervisor: Prof. Bigi, Giancarlo
Keywords
  • Fenchel-Young losses
  • global optimization
  • machine learning
  • neural networks
  • branch and bound
Defense session start date
23/09/2022
Availability
Not available for consultation
Release date
23/09/2025
Abstract
Machine Learning has grown enormously over the last decade and is now a central tool across the sciences. Its basic idea is "learning through examples": one minimizes the regularized empirical risk, which requires choosing an error measure, called the loss function, and a regularization term, introduced to prevent overfitting.
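
In symbols (a standard formulation, with notation introduced here for this summary rather than taken from the thesis), given training pairs (x_1, y_1), ..., (x_n, y_n), a model f_w and a weight regularizer R, the regularized empirical risk minimization problem reads

\[
\min_{w} \; \frac{1}{n}\sum_{i=1}^{n} \ell\bigl(f_w(x_i),\, y_i\bigr) \;+\; \lambda\, R(w), \qquad \lambda > 0,
\]

where \ell is the loss function measuring the error on each example.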

Choosing the right loss for the right problem can make a significant difference. In this work we present a large class of loss functions, the Fenchel-Young losses, introduced by Blondel et al. The name comes from the well-known Fenchel-Young inequality of convex analysis, which relates a function to its conjugate. These losses have several benefits: they recover many well-known losses, such as the mean squared error and the logistic loss, and they inherit the good properties of the convex conjugate.
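
Concretely (a brief sketch in the spirit of Blondel et al., restated here for this summary), for a regularization function \Omega with convex conjugate \Omega^*, the Fenchel-Young inequality and the induced loss are

\[
\Omega(y) + \Omega^*(\theta) \;\ge\; \langle \theta, y \rangle,
\qquad
L_{\Omega}(\theta; y) \;:=\; \Omega^*(\theta) + \Omega(y) - \langle \theta, y \rangle \;\ge\; 0,
\]

so the loss vanishes exactly when the inequality is tight. For instance, \Omega = \tfrac{1}{2}\|\cdot\|^2 gives L_{\Omega}(\theta; y) = \tfrac{1}{2}\|\theta - y\|^2 (the squared error), while the negative Shannon entropy restricted to the probability simplex gives the logistic loss.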

Moreover, a notion of separation margin can be defined for Fenchel-Young losses, both in the unstructured and in the structured case. In the unstructured setting, the losses with a separation margin can be characterized under specific assumptions.
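
As an informal sketch of the unstructured notion (notation introduced here; see Blondel et al. for the precise statement), a loss L has separation margin m > 0 when

\[
\langle \theta, y \rangle \;\ge\; m + \max_{y' \ne y} \langle \theta, y' \rangle
\quad\Longrightarrow\quad
L(\theta; y) = 0,
\]

and the separation margin of L is the smallest such m; the hinge loss, with margin 1, is the classical example.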

The central part of this thesis is the study of learning algorithms with Fenchel-Young losses. We first review some learning algorithms presented in Blondel et al. for linear models, and then study the case where the model is a feed-forward ReLU neural network.
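
A key fact behind these algorithms (stated here as a sketch, under the assumption that \Omega is strictly convex so that the regularized prediction is single-valued) is that the gradient of a Fenchel-Young loss is a residual:

\[
\widehat{y}_{\Omega}(\theta) \;=\; \arg\max_{\mu \in \operatorname{dom}\Omega} \; \langle \theta, \mu \rangle - \Omega(\mu),
\qquad
\nabla_{\theta} L_{\Omega}(\theta; y) \;=\; \widehat{y}_{\Omega}(\theta) - y,
\]

which makes (stochastic) gradient methods straightforward to write down for both linear models and neural networks.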

Inspired by Zhang et al., we reformulate the problem using the fact that the ReLU function can be seen as the projection onto the nonnegative orthant. We conclude with some remarks and directions for future work.
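
A minimal numerical illustration of this observation (illustrative only; the helper names below are not from the thesis) compares the componentwise ReLU with the Euclidean projection onto the nonnegative orthant computed by a generic bound-constrained solver:

import numpy as np
from scipy.optimize import minimize

def relu(x):
    # componentwise ReLU: max(x_i, 0)
    return np.maximum(x, 0.0)

def project_nonneg(x):
    # Euclidean projection onto {z : z >= 0}, obtained by solving
    #   min_z ||z - x||^2   subject to   z >= 0
    # with a generic bound-constrained solver (no closed form used).
    res = minimize(lambda z: np.sum((z - x) ** 2),
                   x0=np.zeros_like(x),
                   bounds=[(0.0, None)] * x.size)
    return res.x

x = np.array([-1.5, 0.0, 2.3, -0.7])
print(relu(x))            # [0.  0.  2.3 0. ]
print(project_nonneg(x))  # numerically the same vector

The two outputs coincide (up to solver tolerance), which is exactly the property exploited in the reformulation.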

The thesis is organized as follows.

In the first chapter we review some basic concepts of conjugacy in convex analysis and we present some results that we will use in the following chapters.

In the second chapter we present the theory of Fenchel-Young losses and regularized prediction functions. We deepen the analysis in the unstructured case and conclude with a short digression on the binary classification case.

In the third chapter we present the concept of separation margin for Fenchel-Young losses. Moreover, we show how to construct the Fenchel-Young loss "closest" to a given loss.

In the fourth chapter we review the learning algorithms with Fenchel-Young losses for linear models.

In the fifth chapter we address the learning problem in the case of ReLU neural networks, and we conclude with some directions for future work.