ETD

Digital archive of theses defended at the University of Pisa

Thesis etd-10072020-090015


Thesis type
Master's degree thesis
Author
DEMELAS, FRANCESCO
URN
etd-10072020-090015
Title
A mean field analysis of two-layers neural networks with general convex loss function
Department
MATHEMATICS
Course of study
MATHEMATICS
Supervisors
Supervisor Romito, Marco
Keywords
  • convex loss functions
  • distributional dynamics
  • stochastic gradient descent
  • neural networks
  • propagation of chaos
  • mean field
Date of defence
23/10/2020
Availability
Full
Abstract
Nowadays neural networks are a powerful tool, even though there are few mathematical results that explain the effectiveness of this approach. Until a few years ago, one of the most powerful results guaranteed that any continuous function can be well approximated by a two-layer neural network with convex activation functions and enough hidden nodes. However, this tells us nothing about the practical choice of the parameters. Typically the Stochastic Gradient Descent (SGD), or one of its variants, is used to update them. In recent years several results have been obtained on the convergence of the parameters under SGD, in particular via the mean field approach. The key idea is to consider a risk function defined over a set of distributions of the parameters. This allows us to study the convergence through a PDE, known as the distributional dynamics (DD), using common tools of mathematical analysis. Many results use a quadratic loss function, and thus optimize the mean square error. In this work we extend this analysis to a general convex loss function. This generalization is fundamental, because the success of a learning problem can be enhanced by choosing the most suitable loss function. We start by proving that the empirical distributions converge weakly to the solution of the DD on any finite time horizon. Then we analyse the long-time behaviour of the distributions, finding that, under suitable assumptions, the continuous distributions converge weakly to a fixed point of the DD.
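
For orientation only, the mean field setting the abstract refers to can be sketched as follows, in the notation common in the two-layer mean field literature; the specific symbols below (\sigma_*, \rho_t, \Psi, \xi) are illustrative choices, not taken from the thesis itself. A two-layer network with N hidden units and parameters \theta_1, \dots, \theta_N is written as

  f_N(x) = \frac{1}{N} \sum_{i=1}^{N} \sigma_*(x; \theta_i),

so the risk depends on the parameters only through their empirical distribution \hat\rho_N = \frac{1}{N} \sum_{i=1}^{N} \delta_{\theta_i}. In the quadratic-loss case treated in much of the literature, as N \to \infty the SGD trajectory of \hat\rho_N is described by the distributional dynamics

  \partial_t \rho_t = 2\,\xi(t)\, \nabla_\theta \cdot \big( \rho_t\, \nabla_\theta \Psi(\theta; \rho_t) \big),
  \qquad
  \Psi(\theta; \rho) = V(\theta) + \int U(\theta, \theta')\, \rho(d\theta'),

where \xi(t) is the step-size schedule and V, U are expectations of the activation against the data distribution. Under the generalization studied in the thesis, the quadratic loss behind \Psi is replaced by a general convex loss, so that \Psi(\theta; \rho) plays the role of the first variation of the risk functional at \rho.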