
ETD

Digital archive of theses defended at the University of Pisa

Thesis etd-10032019-105944


Thesis type
Master's thesis
Author
PAPINI, ANDREA
URN
etd-10032019-105944
Title
A Mathematical Framework for Stochastic Gradient Descent Algorithms
Department
MATHEMATICS
Degree programme
MATHEMATICS
Supervisors
Supervisor: Dr. Bacciu, Davide
Co-supervisor: Prof. Romito, Marco
Referee: Dr. Trevisan, Dario
Keywords
  • stochastic gradient descent
  • stochastic differential equation
  • non-convex optimization
  • dynamical system
  • deep learning
Defence date
25/10/2019
Availability
Full
Abstract
We develop the mathematical foundations of the stochastic modified equations (SME) framework for analyzing the dynamics of stochastic gradient algorithms, in which the discrete algorithms are approximated by a class of stochastic differential equations with a small noise parameter. We prove that this approximation can be understood mathematically as a weak approximation, which leads to a number of precise and useful results on the approximation of stochastic gradient descent (SGD), momentum SGD, and the stochastic Nesterov accelerated gradient method in the general setting of stochastic objectives. We also demonstrate through explicit calculations that this continuous-time approach can uncover important analytical insights into the stochastic gradient algorithms under consideration that may not be easy to obtain in a purely discrete-time setting. In particular, we prove that SGD minimizes an average potential over the posterior distribution of the weights, together with an entropic regularization term. In general, however, this potential is not the original loss function. Thus SGD does perform variational inference, but with respect to a loss different from the one used to compute the gradients. We conclude the thesis by giving some new insight into the gradient noise in stochastic gradient descent, questioning the Gaussianity assumption in the large-data regime.
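For orientation, the following is a minimal sketch of the two objects the abstract refers to, written in the notation standard in the SME literature (e.g. Li, Tai and E, 2017, and Chaudhari and Soatto, 2018); the symbols eta, f_gamma, Sigma and Phi are assumptions of this sketch, not excerpts from the thesis itself.

% SGD iterates on f(x) = E_gamma[ f_gamma(x) ] with step size \eta:
%   x_{k+1} = x_k - \eta \nabla f_{\gamma_k}(x_k).
% First-order stochastic modified equation: an SDE with small noise
% parameter \sqrt{\eta}, driven by the gradient-noise covariance \Sigma.
\[
  dX_t = -\nabla f(X_t)\,dt + \sqrt{\eta}\,\Sigma(X_t)^{1/2}\,dW_t,
  \qquad
  \Sigma(x) = \operatorname{Cov}_\gamma\!\big(\nabla f_\gamma(x)\big).
\]
% Weak approximation of order 1: for smooth test functions g and
% all k with k\eta \le T,
\[
  \big|\, \mathbb{E}\, g(x_k) - \mathbb{E}\, g(X_{k\eta}) \,\big|
  \le C(g, T)\,\eta.
\]
% Variational characterization (schematic, up to constants depending on
% the step size and batch size): the stationary density \rho^\ast of the
% SME minimizes an average potential plus an entropic penalty,
\[
  \rho^\ast = \arg\min_{\rho}\;
  \mathbb{E}_{X \sim \rho}\big[\Phi(X)\big] - \tfrac{\eta}{2}\, H(\rho),
\]
% where H is the differential entropy and, as the abstract notes, the
% potential \Phi coincides with the original loss f only in special cases,
% such as isotropic gradient noise.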