Thesis etd-06302021-103021
Thesis type
PhD thesis
Author
CARTA, ANTONIO
URN
etd-06302021-103021
Thesis title
Memorization in Recurrent Neural Networks
Academic discipline
INF/01
Course of study
COMPUTER SCIENCE (INFORMATICA)
Supervisors
Tutor: Prof. Bacciu, Davide
Keywords
- continual learning
- recurrent neural networks
- sequence autoencoding
- short-term memory
Graduation session start date
20/07/2021
Availability
Full
Summary
Rich sources of data, such as text, video, or time series, are composed of sequences of elements. Traditionally, recurrent neural networks have been used to process sequences by keeping a trace of the past in a recursively updated hidden state. The ability of recurrent networks to memorize the past is fundamental to their success.
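As a point of reference, the sketch below shows the recursive hidden-state update that gives a vanilla recurrent network its short-term memory. It is only an illustrative NumPy example; the parameter names (`W_in`, `W_rec`, `b`) and dimensions are placeholders and do not correspond to the specific models studied in the thesis.

```python
import numpy as np

# Minimal sketch of a vanilla recurrent update: the hidden state h_t
# summarizes the sequence seen so far. All parameters below are
# illustrative placeholders, not taken from the thesis.
def rnn_step(h_prev, x_t, W_in, W_rec, b):
    return np.tanh(W_in @ x_t + W_rec @ h_prev + b)

rng = np.random.default_rng(0)
input_size, hidden_size, seq_len = 3, 5, 10
W_in = rng.normal(size=(hidden_size, input_size))
W_rec = rng.normal(size=(hidden_size, hidden_size))
b = np.zeros(hidden_size)

h = np.zeros(hidden_size)                    # initial hidden state
for x_t in rng.normal(size=(seq_len, input_size)):
    h = rnn_step(h, x_t, W_in, W_rec, b)     # recursive update: h is the "trace of the past"
```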
In this thesis, we study recurrent networks and their short-term memory, with the objective of maximizing it. In the literature, most models either do not optimize the short-term memory or do so in a data-independent way. We propose a conceptual framework that splits recurrent networks into two separate components: a feature extractor and a memorization component. Following this separation, we show how to optimize the short-term memory of recurrent networks. This is a challenging problem that is hard to solve with end-to-end backpropagation, and we propose several solutions that allow us to optimize the memorization component efficiently. Finally, we apply our approach to two application domains: sentence embeddings for natural language processing and continual learning on sequential data.
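To make the two-component view concrete, here is a minimal sketch of one possible split between a nonlinear feature extractor and a separate (here linear) memorization component. This is an assumed decomposition for illustration only; the function and parameter names are hypothetical and do not reproduce the architectures or optimization procedures proposed in the thesis.

```python
import numpy as np

def feature_extractor(x_t, m_prev, W_x, W_m):
    # Nonlinear feature extraction from the current input and the memory state.
    return np.tanh(W_x @ x_t + W_m @ m_prev)

def memory_update(m_prev, phi_t, A, B):
    # Separate memorization component (linear in this sketch) that stores the features.
    return A @ m_prev + B @ phi_t

# Illustrative dimensions and randomly initialized placeholder parameters.
rng = np.random.default_rng(1)
input_size, feature_size, memory_size = 3, 4, 6
W_x = rng.normal(size=(feature_size, input_size))
W_m = rng.normal(size=(feature_size, memory_size))
A = rng.normal(size=(memory_size, memory_size)) * 0.1   # small gain to keep the state bounded
B = rng.normal(size=(memory_size, feature_size))

m = np.zeros(memory_size)
for x_t in rng.normal(size=(8, input_size)):
    phi_t = feature_extractor(x_t, m, W_x, W_m)
    m = memory_update(m, phi_t, A, B)
```

Keeping the memorization component separate in this way is what allows it, in principle, to be trained with objectives other than end-to-end backpropagation through the whole network.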
Overall, we find that optimizing the short-term memory improves the ability of recurrent models to learn long-range dependencies, helps the training process, and provides features that generalize well to unseen data.
The findings of this thesis provide a better understanding of short-term memory in recurrent networks and suggest general principles that may be useful to design novel recurrent models with expressive memorization components.
File
| File name | Size |
|---|---|
| main.pdf | 2.40 Mb |
| relazione_finale.pdf | 137.69 Kb |