ETD

Digital archive of theses discussed at the University of Pisa

 

Thesis etd-06302021-103021


Thesis type
PhD thesis
Author
CARTA, ANTONIO
URN
etd-06302021-103021
Thesis title
Memorization in Recurrent Neural Networks
Academic discipline
INF/01
Course of study
COMPUTER SCIENCE
Supervisors
tutor Prof. Bacciu, Davide
Keywords
  • continual learning
  • recurrent neural networks
  • sequence autoencoding
  • short-term memory
Graduation session start date
20/07/2021
Availability
Full
Summary
Rich sources of data, such as text, video, or time series, are composed of sequences of elements. Traditionally, recurrent neural networks have been used to process sequences by keeping a trace of the past in a recursively updated hidden state. The ability of recurrent networks to memorize the past is fundamental to their success.
In this thesis, we study recurrent networks and their short-term memory, with the objective of maximizing it. In the literature, most models either do not optimize short-term memory or do so in a data-independent way. We propose a conceptual framework that splits recurrent networks into two separate components: a feature extractor and a memorization component. Following this separation, we show how to optimize the short-term memory of recurrent networks. This is a challenging problem that is hard to solve with end-to-end backpropagation. We propose several solutions that allow us to efficiently optimize the memorization component. Finally, we apply our approach to two application domains: sentence embeddings for natural language processing and continual learning on sequential data.
Overall, we find that optimizing the short-term memory improves the ability of recurrent models to learn long-range dependencies, helps the training process, and provides features that generalize well to unseen data.
The findings of this thesis provide a better understanding of short-term memory in recurrent networks and suggest general principles that may be useful to design novel recurrent models with expressive memorization components.