Thesis etd-09232024-202208
Thesis type
Master's thesis
Author
LOMBARDI, GIUSEPPE
URN
etd-09232024-202208
Title
Reservoir Structured State Space Models
Department
INFORMATICA
Degree program
INFORMATICA
Supervisors
Supervisor: Prof. Gallicchio, Claudio
Supervisor: Dr. Ceni, Andrea
Keywords
- long range dependencies
- recurrent neural network
- reservoir computing
- state space model
Defense session start date
11/10/2024
Availability
Not available for consultation
Release date
11/10/2027
Abstract
This thesis introduces a novel neural network architecture called the Reservoir State Space Model (RSSM), which combines state space models (SSMs) with reservoir computing to effectively handle long-term dependencies in sequence modeling.
The linearity of SSMs allows the derivation of structured, efficient convolutional operations that maintain a latent internal state tracking the history of the input sequence, similar to Recurrent Neural Networks (RNNs). A stability analysis of SSMs informs design choices that enhance the memory capacity of this internal state. As a result, the hidden representations are highly expressive and faithfully encode the input sequence.
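For reference, the standard linear SSM recurrence and its equivalent convolutional form are given below; this is the general formulation that such architectures build on, not notation taken from the embargoed thesis itself:

$$
x_k = \bar{A}\,x_{k-1} + \bar{B}\,u_k, \qquad y_k = \bar{C}\,x_k,
$$

which, unrolled from $x_{-1} = 0$, yields a single convolution over the input sequence of length $L$:

$$
y_k = \sum_{j=0}^{k} \bar{C}\bar{A}^{j}\bar{B}\,u_{k-j} = (\bar{K} * u)_k, \qquad \bar{K} = \left(\bar{C}\bar{B},\ \bar{C}\bar{A}\bar{B},\ \dots,\ \bar{C}\bar{A}^{L-1}\bar{B}\right).
$$

The whole output sequence can therefore be computed as one (FFT-friendly) convolution rather than a sequential scan, while the state $x_k$ still summarizes the input history. Stability, i.e., keeping the eigenvalues of $\bar{A}$ inside the unit disk, governs how slowly that summary fades, which is the memory capacity referred to above.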
The model's core innovation lies in its use of untrained convolutional networks within a reservoir framework, which reduces training complexity and computational cost by limiting learning to a feed-forward readout layer.
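To make the training pipeline concrete, here is a minimal, hypothetical sketch of the reservoir recipe applied to a linear SSM: the recurrence matrices are random and frozen, and learning is confined to a closed-form ridge-regression readout. It uses the sequential recurrence for clarity; the convolutional form above computes the same states in parallel. All names, the toy task, and all hyperparameters are illustrative assumptions, since the thesis itself is under embargo.

```python
import numpy as np

rng = np.random.default_rng(0)
state_dim, seq_len = 64, 200

# Random state-transition matrix, rescaled to spectral radius < 1 --
# a standard reservoir-computing recipe for a stable, fading memory.
A = rng.standard_normal((state_dim, state_dim))
A *= 0.9 / max(abs(np.linalg.eigvals(A)))
B = rng.standard_normal(state_dim)  # frozen input projection

def reservoir_states(u):
    """Run the frozen linear recurrence x_k = A x_{k-1} + B u_k."""
    x = np.zeros(state_dim)
    states = []
    for u_k in u:
        x = A @ x + B * u_k
        states.append(x)
    return np.stack(states)  # shape (seq_len, state_dim)

# Toy task: recall the input from 3 steps earlier. Only the readout
# weights w are trained, via closed-form ridge regression.
u = rng.standard_normal(seq_len)
target = np.concatenate([np.zeros(3), u[:-3]])
X = reservoir_states(u)
lam = 1e-6  # ridge regularization strength
w = np.linalg.solve(X.T @ X + lam * np.eye(state_dim), X.T @ target)
prediction = X @ w
```

Because the recurrent part is never backpropagated through, training reduces to a single linear solve, which is where the reduction in training complexity and computational cost comes from.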
Experimental results demonstrate that our architecture substantially reduces training cost while maintaining competitive accuracy, making it suitable for real-world applications.
In conclusion, this thesis presents an effective solution for sequence modeling, balancing computational efficiency and accuracy, and sets the stage for future advancements in this field.
Files
The thesis is not available for consultation.