Thesis etd-02092026-135909
Thesis type
Master's degree thesis
Author
BILLI CIANI, GABRIELE
URN
etd-02092026-135909
Title
Neural Network Training Acceleration Analysis using Forecasting in Latent Weight Space
Department
INGEGNERIA DELL'INFORMAZIONE
Degree programme
ARTIFICIAL INTELLIGENCE AND DATA ENGINEERING
Supervisors
Advisor: Prof. Cimino, Mario Giovanni Cosimo Antonio
Supervisor: Parola, Marco
Supervisor: Marino, Martina
Keywords
- autoencoders
- latent weight space
- neural network training acceleration
- representation learning
- ResNet
- weight as data modality
- weight space learning
Defense session date
27/02/2026
Availability
Not available for consultation
Release date
27/02/2096
Abstract (English)
Neural network (NN) training, whilst remarkably effective, remains constrained by significant computational demands, costs, and energy consumption. Recent research has demonstrated that the evolution of NN weights follows predictable trajectories, allowing for training acceleration by alternating gradient-based optimisation with periodic forecasting of future weight states. Beyond viewing weights as fixed parameters, a new paradigm treats them as a data modality, enabling a variety of tasks. Within this paradigm, weight space representation learning is central, using weights as inputs to other NNs.
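The alternation described above (gradient-based optimisation interleaved with periodic forecasting of future weight states) can be illustrated with a minimal sketch. This is not the thesis's method: the function name, the linear extrapolation rule, and all hyperparameters (`window`, `horizon`) are illustrative assumptions.

```python
import torch

def accelerated_train(model, loss_fn, data_loader, optimizer,
                      window=5, horizon=3):
    """Sketch: alternate gradient steps with periodic weight forecasting.

    Every `window` steps, the flattened weight vector is extrapolated
    along the mean recent update direction and written back into the
    model, skipping `horizon` steps' worth of optimisation.
    All names and the linear forecaster are illustrative, not the
    thesis's actual procedure.
    """
    history = []  # recent flattened weight snapshots
    for step_idx, (x, y) in enumerate(data_loader):
        optimizer.zero_grad()
        loss_fn(model(x), y).backward()
        optimizer.step()
        # snapshot the flattened weight vector
        w = torch.cat([p.detach().flatten() for p in model.parameters()])
        history.append(w)
        if len(history) == window:
            # forecast: extrapolate along the mean recent update direction
            delta = (history[-1] - history[0]) / (window - 1)
            w_pred = history[-1] + horizon * delta
            # write the predicted weights back into the model
            offset = 0
            for p in model.parameters():
                n = p.numel()
                p.data.copy_(w_pred[offset:offset + n].view_as(p))
                offset += n
            history.clear()
```

The element-wise extrapolation here operates directly in the full weight space; the abstract's point is precisely that this space is high-dimensional, motivating the latent-space variant.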
Existing acceleration methods through weight prediction operate in the high-dimensional weight space (WS), predicting weights element-wise. However, evidence suggests that modern NN architectures are largely overparametrised and that their optimisation dynamics can be described in lower-dimensional subspaces. Building on this, this work leverages WS representation learning to compress NNs, aiming to improve the efficiency of weight prediction for training speedup. We propose an approach operating within the latent space of a weight space autoencoder (AE).
This work investigates the feasibility of this latent forecasting approach for NN training acceleration. We train the AE on a zoo of ResNet-18 models and present a comparison against existing weight prediction methods on standard computer vision benchmarks, assessing the impact of WS compression on training speedup.
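The latent forecasting idea can be sketched as: encode a sequence of weight snapshots with the autoencoder, extrapolate in the low-dimensional latent space, and decode the forecast back to weights. The toy `WeightAE` architecture, layer sizes, and the linear latent extrapolation below are all assumptions for illustration, not the thesis's actual models.

```python
import torch
import torch.nn as nn

class WeightAE(nn.Module):
    """Toy weight-space autoencoder (illustrative dimensions only)."""
    def __init__(self, weight_dim, latent_dim):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(weight_dim, 64), nn.ReLU(),
            nn.Linear(64, latent_dim))
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 64), nn.ReLU(),
            nn.Linear(64, weight_dim))

def latent_forecast(ae, snapshots, horizon=3):
    """Encode recent weight snapshots, extrapolate linearly in latent
    space, and decode the forecast back to weight space (a sketch)."""
    with torch.no_grad():
        z = ae.encoder(torch.stack(snapshots))        # (T, latent_dim)
        delta = (z[-1] - z[0]) / (len(snapshots) - 1)  # mean latent update
        z_pred = z[-1] + horizon * delta
        return ae.decoder(z_pred)                      # forecast weights
```

Forecasting in the latent space means the predictor works on `latent_dim` coordinates rather than the millions of parameters of a ResNet-18, which is the efficiency argument the abstract makes.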
Files
| File name | Size |
|---|---|
| The thesis is not available for consultation. | |