Thesis etd-02102026-194218
Thesis type
Master's thesis
Author
SUMA, GABRIELE
URN
etd-02102026-194218
Title
A Federated Approach to Anomaly Detection for Predictive Maintenance of Heavy Vehicles
Department
INGEGNERIA DELL'INFORMAZIONE (Information Engineering)
Degree programme
COMPUTER ENGINEERING
Supervisors
supervisor Prof. Bechini, Alessio
supervisor Prof. Gama, João
supervisor Prof. Veloso, Bruno
Keywords
- anomaly detection
- federated learning
- machine learning
- predictive maintenance
Graduation session start date
27/02/2026
Availability
Not available for consultation
Release date
27/02/2029
Abstract (English)
The advent of Industry 4.0 has established Predictive Maintenance (PdM) as a cornerstone for operational efficiency in the heavy-duty automotive sector. However, traditional centralized machine learning architectures face critical scalability bottlenecks, including prohibitive bandwidth costs for transmitting high-frequency telemetry and significant privacy concerns regarding proprietary fleet data. To address these limitations, this thesis proposes a privacy-preserving Federated Learning (FL) framework for anomaly detection in heavy-duty vehicle fleets.
The core architecture utilizes an Unsupervised Long Short-Term Memory (LSTM) Autoencoder, designed to learn latent representations of normal vehicle operation from decentralized, unlabeled data. By training locally on edge devices and aggregating model updates rather than raw data, the system adheres to strict privacy constraints while significantly reducing communication overhead. The study utilizes the Scania Component X industrial dataset to simulate a distributed network of clients. The experimental campaign evaluates the framework across two distinct operational paradigms: Offline (Static) and Online (Streaming). In the Offline scenario, the federated model achieves an F2-Score and AUC-ROC highly comparable to a centralized baseline. A comparative analysis of aggregation strategies reveals that robust statistics, specifically the Coordinate-wise Median (FedMedian), effectively mitigate the impact of heterogeneous (Non-IID) client data, outperforming standard FedAvg. Furthermore, a scalability analysis demonstrates that while centralized optimization exhibits a steeper initial learning curve, the federated approach successfully leverages horizontal scaling, proving its potential to manage large-scale fleets while circumventing the vertical scaling limits of centralized systems.
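As an illustration of why a coordinate-wise median can be more robust than FedAvg under heterogeneous clients, the following is a minimal sketch and not the thesis implementation. It assumes each client update has been flattened into a plain list of floats; the function names `fed_avg` and `fed_median` and the toy values are hypothetical.

```python
from statistics import median


def fed_avg(client_weights, client_sizes):
    """Sample-size-weighted mean of flattened client parameter vectors."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


def fed_median(client_weights):
    """Coordinate-wise median of flattened client parameter vectors."""
    dim = len(client_weights[0])
    return [median(w[i] for w in client_weights) for i in range(dim)]


# Three hypothetical clients; the third one deviates strongly from the others,
# standing in for a client whose local data distribution is very different.
updates = [[1.0, 2.0], [1.2, 1.8], [10.0, -5.0]]
sizes = [100, 100, 100]

avg = fed_avg(updates, sizes)   # pulled toward the deviating client
med = fed_median(updates)       # stays close to the majority of clients
```

In this toy case the weighted mean is dragged far from the two similar clients, while the median of each coordinate remains inside their range. Whether this translates into better detection quality depends on the data and is what the thesis evaluates experimentally.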
Conversely, the Online scenario, which simulates real-time data streaming via chunk-based processing, revealed significant challenges inherent to edge deployment. The LSTM model struggled with "Data Starvation" and "Cold Start" phenomena, where the limited historical window available in individual data chunks prevented effective convergence compared to the static baseline.
To overcome the limitations of the pure streaming approach, an extended evaluation introduced a self-supervised Test-Time Adaptation mechanism. By utilizing high-confidence predictions to incrementally fine-tune the model on new operational data, the system demonstrated non-negligible performance improvements. This validates that combining the robustness of global offline pre-training with continuous, unsupervised local adaptation represents the most viable and effective path for deploying personalized predictive maintenance in real-world edge computing environments.
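The abstract does not specify how "high-confidence predictions" are selected. One possible reading, shown here purely as an assumption, is to keep only the windows whose reconstruction error lies well below the anomaly threshold and to use those as confidently normal samples for incremental fine-tuning. The function name, the `margin` parameter, and the numbers are hypothetical.

```python
def select_confident_normals(errors, threshold, margin=0.5):
    """Return indices of windows whose reconstruction error is below
    margin * threshold, i.e. windows treated as confidently normal.

    Assumption: lower reconstruction error means "more normal", as is
    usual for autoencoder-based anomaly detection.
    """
    cutoff = margin * threshold
    return [i for i, e in enumerate(errors) if e < cutoff]


# Hypothetical per-window reconstruction errors from a new data chunk.
errors = [0.02, 0.30, 0.04, 0.11, 0.01]
threshold = 0.10

idx = select_confident_normals(errors, threshold)
# Only the windows at these indices would be passed to a fine-tuning step.
```

Windows near or above the threshold are left out, so that ambiguous or anomalous samples do not contaminate the locally adapted model.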
Abstract (Italian)
Files
The thesis is not available for consultation.