Thesis etd-06302025-110805
Thesis type
Master's degree thesis
Author
NAWAZ, NIMRA
URN
etd-06302025-110805
Title
BrainWash Adversarial Attack on Continual Learning to Maximize Forgetting
Department
COMPUTER SCIENCE
Degree programme
DATA SCIENCE AND BUSINESS INFORMATICS
Supervisors
advisor Bacciu, Davide
supervisor Carta, Antonio
Keywords
- adversarial attack
- black-box attacks
- brainwash
- catastrophic forgetting
- continual learning
- data poisoning
Defence session start date
18/07/2025
Availability
Full
Abstract
In this work, I implemented BrainWash, a poisoning attack that targets continual learning systems. The attack poisons a task so that, once the task is learned, the system forgets previously learned tasks more severely. The original paper evaluated the attack on regularization-based continual learning frameworks; I broadened the evaluation to include replay-based methods, specifically Experience Replay (ER) and Experience Replay with Asymmetric Cross-Entropy (ER-ACE). These methods mitigate forgetting by retaining a limited buffer of past examples and replaying them during training; they are computationally efficient and preferred in practice because of their small memory footprint, their suitability for on-device learning, and the privacy benefit of not storing large amounts of data. A minimal sketch of the replay mechanism is shown below.
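As a rough illustration of the replay mechanism discussed above, the following is a minimal ER-style training step in PyTorch. The buffer capacity, reservoir sampling scheme, replay batch size, and the `model`, `optimizer`, and `criterion` objects are illustrative assumptions, not the exact configuration used in the thesis.

```python
import random
import torch

class ReplayBuffer:
    """Fixed-size memory of past examples, filled with reservoir sampling."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = []   # list of (x, y) example tensors
        self.seen = 0    # total examples observed so far

    def add(self, x, y):
        for xi, yi in zip(x, y):
            self.seen += 1
            if len(self.data) < self.capacity:
                self.data.append((xi, yi))
            else:
                j = random.randrange(self.seen)
                if j < self.capacity:
                    self.data[j] = (xi, yi)

    def sample(self, batch_size):
        batch = random.sample(self.data, min(batch_size, len(self.data)))
        xs, ys = zip(*batch)
        return torch.stack(xs), torch.stack(ys)

def er_training_step(model, optimizer, criterion, buffer, x_new, y_new, replay_bs=32):
    """One ER step: train on the current batch plus a batch replayed from memory."""
    optimizer.zero_grad()
    loss = criterion(model(x_new), y_new)
    if len(buffer.data) > 0:
        x_mem, y_mem = buffer.sample(replay_bs)
        loss = loss + criterion(model(x_mem), y_mem)
    loss.backward()
    optimizer.step()
    buffer.add(x_new, y_new)
    return loss.item()
```

ER-ACE keeps this replay mechanism but changes how the loss on the current batch is computed (an asymmetric cross-entropy that restricts which classes' logits compete with the incoming samples), which reduces interference with the replayed classes.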
My main objective is to present a data-poisoning approach that specifically induces forgetting in continual learning models. Using the BrainWash attack, I show that across a broad range of continual learning baselines, including ER and ER-ACE, the adversarial noise achieves its forgetting goal even against well-established methods. The results indicate that, given an appropriate noise budget, a trained continual learner catastrophically forgets previously learned tasks. These findings highlight underlying weaknesses in current replay-based strategies.
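The degree of forgetting in experiments of this kind is typically summarized from a task-accuracy matrix. The sketch below computes the standard average-forgetting measure (the drop from each task's best past accuracy to its accuracy after the final task); the variable names and the example numbers are purely illustrative.

```python
import numpy as np

def average_forgetting(acc):
    """acc[i, j] = accuracy on task j after training on task i (T x T matrix).

    Forgetting for task j is the gap between the best accuracy it ever
    reached and its accuracy after the last task was learned.
    """
    acc = np.asarray(acc)
    T = acc.shape[0]
    per_task = [acc[:T - 1, j].max() - acc[T - 1, j] for j in range(T - 1)]
    return float(np.mean(per_task))

# Illustrative numbers only: 3 tasks, task 0 and 1 degrade after the final (poisoned) task.
acc = [[0.90, 0.00, 0.00],
       [0.85, 0.88, 0.00],
       [0.40, 0.40, 0.87]]
print(average_forgetting(acc))  # (0.90-0.40 + 0.88-0.40)/2 = 0.49
```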
Furthermore, I extended the original work, which assumes the attacker has access to the victim's model (white-box setting), to black-box attack settings, where the attacker has no direct knowledge of the victim model's architecture or parameters. The attacker trains an independent surrogate model and runs the BrainWash noise optimization on it; when the victim model later learns a task containing noise optimized on this surrogate, it still forgets the previously learned tasks.
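The black-box pipeline can be sketched roughly as follows. This is a highly simplified, single-inner-step approximation of the idea, not the actual BrainWash algorithm from the paper; the function name, the budget and step-size parameters, and the use of `torch.func.functional_call` are assumptions made for illustration only.

```python
import torch
import torch.nn.functional as F
from torch.func import functional_call

def brainwash_like_noise(surrogate, x_new, y_new, x_old, y_old,
                         eps=8 / 255, steps=50, inner_lr=0.1, noise_lr=1e-2):
    """Schematic black-box poisoning loop (one-step inner update approximation).

    The noise `delta` on the new task's images is optimized so that, after the
    surrogate takes one simulated gradient step on the poisoned batch, its loss
    on held-out old-task examples increases, i.e. forgetting is encouraged.
    The poisoned batch x_new + delta is then handed to the (unseen) victim.
    """
    delta = torch.zeros_like(x_new, requires_grad=True)
    names, params = zip(*[(n, p) for n, p in surrogate.named_parameters()
                          if p.requires_grad])

    for _ in range(steps):
        # Inner step: simulate the learner taking one SGD step on the poisoned batch.
        inner_loss = F.cross_entropy(surrogate(x_new + delta), y_new)
        grads = torch.autograd.grad(inner_loss, params, create_graph=True)
        updated = {n: p - inner_lr * g for n, p, g in zip(names, params, grads)}

        # Outer objective: after that update, how badly does the model do on old tasks?
        old_logits = functional_call(surrogate, updated, (x_old,))
        old_loss = F.cross_entropy(old_logits, y_old)

        # Gradient *ascent* on the old-task loss w.r.t. the noise, then project to the budget.
        noise_grad, = torch.autograd.grad(old_loss, delta)
        with torch.no_grad():
            delta += noise_lr * noise_grad.sign()
            delta.clamp_(-eps, eps)
    return delta.detach()
```

In the black-box setting the surrogate stands in for the victim throughout this loop: the noise is never optimized against the victim itself, and the attack succeeds to the extent that the perturbation transfers.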
File
File name | Size
---|---
Nimra_Th...Final.pdf | 2.73 MB