
ETD

Digital archive of theses defended at the University of Pisa

Thesis etd-05112025-191738


Thesis type
Master's thesis
Author
VERSARI, ALESSANDRO
URN
etd-05112025-191738
Title
Enhancing Software Supply Chain Security: Leveraging AI for Suspicious Code Detection in Open Source Repositories
Department
INGEGNERIA DELL'INFORMAZIONE
Degree programme
ARTIFICIAL INTELLIGENCE AND DATA ENGINEERING
Supervisors
supervisor Lettieri, Giuseppe
supervisor Galatolo, Federico Andrea
Keywords
  • Backdoor Detection
  • LLMs
  • Open Source
  • Security
  • Software Supply Chain
  • Zero-Shot Classification
Session start date
27/05/2025
Availability
Full
Abstract
The XZ backdoor incident, in which malicious code was stealthily introduced into a widely used compression library, exposed the growing risks in the open source software supply chain and highlighted the challenges faced by open source maintainers. Maintainers often have to review large volumes of code contributions, which makes it difficult to catch subtle security threats such as backdoors by hand. This thesis proposes using AI to detect suspicious code in open source repositories, helping maintainers by automating the identification of potentially harmful code.
This work focuses mainly on Trojan horse attacks, the techniques used to obfuscate them, and how AI, specifically Large Language Models (LLMs), can be employed to detect these kinds of attacks.
The study also explores the use of perplexity as a measure. Initial experiments suggested that perplexity could effectively flag anomalies. However, the results also revealed its limitations, particularly in distinguishing truly malicious behavior from unconventional yet benign coding styles.
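As a rough illustration of the idea, perplexity can be computed from the per-token log-probabilities that a language model assigns to a code snippet, and snippets above a threshold can be flagged. The sketch below is not the thesis's implementation: the function names, the threshold, and all log-probability values are hypothetical and chosen only to show the mechanics, including the limitation noted above, where an unusual but benign snippet is flagged alongside a malicious one.

```python
import math
from typing import Sequence


def perplexity(token_logprobs: Sequence[float]) -> float:
    """Perplexity as exp of the negative mean natural-log probability per token."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def flag_anomalous(snippets: dict[str, list[float]], threshold: float) -> list[str]:
    """Return the names of snippets whose perplexity exceeds the threshold."""
    return [name for name, lps in snippets.items() if perplexity(lps) > threshold]


# Hypothetical per-token log-probabilities (assumed, not measured values).
example_logprobs = {
    "idiomatic_loop": [math.log(0.9)] * 5,       # highly predictable tokens
    "obfuscated_payload": [math.log(0.02)] * 5,  # very surprising tokens
    "unusual_but_benign": [math.log(0.05)] * 5,  # surprising, yet harmless
}

# The threshold is an arbitrary illustrative choice.
flagged = flag_anomalous(example_logprobs, threshold=10.0)
```

With these assumed values both the obfuscated and the unconventional snippet exceed the threshold, which is why a perplexity score alone cannot separate malicious code from merely unusual code.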
Finally, state-of-the-art models were evaluated for their ability to detect malicious patterns in codebases using zero-shot classification. Among them, Llama Guard 3, despite its compact size, performed remarkably well and outperformed larger models: it achieved an accuracy of 98.04% and an F1-score of 91.73% on the evaluated dataset.
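For readers unfamiliar with the two reported metrics, the following minimal sketch shows how accuracy and F1-score are derived from the counts of a binary confusion matrix. The function name and the counts are hypothetical and are not taken from the thesis's dataset; they only illustrate that accuracy and F1 can diverge when benign samples far outnumber malicious ones.

```python
def accuracy_and_f1(tp: int, fp: int, tn: int, fn: int) -> tuple[float, float]:
    """Compute accuracy and F1-score for the positive (malicious) class."""
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return accuracy, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, f1


# Hypothetical counts for an imbalanced evaluation set (assumed values).
acc, f1 = accuracy_and_f1(tp=8, fp=1, tn=90, fn=1)
```

Because F1 ignores true negatives, a classifier evaluated on mostly benign code can show a noticeably lower F1 than accuracy, as in this made-up example.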