ETD

Digital archive of theses defended at the Università di Pisa

Thesis etd-09052023-093510


Thesis type
Master's thesis
Author
MORUCCI, EDOARDO
URN
etd-09052023-093510
Title
Deep Learning-based Classification for Acoustic Noise Monitoring: From Convolutional Neural Networks to Vision Transformers
Department
INGEGNERIA DELL'INFORMAZIONE
Degree programme
ARTIFICIAL INTELLIGENCE AND DATA ENGINEERING
Supervisors
supervisor Prof. Cimino, Mario Giovanni Cosimo Antonio
supervisor Prof. Galatolo, Federico Andrea
supervisor Dott. Marchesotti, Luca
Keywords
  • transformers
  • vision transformers
  • cnn
  • convolutional neural networks
  • audio tagging
  • audio classification
  • noise monitoring
  • noise tagging
  • spectrograms
  • deep learning
Exam session start date
22/09/2023
Availability
Not available for consultation
Release date
22/09/2026
Abstract
This thesis investigates audio tagging and noise recognition, an increasingly significant domain due to its broad applications. The study processes and classifies specific noises for use cases ranging from machinery malfunction detection to urban security. While Convolutional Neural Networks were traditionally employed for such tasks, the trend is now shifting towards Vision Transformers. This work examines both architectures and compares their merits.
The main objective is to test these models on benchmark and real-world datasets, aiming to gauge the true performance of Vision Transformers in real scenarios. This involved creating custom datasets, developing new training pipelines, and conducting a detailed model comparison.
Vision Transformers performed exceptionally on benchmark datasets, but their advantage varied across real-world datasets. This highlights the importance of context in model application and the distinct challenges of real-world situations.
Overall, this thesis offers crucial insights into the complexities and opportunities of audio tagging and noise recognition, setting the stage for future research in this field.