ETD

Digital archive of theses defended at the Università di Pisa

Thesis etd-04262022-163702


Thesis type
PhD thesis
Author
CIAMPI, LUCA
URN
etd-04262022-163702
Title
Deep Learning Techniques for Visual Counting
Scientific disciplinary sector
ING-INF/05
Degree program
INGEGNERIA DELL'INFORMAZIONE
Supervisors
tutor Dr. Amato, Giuseppe
tutor Prof. Avvenuti, Marco
tutor Dr. Gennaro, Claudio
Keywords
  • domain adaptation
  • counting objects in images
  • computer vision
  • deep learning
  • convolutional neural networks
  • synthetic data
Defense session start date
03/05/2022
Availability
Full
Abstract
In this dissertation, we investigated and enhanced Deep Learning (DL) techniques for counting objects, such as pedestrians, cells, or vehicles, in still images or video frames. In particular, we tackled the challenge posed by the lack of data needed to train current DL-based solutions. Because labeling budgets are limited, data scarcity remains an open problem: it prevents existing solutions based on supervised learning of neural networks from scaling, and it causes a significant drop in performance at inference time when these algorithms face new scenarios. We introduced solutions that address this issue from several complementary sides: collecting automatically labeled datasets from virtual environments, proposing Domain Adaptation strategies that mitigate the domain gap between the training and test data distributions, and presenting a counting strategy for a weakly labeled data scenario, i.e., one with non-negligible disagreement between multiple annotators. Moreover, we tackled the non-trivial engineering challenges that arise when adopting Convolutional Neural Network-based techniques in environments with limited power resources, introducing solutions for counting vehicles and pedestrians directly onboard embedded vision systems, i.e., devices with constrained computational capabilities that can capture images and process them.
File