Thesis etd-01282026-113737
Thesis type
PhD thesis
Author
DALL'ASEN, NICOLA
URN
etd-01282026-113737
Title
Advancing Generative AI for Creative Partnership, Downstream Utility, and Data-Scarce Domains
Scientific disciplinary sector
ING-INF/05 - Information Processing Systems
Degree programme
National PhD in Artificial Intelligence
Supervisors
tutor Prof. Ricci, Elisa
co-supervisor Dr. Wang, Yiming
Keywords
- collaborative
- data scarcity
- diffusion
- downstream
- generative
Defense session start date
19/02/2026
Availability
Full
Abstract
This thesis explores Diffusion Models through three pillars: human-AI collaboration, downstream applications, and data scarcity.
First, we address interactive generative processes. Current text-driven models offer limited fine-grained control. We introduce Collaborative Neural Painting (CNP), reframing image generation as sequential stroke-based painting. This enables users to guide and collaborate with an AI iteratively, centering the human creator in the artistic process.
Second, we investigate practical applications. For medical imaging, we develop MAMBO, a high-resolution mammography model that generates realistic images to augment datasets, improving breast cancer classification. Additionally, we leverage diffusion mechanics for unsupervised video anomaly detection, using reconstruction error on motion representations to identify anomalous events.
Third, we confront data scarcity across scenarios. We present CoRE, a training-free approach enhancing zero-shot VLM classification through contextual retrieval. For few-shot learning, we propose DISEF, synthesizing diverse in-domain images for dataset augmentation combined with efficient VLM fine-tuning. We also introduce Chamfer Guidance, a training-free inference technique that steers models toward better distributional coverage relative to a few real examples, improving generation diversity and downstream classifier performance.
Collectively, this thesis advances generative AI from a passive tool to an active collaborator, practical problem-solver, and effective solution in data-constrained environments.
File
| File name | Size |
|---|---|
| phd_thes..._pdfa.pdf | 170.05 Mb |