Thesis etd-02182026-181933
Thesis type
Master's thesis
Author
RICCI, ETTORE
URN
etd-02182026-181933
Title
A Graph-Encoding Network System for Time-Coherent Scene Understanding
Department
INGEGNERIA DELL'INFORMAZIONE
Degree programme
ARTIFICIAL INTELLIGENCE AND DATA ENGINEERING
Advisors
advisor Prof. Cimino, Mario Giovanni Cosimo Antonio
supervisor Prof. Galatolo, Federico Andrea
Keywords
- ai
- computer vision
- generative ai
- llm
- neural network
- scene understanding
Defense session start date
15/04/2026
Availability
Not available for consultation
Release date
15/04/2096
Abstract (English)
Scene understanding is a fundamental task in computer vision, and recent advances in the field have been driven by the development of Multimodal Large Language Models (MLLMs). However, these models can hallucinate and are not entirely reliable, especially when processing sequences of images such as video data. This work proposes a novel approach that combines the strengths of MLLMs with an object detection model and a graph-based representation of the scene to improve the reliability and coherence of the outputs generated by MLLMs.
Abstract (Italian)
File
| File name | Size |
|---|---|
| The thesis is not available for consultation. | |