Thesis etd-02062026-123950
Thesis type
Master's degree thesis
Author
BURGISI, MARTINA
URN
etd-02062026-123950
Title
YOLO-based Multimodal Rock Detection for a Lunar Rover with Sim-to-Real Domain Bridging
Department
INGEGNERIA DELL'INFORMAZIONE
Degree programme
ARTIFICIAL INTELLIGENCE AND DATA ENGINEERING
Supervisors
Supervisor: Prof. Cimino, Mario Giovanni Cosimo Antonio
Supervisor: Ing. Cowley, Aidan
Keywords
- autonomous driving
- autonomous navigation
- curriculum learning
- depth
- EAC
- ESA
- European Astronaut Centre
- European Space Agency
- hybrid dataset
- LUNA Analog Facility
- lunar surface
- moon
- moon's surface
- multimodal
- object detection
- obstacle detection
- planetary exploration
- rgbd
- robot
- robotic
- rock detection
- rover
- sim-to-real
- space exploration
- style transfer
- YOLO
- YOLO-based
Date of the defense session
27/02/2026
Availability
Not available for consultation
Release date
27/02/2096
Abstract (English)
This work addresses the problem of obstacle detection in lunar scenarios characterized by extreme illumination conditions, investigating the combined role of sim-to-real domain adaptation and multimodal RGB–Depth perception. The study first establishes strong RGB-only baselines trained under controlled illumination, analyzing the impact of dataset composition on generalization performance. Real images, synthetic data, and sim-to-real stylized samples are evaluated to quantify their contribution to detection accuracy. Results show that, while style-transferred synthetic data improves generalization under favorable lighting, RGB-only detectors suffer significant performance degradation under strong illumination variations and low-visibility scenarios.
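The dataset-composition study described above mixes real images, synthetic renders, and style-transferred synthetic samples in controlled proportions. As a minimal illustration, the sketch below draws a training epoch from three such pools with fixed mixing weights; the function name, pool labels, and 50/25/25 split are illustrative assumptions, not the thesis's actual protocol.

```python
import random

def compose_hybrid_dataset(real, synthetic, stylized,
                           weights=(0.5, 0.25, 0.25), size=100, seed=0):
    """Draw a training set from three image pools according to mixing weights.

    `real`, `synthetic`, and `stylized` are lists of sample identifiers;
    the returned list mixes them in the requested proportions.
    """
    rng = random.Random(seed)
    pools = (real, synthetic, stylized)
    counts = [round(w * size) for w in weights]
    counts[0] += size - sum(counts)  # absorb rounding drift into the first pool
    mixed = []
    for pool, n in zip(pools, counts):
        mixed.extend(rng.choices(pool, k=n))  # sample with replacement
    rng.shuffle(mixed)
    return mixed

# Example: a 100-sample epoch drawn 50/25/25 from the three pools
epoch = compose_hybrid_dataset(
    [f"real_{i}" for i in range(40)],
    [f"synt_{i}" for i in range(200)],
    [f"styl_{i}" for i in range(200)],
)
```

Varying the weights (e.g. zeroing out one pool) is one way to quantify each data source's contribution to generalization, as the study does.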
To address these limitations, a mid-fusion RGB–Depth detection architecture is proposed, integrating geometric information into a YOLO-based framework. A progressive fine-tuning strategy is introduced to stabilize training and mitigate catastrophic forgetting when extending RGB-pretrained models to multimodal inputs. Extensive experiments demonstrate that the proposed RGB–Depth approach significantly improves detection robustness compared to RGB-only baselines. In particular, depth information enables reliable object detection in scenarios where photometric cues become unreliable, without degrading performance under standard illumination conditions.
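Two ideas from the paragraph above can be sketched abstractly: mid-fusion concatenates RGB and depth feature maps at an intermediate backbone stage, and progressive fine-tuning unfreezes parameter groups in stages so the RGB-pretrained weights are not disturbed early on. The group names and three-phase schedule below are hypothetical placeholders, not the thesis's actual configuration; feature maps are plain nested lists to keep the sketch dependency-free.

```python
def mid_fusion(rgb_feats, depth_feats):
    """Channel-wise concatenation of two feature maps shaped [C][H][W].

    In a YOLO-style detector this would sit at an intermediate backbone
    stage, after separate RGB and depth stems.
    """
    h, w = len(rgb_feats[0]), len(rgb_feats[0][0])
    assert (len(depth_feats[0]), len(depth_feats[0][0])) == (h, w), \
        "both streams must share spatial resolution at the fusion point"
    return rgb_feats + depth_feats  # fused channels: C_rgb + C_depth

# Hypothetical progressive fine-tuning schedule: which parameter groups
# are trainable in each phase. The new depth stem and fusion layers train
# first; the RGB-pretrained backbone is unfrozen last to limit forgetting.
SCHEDULE = [
    {"depth_stem": True, "fusion": True, "rgb_backbone": False, "head": False},
    {"depth_stem": True, "fusion": True, "rgb_backbone": False, "head": True},
    {"depth_stem": True, "fusion": True, "rgb_backbone": True,  "head": True},
]

def trainable_groups(phase):
    """Return the parameter groups left unfrozen in the given phase."""
    return [g for g, on in SCHEDULE[phase].items() if on]
```

Freezing the pretrained backbone while the randomly initialized depth branch warms up is a common way to stabilize multimodal fine-tuning; the exact grouping and number of phases would depend on the architecture.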
Files
| File name | Size |
|---|---|
| The thesis is not available for consultation. | |