ETD

Digital archive of theses defended at the Università di Pisa

Thesis etd-02062026-174824

Thesis type

Master's thesis (laurea magistrale)

URN

etd-02062026-174824

Title

Reinforcement Learning for obstacle avoidance on Lunar surfaces

Department

INGEGNERIA DELL'INFORMAZIONE

Degree programme

ARTIFICIAL INTELLIGENCE AND DATA ENGINEERING

Keywords

autonomous
embedded
exploration
lunar
navigation
observability
partial
reinforcement learning
robotic
rover
surface
system

Defense session start date

27/02/2026

Availability

Not available for consultation

Release date

27/02/2096

Abstract (English)


Autonomous navigation on planetary surfaces is a key enabling technology for future space exploration missions. In particular, lunar environments pose significant challenges due to their unstructured terrain, absence of reliable global maps, limited sensing capabilities, and strict onboard computational constraints. Traditional rover operations are often based on teleoperation, which suffers from high communication latency and limited adaptability, motivating the need for increased autonomy.

This thesis addresses autonomous lunar rover navigation by jointly considering perception and decision making. An obstacle-centric perception pipeline is employed to represent the environment through discrete obstacles, yielding compact and interpretable state representations suitable for local and noisy sensing conditions. On top of this perceptual framework, several reinforcement learning approaches are investigated to evaluate their effectiveness in goal-directed navigation and obstacle avoidance.

A central focus of the work is partial observability, which naturally arises from limited sensor fields of view. The impact of memory in reinforcement learning is systematically analyzed by comparing memory-less policies with memory-based approaches, including recurrent architectures and model-based methods. Training is performed using a curriculum learning strategy to progressively increase task difficulty.

All experiments are conducted in a simulated lunar environment inspired by terrestrial lunar analog facilities, ensuring a fair and controlled evaluation. The results demonstrate that memory-based and model-based policies significantly improve navigation performance under partial observability, highlighting the importance of integrating perception, memory, and planning for robust autonomous rover navigation.

File

File name	Size
The thesis is not available for consultation. Contact the author