ETD

Digital archive of theses defended at the Università di Pisa

Thesis etd-02062026-174824


Thesis type
Master's thesis
Author
MENCHINI, LORENZO
URN
etd-02062026-174824
Title
Reinforcement Learning for obstacle avoidance on Lunar surfaces
Department
INGEGNERIA DELL'INFORMAZIONE
Degree programme
ARTIFICIAL INTELLIGENCE AND DATA ENGINEERING
Supervisors
supervisor Prof. Cimino, Mario Giovanni Cosimo Antonio
supervisor Ing. Cowley, Aidan
Keywords
  • autonomous
  • embedded
  • exploration
  • lunar
  • navigation
  • observability
  • partial
  • reinforcement learning
  • robotic
  • rover
  • surface
  • system
Graduation session start date
27/02/2026
Availability
Not available for consultation
Release date
27/02/2096
Abstract (English)
Autonomous navigation on planetary surfaces is a key enabling technology for future space exploration missions. In particular, lunar environments pose significant challenges due to their unstructured terrain, absence of reliable global maps, limited sensing capabilities, and strict onboard computational constraints. Traditional rover operations are often based on teleoperation, which suffers from high communication latency and limited adaptability, motivating the need for increased autonomy.

This thesis addresses autonomous lunar rover navigation by jointly considering perception and decision-making. An obstacle-centric perception pipeline represents the environment as a set of discrete obstacles, yielding compact and interpretable state representations suited to local and noisy sensing conditions. On top of this perceptual framework, several reinforcement learning approaches are investigated to evaluate their effectiveness in goal-directed navigation and obstacle avoidance.

A central focus of the work is partial observability, which naturally arises from limited sensor fields of view. The impact of memory in reinforcement learning is systematically analyzed by comparing memory-less policies with memory-based approaches, including recurrent architectures and model-based methods. Training is performed using a curriculum learning strategy to progressively increase task difficulty.

All experiments are conducted in a simulated lunar environment inspired by terrestrial lunar analog facilities, ensuring a fair and controlled evaluation. The results demonstrate that memory-based and model-based policies significantly improve navigation performance under partial observability, highlighting the importance of integrating perception, memory, and planning for robust autonomous rover navigation.
File