Thesis etd-05112025-215810
Thesis type
Master's thesis
Author
CAPRIOLI, LUCA
URN
etd-05112025-215810
Title
Development of Lunar Rover Steering Systems based on Synthetic Computer Vision and Reinforcement Learning
Department
INGEGNERIA DELL'INFORMAZIONE
Degree programme
ARTIFICIAL INTELLIGENCE AND DATA ENGINEERING
Supervisors
Supervisor Prof. Cimino, Mario Giovanni Cosimo Antonio
Supervisor Dr. Cowley, Aidan
Keywords
- Autonomous lunar navigation
- Behavioral cloning
- Canadian Space Agency
- CNN-LSTM
- Deep learning for robotics
- Deep Q-Network (DQN)
- ESA
- LiDAR-based control
- Multi-view perception
- Reinforcement learning
- Rover obstacle avoidance
- Sensor fusion
- Simulation-to-reality transfer
- Temporal modeling
- Unreal Engine
- VORTEX simulation
Defense session date
27/05/2025
Availability
Not available for consultation
Release date
27/05/2095
Abstract
As humanity prepares to return to the Moon and establish a long-term presence, the demand for autonomous systems capable of navigating unstructured, GPS-denied environments becomes critical. Lunar rovers, tasked with exploration, science, and logistics, must traverse challenging terrain where human teleoperation is limited by communication latency and environmental unpredictability. This thesis investigates two complementary machine learning approaches for autonomous lunar rover navigation and obstacle avoidance: supervised learning via computer vision-based behavioral cloning, and reinforcement learning (RL) using LiDAR data in a simulated environment.
The first approach focuses on a CNN-LSTM architecture trained via behavioral cloning. The system learns to imitate human driving behavior by mapping sequences of images and rover state information to steering commands. A multi-branch convolutional neural network processes three visual perspectives (front, left, right), which are then concatenated with state inputs and passed to a Long Short-Term Memory (LSTM) module to capture temporal dependencies. The model was trained using a combination of real-world data from the Canadian Space Agency (CSA) and synthetic data generated in ESA’s VORTEX simulation framework using Unreal Engine. Domain randomization and data augmentation were used to enhance generalization. Experiments revealed that combining synthetic and real data improves prediction accuracy and robustness, while the temporal modeling of the LSTM reduces erratic steering behaviors.
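The multi-branch architecture described above can be sketched in PyTorch as follows. This is an illustrative reconstruction, not the thesis's actual network: layer sizes, the state dimension, and the single-angle steering output are all assumptions; only the overall shape (three per-view CNN branches, concatenation with rover state, an LSTM over the sequence, a steering head) follows the abstract.

```python
import torch
import torch.nn as nn

class MultiViewCNNLSTM(nn.Module):
    """Sketch of a multi-branch CNN-LSTM behavioral-cloning model:
    front/left/right camera views -> per-view CNN branches -> features
    concatenated with rover state -> LSTM over time -> steering command.
    All sizes are illustrative."""

    def __init__(self, state_dim=4, hidden=128):
        super().__init__()

        def branch():
            # Small per-view feature extractor (assumed, not from the thesis).
            return nn.Sequential(
                nn.Conv2d(3, 16, 5, stride=2), nn.ReLU(),
                nn.Conv2d(16, 32, 3, stride=2), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            )

        self.branches = nn.ModuleList([branch() for _ in range(3)])
        self.lstm = nn.LSTM(32 * 3 + state_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)  # predicted steering angle

    def forward(self, views, state):
        # views: (B, T, 3 cameras, C, H, W); state: (B, T, state_dim)
        B, T = state.shape[:2]
        feats = []
        for i, net in enumerate(self.branches):
            x = views[:, :, i].reshape(B * T, *views.shape[3:])
            feats.append(net(x).reshape(B, T, -1))
        seq = torch.cat(feats + [state], dim=-1)  # fuse vision + state
        out, _ = self.lstm(seq)                   # temporal modeling
        return self.head(out[:, -1])              # steering at last step
```

Training under behavioral cloning would then minimize a regression loss (e.g. MSE) between the predicted steering and the human driver's recorded commands.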
The second approach applies Deep Q-Networks (DQN) for end-to-end obstacle avoidance, using LiDAR readings as state input and discrete steering actions as outputs. Implemented in simulation environments such as Gazebo, the agent learns navigation strategies by interacting with its environment, receiving positive rewards for safe progress and penalties for collisions. Unlike behavioral cloning, this RL-based method does not rely on human demonstrations and can adapt to new environments through trial and error. Although it converges more slowly and requires careful reward shaping, the reinforcement learning policy exhibited strong generalization in unseen scenarios after sufficient training.
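The DQN setup above can be sketched minimally: a Q-network mapping a flattened LiDAR scan to one Q-value per discrete steering action, standard epsilon-greedy exploration, and a shaped reward. The beam count, the three-action set (left/straight/right), and the reward constants are illustrative assumptions, not values from the thesis.

```python
import random
import torch
import torch.nn as nn

class LidarDQN(nn.Module):
    """Q-network sketch: LiDAR scan in, one Q-value per discrete
    steering action out. Sizes and action set are assumptions."""

    def __init__(self, n_beams=24, n_actions=3):  # e.g. left/straight/right
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_beams, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, n_actions),
        )

    def forward(self, scan):
        return self.net(scan)

def epsilon_greedy(q_net, scan, epsilon, n_actions=3):
    """Standard DQN exploration: random action with prob. epsilon,
    otherwise the action with the highest predicted Q-value."""
    if random.random() < epsilon:
        return random.randrange(n_actions)
    with torch.no_grad():
        return int(q_net(scan).argmax().item())

def reward(progress, collided):
    """Shaped reward in the spirit of the abstract: reward safe
    progress, penalize collisions heavily (constants illustrative)."""
    return -100.0 if collided else 1.0 * progress
```

A full training loop would add an experience-replay buffer and a periodically updated target network, the two stabilization mechanisms standard in DQN.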
By exploring both paradigms—supervised imitation learning and reinforcement learning—this thesis offers a comparative analysis of their strengths, limitations, and suitability for lunar robotics. Behavioral cloning benefits from data efficiency and rapid deployment when human demonstrations are available, but struggles in out-of-distribution settings. In contrast, reinforcement learning shows promise in adaptive decision-making and long-term planning but faces challenges in training stability and sample efficiency.
This work also emphasizes the importance of sim-to-real transfer, simulation fidelity, and multi-modal perception in the development of autonomous systems for planetary exploration. The integration of computer vision, spatiotemporal reasoning, and sensor-based learning highlights the need for hybrid systems that combine the robustness of learned perception with the adaptability of reinforcement-based control.
Files

| Filename | Size |
|---|---|
| The thesis is not available for consultation. | |