Thesis etd-02062024-200937

Thesis type

Master's thesis

Author

LI, MALIO

URN

etd-02062024-200937

Title

Policy Ensemble with Indirect Interaction

Department

COMPUTER SCIENCE

Degree programme

COMPUTER SCIENCE

Supervisors

supervisor Prof. Lomonaco, Vincenzo
supervisor Dr. Piccoli, Elia

Keywords

policy ensemble
reinforcement learning

Defence session start date

23/02/2024

Availability

Full

Abstract

Reinforcement Learning (RL) is a branch of Machine Learning that teaches agents how to act optimally in a given environment.
Recently, thanks to the rapid growth of Deep Learning, many complex algorithms that take advantage of Neural Networks have been developed, and in many tasks, such as games, RL agents can easily outperform the best human players in the world.
Despite this potential, some more complex tasks remain unsolved or extremely difficult for RL, for example autonomous driving, where the observation space is huge.
Currently, the main strategy in RL is to train a single policy to solve a given task, possibly with knowledge transfer from previous policies to speed up training.
This thesis aims to simplify the given problem by breaking it into many sub-tasks, training optimal sub-policies on these simpler tasks, and finally using a Master Policy that learns how to combine the sub-policies to solve the initial complex task without any prior knowledge.
Each sub-policy can be seen as a skill that the agent can exploit.
The proposed method differs substantially from classical RL algorithms, so additional metrics are used instead of rewards to show how the Master Policy outperforms other state-of-the-art methods.
Finally, the thesis presents studies of how the simple weighted ensemble behaves with different skill sets, and of the importance of each skill in different road scenarios.
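
The abstract does not specify how the Master Policy combines the sub-policies, so the following is only a minimal sketch of one common reading of a "simple weighted ensemble": each skill proposes a probability distribution over actions, and the distributions are mixed using weights derived from per-skill scores. The names `weighted_ensemble`, `keep_lane`, `turn_left`, the three-action layout, and the fixed scores standing in for a learned Master Policy are all hypothetical and are not taken from the thesis.

```python
import math
from typing import Callable, List, Sequence

# Assumption: a sub-policy (skill) maps an observation to a probability
# distribution over a shared, discrete action set.
Policy = Callable[[Sequence[float]], List[float]]


def softmax(scores: Sequence[float]) -> List[float]:
    """Turn arbitrary scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_ensemble(
    obs: Sequence[float],
    sub_policies: Sequence[Policy],
    master_scores: Sequence[float],
) -> List[float]:
    """Mix the sub-policies' action distributions with softmax weights.

    In a full system the scores would come from a learned master policy
    conditioned on `obs`; here they are passed in directly (assumption).
    """
    weights = softmax(master_scores)
    dists = [policy(obs) for policy in sub_policies]
    n_actions = len(dists[0])
    return [
        sum(w * dist[a] for w, dist in zip(weights, dists))
        for a in range(n_actions)
    ]


# Hypothetical skills over three actions: (left, straight, right).
def keep_lane(obs: Sequence[float]) -> List[float]:
    return [0.1, 0.8, 0.1]


def turn_left(obs: Sequence[float]) -> List[float]:
    return [0.7, 0.2, 0.1]


# Equal scores give each skill the same weight.
mixed = weighted_ensemble([0.0], [keep_lane, turn_left], [0.0, 0.0])
# A much higher score lets one skill dominate the mixture.
mostly_lane = weighted_ensemble([0.0], [keep_lane, turn_left], [10.0, -10.0])
```

Because the weights are non-negative and sum to 1, the mixture is itself a valid action distribution, and the weight assigned to each skill gives a direct reading of how much that skill matters in a given situation.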

File

File name	Size
Master_Thesis_6.pdf	1.72 Mb

ETD

Digital archive of theses defended at the Università di Pisa
