Thesis etd-11022021-124740
Thesis type
Master's thesis
Author
CRUDELINI, MIRIAM
URN
etd-11022021-124740
Title
What should I learn next? A Data-driven tool for helping data scientist defining their learning path
Department
COMPUTER SCIENCE
Degree programme
DATA SCIENCE AND BUSINESS INFORMATICS
Supervisors
Supervisor: Prof. Chiarello, Filippo
Keywords
- GitHub
- MOOCs
- NLP
- online learning
- skills extraction
Graduation session start date
03/12/2021
Availability
Not available for consultation
Release date
03/12/2091
Abstract
This thesis designs and implements an online course recommendation system for GitHub users, matching courses to users on the basis of the users' skills. The focus is on users classified as "data scientists" who use Python as their programming language, and on online educational resources related to "Data Science".
After a review of the existing literature, the methodology is presented. First, the necessary data on users and online courses are collected; then three methods are proposed to extract the skills obtainable by attending a course. The three approaches differ in their underlying algorithm and therefore lead to different results.
After the accuracy of these results is evaluated, only one of the models is used in the final stages. A matrix of similarities between courses and users is then built, based on the skills learnable from a course and the libraries used by the user.
Finally, the discussion of the results presents the main limitations and advantages of the work.
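The course-user similarity matrix mentioned in the abstract could be sketched roughly as follows. This is only an illustration under stated assumptions, not the thesis's actual method: the abstract does not name a similarity measure, so cosine similarity is swapped in here; course skills and user libraries are assumed to share one vocabulary of library names; and all course names, user names, weights, and function names are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as {term: weight} dicts."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty profile is treated as similar to nothing
    return dot / (norm_a * norm_b)


def similarity_matrix(users, courses):
    """Return {user: {course: similarity}} for every user-course pair."""
    return {
        user: {course: cosine(libs, skills) for course, skills in courses.items()}
        for user, libs in users.items()
    }


def rank_courses(matrix, user):
    """List course names for one user, highest similarity first (an assumed ranking rule)."""
    return sorted(matrix[user], key=matrix[user].get, reverse=True)


# Hypothetical data: skills learnable from each course, as library names.
courses = {
    "intro_ml": {"scikit-learn": 1.0, "pandas": 1.0},
    "deep_learning": {"tensorflow": 1.0, "numpy": 1.0},
}

# Hypothetical data: how often each user's repositories import a library.
users = {
    "alice": {"pandas": 3.0, "numpy": 1.0},
    "bob": {},
}

matrix = similarity_matrix(users, courses)
ranking = rank_courses(matrix, "alice")
```

Whether a recommender should favour the most similar courses or the ones that fill the largest skill gap is a design decision that the abstract does not settle; the ranking above simply sorts by similarity.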
File
File name | Size |
---|---|
Thesis not available for consultation. |