ETD

Digital archive of theses discussed at the University of Pisa

Thesis etd-11022021-124740


Thesis type
Master's thesis
Author
CRUDELINI, MIRIAM
URN
etd-11022021-124740
Thesis title
What should I learn next? A Data-driven tool for helping data scientist defining their learning path
Department
INFORMATICA
Course of study
DATA SCIENCE AND BUSINESS INFORMATICS
Supervisors
supervisor Prof. Chiarello, Filippo
Keywords
  • Github
  • MOOCs
  • NLP
  • online learning
  • skills extraction
Graduation session start date
03/12/2021
Availability
Withheld
Release date
03/12/2091
Summary
The aim of the thesis is to design and implement an online course recommendation system for GitHub users that matches courses to each user's skills. In particular, the focus is on users classified as "data scientists" who use Python as their programming language, and on online educational resources related to "Data Science".
After a review of the existing literature, the methodology used to achieve this objective is presented. First, the necessary data on users and online courses are collected; then three methods are proposed for extracting the skills that can be acquired by attending a course. The three approaches rely on different underlying algorithms and therefore lead to different results.
After the accuracy of these results is evaluated, only one of the three models is used in the final stages. A matrix of similarities between courses and users is then built, based on the skills that can be learned from a course and the libraries used by the user.
Finally, the discussion of the results presents the main limitations and advantages of the work.
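The course-user similarity matrix mentioned in the summary can be illustrated with a minimal sketch. The summary does not say which similarity measure or representation the thesis uses, so this example assumes binary skill vectors and cosine similarity, and it assumes that course skills and user libraries share one vocabulary of library names. The function names, course names, user name, and data are all hypothetical.

```python
import math


def to_vector(items, vocabulary):
    """Encode a set of skills/libraries as a binary vector over the vocabulary."""
    return [1 if term in items else 0 for term in vocabulary]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similarity_matrix(course_skills, user_libraries):
    """Return {user: {course: similarity}}.

    Assumption: cosine similarity over binary vectors; the thesis summary
    does not state the actual measure.
    """
    vocabulary = sorted(
        set().union(*course_skills.values(), *user_libraries.values())
    )
    course_vectors = {
        course: to_vector(skills, vocabulary)
        for course, skills in course_skills.items()
    }
    return {
        user: {
            course: cosine(to_vector(libraries, vocabulary), vector)
            for course, vector in course_vectors.items()
        }
        for user, libraries in user_libraries.items()
    }


# Hypothetical data: skills learnable from each course, libraries used by a user.
course_skills = {
    "intro_ml": {"scikit-learn", "pandas"},
    "deep_learning": {"tensorflow", "numpy"},
}
user_libraries = {"alice": {"pandas", "scikit-learn"}}

matrix = similarity_matrix(course_skills, user_libraries)
closest_course = max(matrix["alice"], key=matrix["alice"].get)
```

Here `closest_course` is the course whose skills overlap most with the libraries the user already works with. A recommender aimed at what to learn next might instead rank courses that cover skills the user lacks; the summary does not specify which ranking the thesis adopts.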