ETD

Digital archive of the theses defended at the Università di Pisa

Thesis etd-10232017-185215


Thesis type
Master's thesis
Author
FARAONI, ERICA
URN
etd-10232017-185215
Title
Design and implementation of a tool for hospital data analytics
Department
INGEGNERIA DELL'INFORMAZIONE
Degree programme
COMPUTER ENGINEERING
Supervisors
supervisor Prof. Marcelloni, Francesco
supervisor Prof. Ducange, Pietro
Keywords
  • spark framework
  • pattern mining
  • health care
  • data mining
Graduation session start date
24/11/2017
Availability
Not available for consultation
Release date
24/11/2087
Abstract
The massive amount of data collected every day by healthcare institutions is an important resource for understanding the quality of the service provided to patients and how to improve it. This raw data can be turned into insights for stakeholders by means of data mining techniques. These methods, at the intersection of machine learning, statistics, and database systems, extract information from a data set and transform it into an understandable structure for further use. The methods used in the healthcare field include frequent pattern mining, clustering, classification, and outlier detection.

This study investigates the efficacy of frequent pattern mining and sequential pattern mining in providing information on the behavior of patients in a hospital. The analysis involved transforming raw data collected at a real hospital into basket-like sequences of items for each patient and applying algorithms such as FPGrowth and PrefixSpan on the Spark distributed framework.
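
The transformation into basket-like sequences can be illustrated with a minimal Python sketch. The record layout `(patient_id, day, event)` and every value in it are hypothetical, invented for illustration only; the schema actually used in the thesis is not described here.

```python
from collections import defaultdict

# Hypothetical raw records: (patient_id, day, event).
# All values are invented for illustration; they are not hospital data.
records = [
    ("p1", 1, "admission"),
    ("p1", 1, "blood_test"),
    ("p1", 3, "x_ray"),
    ("p2", 2, "admission"),
    ("p2", 2, "x_ray"),
    ("p2", 2, "blood_test"),
]


def to_baskets(records):
    """One basket per patient: the set of distinct events, order ignored."""
    baskets = defaultdict(set)
    for patient, _day, event in records:
        baskets[patient].add(event)
    return {p: sorted(items) for p, items in baskets.items()}


def to_sequences(records):
    """One sequence per patient: a day-ordered list of itemsets."""
    by_day = defaultdict(lambda: defaultdict(set))
    for patient, day, event in records:
        by_day[patient][day].add(event)
    return {
        p: [sorted(days[d]) for d in sorted(days)]
        for p, days in by_day.items()
    }


baskets = to_baskets(records)
sequences = to_sequences(records)
```

Baskets of distinct items are the kind of input a frequent pattern miner such as FPGrowth works on, while day-ordered lists of itemsets are the kind of input a sequential pattern miner such as PrefixSpan works on.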

The results showed that narrowing the scope of the mining process by applying filters increases the significance of the set of patterns found. An important factor in interpreting the results was the use of interestingness measures such as lift, the Kulczynski measure, and the imbalance ratio. A more precise and focused analysis is possible if the filters are designed by medical personnel with the appropriate technical knowledge.
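
As a rough illustration of the three interestingness measures named above, the following Python sketch computes them from support counts using their standard textbook definitions. The function names and the example counts are assumptions made for this sketch, not values or code from the thesis.

```python
def lift(n_a, n_b, n_ab, n_total):
    """lift(A, B) = P(A and B) / (P(A) * P(B)); 1 means independence."""
    return (n_ab * n_total) / (n_a * n_b)


def kulczynski(n_a, n_b, n_ab):
    """Kulc(A, B) = average of the confidences P(B|A) and P(A|B)."""
    return 0.5 * (n_ab / n_a + n_ab / n_b)


def imbalance_ratio(n_a, n_b, n_ab):
    """IR(A, B) = |sup(A) - sup(B)| / (sup(A) + sup(B) - sup(A and B))."""
    return abs(n_a - n_b) / (n_a + n_b - n_ab)


# Hypothetical counts: 1000 baskets, A in 100, B in 200, both in 50.
example_lift = lift(100, 200, 50, 1000)          # 2.5
example_kulc = kulczynski(100, 200, 50)          # 0.375
example_ir = imbalance_ratio(100, 200, 50)       # 0.4
```

Unlike lift, the Kulczynski measure and the imbalance ratio do not depend on the total number of baskets, so they are unaffected by the many baskets that contain neither item.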

The results of the mining process, along with a statistical analysis, are presented in a web application so that they can be easily viewed and understood by medical personnel.
File