ETD

Digital archive of theses defended at the Università di Pisa

Thesis etd-04192018-140830


Thesis type
PhD thesis
Author
GABRIELLI, LORENZO
URN
etd-04192018-140830
Title
TOWARDS BIG DATA METHODS AND TECHNOLOGIES FOR OFFICIAL STATISTICS
Scientific-disciplinary sector
ING-INF/05
Degree programme
INGEGNERIA DELL'INFORMAZIONE (Information Engineering)
Supervisors
tutor Prof. Giannotti, Fosca
tutor Prof. Nanni, Mirco
tutor Prof. Marcelloni, Francesco
Keywords
  • city dynamics
  • data science
  • big data
  • real-time demography
  • wellbeing
  • data mining
  • mobility
  • population estimation
Defense session start date
11 May 2018
Availability
Full
Abstract

This thesis aims to demonstrate concretely how mobile phone data, private vehicle tracks, and scanner data are useful for measuring complex systems.
The three main application areas concern the use of Big Data for i) measuring presence within a territory through data mining techniques, ii) nowcasting the socio-economic development of a country, and iii) measuring the dynamics of cities.

First, a tool for real-time demography was developed, demonstrating how mobile phone data collected over a wide area can be used to produce new Official Statistics indicators. The study showed that Big Data, whether mobile phone data or scanner data, are useful and effective for carrying out a continuous census of the population.

Second, an analytical framework was proposed to evaluate the relations between relevant aspects of human behavior and the well-being of a territory. We found that the diversity of human mobility mirrors some aspects of socio-economic development and well-being. We then showed how mobility features help improve the performance of state-of-the-art methods such as small area estimation.
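
As an illustration of what a "diversity of human mobility" feature might look like, the sketch below computes a normalized Shannon entropy over the locations visited by one individual. This is only one plausible way to quantify diversity, and the abstract does not state that the thesis uses this exact formulation; the function name `mobility_diversity` and the location identifiers are hypothetical.

```python
import math
from collections import Counter


def mobility_diversity(visits):
    """Normalized Shannon entropy of a sequence of visited location IDs.

    Returns a value in [0, 1]: 0 when all visits fall in a single place,
    1 when visits are spread evenly over all distinct places.
    NOTE: illustrative assumption, not necessarily the thesis's measure.
    """
    counts = Counter(visits)
    # With fewer than two distinct locations there is no diversity to measure.
    if len(counts) < 2:
        return 0.0
    total = sum(counts.values())
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    # Divide by the maximum possible entropy for this number of locations.
    return entropy / math.log(len(counts))


# Hypothetical visit logs for two individuals.
routine_user = ["home", "work", "home", "work", "home", "home"]
varied_user = ["home", "work", "gym", "park", "mall", "station"]
routine_score = mobility_diversity(routine_user)
varied_score = mobility_diversity(varied_user)
```

Under this sketch, an individual who visits many places with similar frequency scores higher than one who alternates between two places, and per-individual scores could then be aggregated by territory before being compared with socio-economic indicators.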

Finally, we analyzed how mobility interacts with the territory through the movement of people. We proposed using mobile phone data and GPS tracks to support city government by measuring the attractiveness of cities. Furthermore, we proposed a data analysis approach that identifies mobility functional areas in a completely data-driven way.

The main findings of the thesis concern the statistical and ethical evaluation of results against official sources, and show that the methodologies could be applied in other contexts and with different data sources as well. We showed how the geographic information contained in the data sources is extremely useful for observing our society with a new "microscope". Thanks to the opportunity provided by the varied scientific context of SoBigData, the European Research Infrastructure for Big Data and Social Mining, the Ph.D. also contributed to developing and promoting responsible data science, because the ethical framework is considered part of the CRISP model rather than a problem to be treated separately.