ETD

Digital archive of theses defended at the Università di Pisa

Thesis etd-09112017-103448


Thesis type
Master's thesis
Author
ZACCONE, TOMMASO
URN
etd-09112017-103448
Title
Design and implementation of a framework for profiling city areas from web service data
Department
INGEGNERIA DELL'INFORMAZIONE
Degree programme
COMPUTER ENGINEERING
Supervisors
supervisor Prof. Marcelloni, Francesco
supervisor Prof. Ducange, Pietro
Keywords
  • machine learning
  • smart city
  • clustering
  • data mining
  • city area profiling
  • web framework
Defense session start date
03/10/2017
Availability
Not available for consultation
Release date
03/10/2087
Abstract
This thesis presents the design and implementation of a framework for characterizing, profiling and clustering city areas according to their similarities in terms of activities, social dynamics and living costs.
The data on which the framework operates come from various kinds of online platforms, such as map services, tourist service aggregators, and sale or rental listing websites.
The framework analyzes the information retrieved from these platforms with data mining techniques and lets the user choose the city to be profiled, the data source and the clustering algorithm, view the results graphically, and consult the analysis reports.
We first analyzed which data sources best fit the project and how the information they contain could be accessed.
We then defined the functional requirements of the framework and carried out a feasibility study. We also identified and described the use cases. During the analysis workflow, we defined the classes that describe the main data structures of the framework.
Both the profiling of a city and the development of the framework are based on the concept of the "City Grid", an imaginary grid that divides the city into cells. Finally, we implemented a working prototype of the designed framework.
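The "City Grid" idea can be illustrated with a minimal sketch. The abstract does not say how the grid is defined, so the following assumes a uniform grid over a latitude/longitude bounding box; the class name `CityGrid`, its fields and the method `cell_of` are hypothetical and are not taken from the thesis.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class CityGrid:
    """Hypothetical uniform grid over a latitude/longitude bounding box."""
    min_lat: float
    min_lon: float
    max_lat: float
    max_lon: float
    rows: int
    cols: int

    def cell_of(self, lat: float, lon: float) -> Optional[Tuple[int, int]]:
        """Return the (row, col) of the cell containing the point, or None if it lies outside the grid."""
        inside = (self.min_lat <= lat <= self.max_lat
                  and self.min_lon <= lon <= self.max_lon)
        if not inside:
            return None
        # Scale the offset from the lower-left corner to a cell index.
        row = int((lat - self.min_lat) / (self.max_lat - self.min_lat) * self.rows)
        col = int((lon - self.min_lon) / (self.max_lon - self.min_lon) * self.cols)
        # Points exactly on the upper edges are assigned to the last row/column.
        return min(row, self.rows - 1), min(col, self.cols - 1)
```

With a made-up bounding box such as `CityGrid(45.0, 9.0, 46.0, 10.0, rows=10, cols=10)`, every geolocated record from a data source maps to at most one cell, and that cell then becomes the unit that is profiled.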
The first part of the implementation concerned data collection, data cleaning and data aggregation. We carried out a statistical analysis of the data to identify the most suitable aggregation strategies. For this purpose, we used the Python programming language and some specific libraries for statistical analysis.
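As a rough illustration of the aggregation step, the sketch below groups values by grid cell and summarizes each group. The abstract does not state which aggregation strategy was selected, so the count and median used here are assumptions, and the function name `aggregate_by_cell` is hypothetical.

```python
from collections import defaultdict
from statistics import median


def aggregate_by_cell(records):
    """Group (cell, value) pairs by cell and summarize each group.

    `records` is an iterable of (cell, value) pairs, where `cell` is any
    hashable cell identifier and `value` is a number (for example a price).
    """
    values_by_cell = defaultdict(list)
    for cell, value in records:
        values_by_cell[cell].append(value)
    # Assumed summary: the median is less sensitive to outliers than the mean.
    return {
        cell: {"count": len(values), "median": median(values)}
        for cell, values in values_by_cell.items()
    }
```

The resulting per-cell summaries can serve as the feature values on which cells are later compared.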
The implementation phase then turned to the clustering algorithms to be applied to the data. Implementations of these algorithms are available in the following machine learning libraries: scikit-learn and SciPy. Finally, we tested the prototype using the city of Milan as a case study.
We show an example of GUI use and the results of profiling the city's areas.
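The clustering step mentioned above can be sketched with scikit-learn. The abstract does not name the algorithms that were used, so k-means is chosen here purely as an illustration; the function name `cluster_cells` and the feature layout (one row of numeric features per cell) are assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler


def cluster_cells(features, n_clusters, seed=0):
    """Assign each cell (one row of numeric features) to a cluster label."""
    X = np.asarray(features, dtype=float)
    # Standardize so that features in different units (counts, prices)
    # contribute comparably to the distance computation.
    X_scaled = StandardScaler().fit_transform(X)
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    return model.fit_predict(X_scaled)
```

Cells that receive the same label can then be drawn in the same color on a map of the grid. The numeric label values themselves are arbitrary, and only the grouping is meaningful.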
File