
ETD

Digital archive of theses defended at the University of Pisa

Thesis etd-10162018-115120


Thesis type
PhD thesis
Author
VAZIRI, FARZAD
URN
etd-10162018-115120
Title
Social Media Data Analytics for Tourists' Mobility Modeling and Prediction
Scientific-disciplinary sector
INF/01
Course of study
INFORMATICA (Computer Science)
Supervisors
tutor Prof. Pedreschi, Dino
supervisor Dr. Nanni, Mirco
Keywords
  • flickr
  • geotagged photo
  • prediction
  • social media
  • tourism
  • tourist movement
Defense session start date
25/10/2018
Availability
Full
Abstract
Understanding and predicting human mobility is one of the most interesting and challenging objectives of big data analytics, raising many open scientific issues and enabling high-impact applications.
In this context, the mobility of visitors within a touristic area (from small cities to whole countries) represents a very specific yet important case, with its own peculiarities and a high economic and social impact.
In this thesis, we study methods and algorithms for modeling tourists’ mobility in urban settings, with a twofold objective: better understanding the criteria visitors adopt when planning their visit, and predicting their destinations. Both can be valuable tools for city management and for the simulation of what-if scenarios.
The approaches considered in this work are tailored to social network data sources that provide positioning information about their users; in particular, the experimental evaluations are performed on Flickr data.
In the first part of the work, two main criteria driving a user's choice of the next location to visit are considered: the willingness to move far away versus the popularity of the place to visit. Empirical results on Venice, a representative example of a massively touristic city, suggest that both criteria play an important role for most visitors, with some minorities driven almost exclusively by only one of them, and virtually nobody moving randomly.
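The trade-off between distance and popularity described above can be illustrated with a small sketch. It is not the model used in the thesis: the scoring formula, the weights `w_distance` and `w_popularity`, the function names, and the toy candidates are all assumptions made for illustration only.

```python
import math

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat, dlon = lat2 - lat1, lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def next_location_probs(current, candidates, w_distance, w_popularity):
    """Hypothetical choice model (an assumption, not the thesis's formula).

    candidates maps a name to ((lat, lon), popularity), with popularity > 0.
    A larger w_distance penalises far-away places more strongly;
    a larger w_popularity favours popular places more strongly.
    """
    scores = {}
    for name, (coords, popularity) in candidates.items():
        distance = haversine_km(current, coords)
        scores[name] = math.exp(-w_distance * distance + w_popularity * math.log(popularity))
    total = sum(scores.values())
    return {name: score / total for name, score in scores.items()}

# Toy, made-up candidates: a nearby minor spot and a distant popular one.
current = (45.4341, 12.3388)
candidates = {
    "near_minor": ((45.4350, 12.3390), 50),
    "far_popular": ((45.4585, 12.3520), 5000),
}
distance_driven = next_location_probs(current, candidates, w_distance=2.0, w_popularity=0.0)
popularity_driven = next_location_probs(current, candidates, w_distance=0.0, w_popularity=1.0)
```

Under these assumptions, a visitor with a zero popularity weight prefers the nearby spot, one with a zero distance weight prefers the popular one, and intermediate weights blend the two criteria.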
In the second and largest part of the thesis, we compare several sequence prediction approaches on the task of predicting the next point-of-interest in a user’s itinerary. The candidate solutions include standard Hidden Markov Models (HMMs), Sequential Rule Mining (SRM), Recurrent Neural Networks (RNNs), and a Hybrid model that combines the first two methods. Empirical evaluations suggest that HMMs and SRM achieve limited accuracy on this kind of task, yet have complementary strengths, which the Hybrid model successfully exploits to reach significant improvements. Finally, RNNs showed the best performance among all the approaches, in spite of the relatively limited size of the available training dataset.
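To make the next point-of-interest task concrete, the sketch below fits a simple first-order transition-count baseline and measures top-k accuracy. This baseline is an assumption chosen for brevity; it is none of the models compared in the thesis (HMMs, SRM, RNNs, or the Hybrid), and the function names and toy itineraries are hypothetical.

```python
from collections import Counter

def fit_transitions(itineraries):
    """Count how often each POI is directly followed by each other POI."""
    counts = {}
    for sequence in itineraries:
        for previous, following in zip(sequence, sequence[1:]):
            counts.setdefault(previous, Counter())[following] += 1
    return counts

def predict_next(counts, current, k=1):
    """Return up to k most frequent successors of `current` (empty if unseen)."""
    return [poi for poi, _ in counts.get(current, Counter()).most_common(k)]

def top_k_accuracy(counts, itineraries, k=1):
    """Fraction of observed transitions whose true next POI is among the top k."""
    hits = total = 0
    for sequence in itineraries:
        for previous, following in zip(sequence, sequence[1:]):
            total += 1
            hits += following in predict_next(counts, previous, k)
    return hits / total if total else 0.0

# Toy itineraries with placeholder POI labels (not real data).
train = [["A", "B", "C"], ["A", "B", "D"], ["A", "B", "C"]]
test = [["A", "B", "C"], ["B", "D"]]

model = fit_transitions(train)
accuracy_at_1 = top_k_accuracy(model, test, k=1)
accuracy_at_2 = top_k_accuracy(model, test, k=2)
```

The same evaluation loop could wrap any predictor that maps the visited prefix to a ranked list of candidate POIs, which is one way to put different sequence models on a common footing.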