ETD system

Electronic theses and dissertations repository


Thesis etd-10112018-134117

Thesis type
PhD thesis
Big Data Analytics for Nowcasting and Forecasting Social Phenomena
Scientific-disciplinary sector
Degree programme
supervisor Prof. Pedreschi, Dino
co-supervisor Dr. Rinzivillo, Salvatore
Keywords
  • Nowcasting
  • Big Data Analytics
  • Forecasting
Graduation session start date
Release date
Abstract
One of the most pressing and fascinating challenges of our time is understanding the complexity
of the global interconnected society we inhabit. This connectedness reveals itself in many
phenomena: in the rapid growth of the Internet and the Web, in the ease with which global communication
and trade now take place, and in the ability of news and information as well as
epidemics, trends, financial crises and social unrest to spread around the world with surprising
speed and intensity. Ours is also a time of opportunity to observe and measure how our society
intimately works: Big Data originating from the digital breadcrumbs of human activities promise
to let us scrutinize the ground truth of individual and collective behavior in unprecedented
detail and in real time. Multiple dimensions of our social life have Big Data proxies nowadays. We
can use Big Data as signals, as proxies to forecast and nowcast different phenomena, social
phenomena above all. We can describe and predict how humans and society work.
We can use geolocated data to observe and measure the behavior of a population, to build better
cities tailored to the movement of the population, with lower commuting times and lower
pollution. We can exploit medical data to build classifiers able to help in diagnosing and curing
diseases. We can use industrial data to improve the production processes, and create smarter
and more secure factories. We can do a lot of other incredible and useful things with the support
of data and analytical tools able to extract useful knowledge from raw data.
In this thesis we introduce data-driven as well as model-driven approaches to predict different
phenomena, from epidemics to socio-economic attraction. We use Big Data deriving from our
everyday life as external proxies to nowcast and forecast the evolution of phenomena whose study
relies only on historical data or data that come only with a significant lag. We use supermarket
retail data as an external signal in order to predict the curve of an internal time series, that
of influenza. When the flu season arrives, people start to get sick. Getting sick affects
their everyday life and behavior. This change in behavior should propagate to their purchases
in the supermarket: they will buy products that reflect the fact that they are sick.
We also study human movements, which are inherently massive, dynamic, and complex. Yet
understanding individual mobility patterns could be of fundamental importance for
many different phenomena. We exploit these patterns in order to study and predict
the attraction of different socio-economic factors of the human environment. In our first approach
we study the distribution of the travelling sub-populations of the Tuscany region in Italy across the
airports of the region, and we build a dynamic model for the interplay between the attraction of the availability
of air travel and an airport’s popularity among the population. Based on this model, we forecast
the future evolution of the airports in the region. In our second approach, we identify and
categorize industrial clusters in the Veneto region in Italy by size and population dynamics, and
measure their attraction. We create a real-time system that helps us feel the pulse of a city
and predict the rise of new industrial clusters or the death of existing ones. Finally, we attempt
prediction in social networks, introducing the interaction prediction problem: we try to predict
intra-community interactions, those that may occur within the same community,
and we apply the same approach to predict inter-community interactions, the weak links that
hold together the modular structure of complex networks.
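
The nowcasting idea described in the abstract (an external signal that is available immediately is used to estimate a series that is published with a lag) can be illustrated with a minimal sketch. This is not the method used in the thesis: it assumes a single retail signal and a plain least-squares line, and every name and number in it (fit_line, nowcast, the weekly figures) is hypothetical.

```python
# Illustrative sketch only: synthetic numbers and hypothetical names.
# Official flu counts arrive with a delay, while retail sales of
# flu-related products are known right away, so a model fitted on past
# weeks can estimate ("nowcast") the current, not-yet-published value.


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x with one predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def nowcast(retail_history, flu_history, retail_now):
    """Estimate the current flu value from the current retail signal."""
    a, b = fit_line(retail_history, flu_history)
    return a + b * retail_now


# Hypothetical weekly history: retail signal and the flu counts that
# were eventually published for the same weeks.
retail_sales = [120, 135, 160, 210, 260, 300]
flu_cases = [30, 36, 50, 75, 100, 118]

# This week's retail signal is known; the official flu count is not yet.
retail_this_week = 280
estimate = nowcast(retail_sales, flu_cases, retail_this_week)
print(round(estimate, 1))
```

A real setting would involve many product signals, seasonality, and a proper evaluation on held-out seasons; the sketch only shows the proxy-to-target structure of the problem.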