Thesis etd-10142019-100652

Thesis type

PhD thesis

Author

POLLACCI, LAURA

URN

etd-10142019-100652

Title

Superdiversity: (Big) Data analytics at the crossroads of geography, language and emotions

Scientific-disciplinary sector

INF/01

Degree programme

COMPUTER SCIENCE

Supervisors

tutor Prof. Pedreschi, Dino

Keywords

human migration
sentiment analysis
superdiversity

Defence session start date

21/10/2019

Availability

Full

Abstract

In a series of articles, Vertovec focused on the changes and contexts that have affected migratory flows around the world. These demographic changes, which Vertovec terms Superdiversity, are a result of globalisation, and they mark a change in the overall level of migration patterns. Over time, migration routes have grown in both diversity and complexity. The nature of immigration has brought with it a transformative "diversification of diversity". Closely connected with ethnicity and Superdiversity studies, human migration has been a constant throughout human history. In the era of Big Data, every user lives in a hyper-connected world. More than 75% of the world's population has a mobile phone, and over half of these phones are smartphones. The use of social media grows together with the number of connected people. In these social Big Data, User-Generated Content carries a large amount of discriminating information. Language, space and time are three of the best features that can be employed to detect Superdiversity. The main strength of social Big Data is that they typically include, natively, information about several different dimensions.

Starting from these observations, in this thesis we define a measure of Superdiversity, the Superdiversity Index, by adding the emotional dimension and placing it in the context of social Big Data. Our measure is based on an epidemic spreading algorithm that automatically extends the dictionary used in lexicon-based sentiment analysis. It is easily applicable to various languages and suitable for Big Data. Our Superdiversity Index makes it possible to compare diversity across communities from the point of view of the emotional content of language. An important characteristic of the index is its high correlation with immigration rates. For this reason, we believe it can be used as an essential feature in a nowcasting model of migration stocks. Our framework can be applied at a higher temporal and spatial resolution than official statistics. Moreover, we apply our method to a different context and dataset to measure the Superdiversity of the music world.
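
As a rough illustration of the lexicon-extension idea described above, the following Python sketch spreads valence scores from a small seed dictionary to unlabeled words over a word co-occurrence graph. It is not the algorithm from the thesis: the graph construction, the averaging update rule, the function names `build_cooccurrence_graph` and `extend_lexicon`, and the toy texts are all assumptions made for illustration only.

```python
from collections import defaultdict
from itertools import combinations


def build_cooccurrence_graph(texts):
    """Link two words whenever they appear in the same text (assumed rule)."""
    graph = defaultdict(set)
    for text in texts:
        words = set(text.lower().split())
        for a, b in combinations(sorted(words), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def extend_lexicon(graph, seed, rounds=5):
    """Spread valence from seed words to their unlabeled neighbours.

    In each round, every unlabeled word with at least one labeled
    neighbour receives the mean valence of those neighbours. Words
    that already have a score (including seeds) are never overwritten.
    """
    lexicon = dict(seed)
    for _ in range(rounds):
        updates = {}
        for word, neighbours in graph.items():
            if word in lexicon:
                continue
            scores = [lexicon[n] for n in neighbours if n in lexicon]
            if scores:
                updates[word] = sum(scores) / len(scores)
        if not updates:
            break  # nothing left to label
        lexicon.update(updates)
    return lexicon


# Hypothetical toy corpus and seed dictionary.
texts = [
    "great sunny day",
    "sunny beach",
    "awful rainy day",
    "rainy traffic",
]
seed = {"great": 1.0, "awful": -1.0}
lexicon = extend_lexicon(build_cooccurrence_graph(texts), seed)
```

With these toy inputs, "sunny" and "beach" inherit a positive score, "rainy" and "traffic" a negative one, and "day", which co-occurs with both seed words, ends up neutral.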

File

File name	Size
Superdiv...hesis.pdf	3.18 Mb

ETD

Digital archive of theses defended at the Università di Pisa
