ETD

Digital archive of theses defended at the Università di Pisa

Thesis etd-05272016-181655


Thesis type
Master's thesis
Author
BATTISTA, DANIELE
URN
etd-05272016-181655
Title
Mining Twitter: Graph Analysis of Interactions among Users
Department
INGEGNERIA DELL'INFORMAZIONE
Degree programme
COMPUTER ENGINEERING
Supervisors
supervisor Prof. Avvenuti, Marco
supervisor Prof. Tesconi, Maurizio
supervisor Ing. Del Vigna, Fabio
supervisor Dott. Bellomo, Salvatore
Keywords
  • users interaction graph
  • social media analysis
  • activity network
Defence session start date
20/06/2016
Availability
Full
Abstract
Since the early 2000s, social networking websites have become very popular; these social media allow users to interact and share content through social links. Users of these platforms can often establish hundreds or thousands of social links with other users. While initial studies focused on the topology of social networks, a natural and important aspect of these networks has been neglected: user interactions. These interactions can be monitored to generate knowledge about the users and their relationships with others. Recently, there has been increasing interest in examining the activity network (a network that, once traversed, yields the actual user interactions rather than friendship links) in order to filter and mine patterns or communities.
The goal of this work is to exploit Twitter traffic in order to analyze user interactions. To do so, our work models the tweets posted by users as a list of activities in a graph called the activity network. We then traverse it looking for direct (e.g. mentions of a user, retweets, direct replies) and indirect (e.g. the list of users mentioned in a tweet, users retweeting the same tweet produced by another user) relationships among users in order to create the user interaction graph. We provide a weighting scheme that assigns a value to each interaction found. The resulting graph shows the connections among users and, thanks to its weighted links, highlights the users who have stronger links, such as Verified Accounts, "propaganda users", or cliques, i.e. clusters of users interacting with each other. These entities may be interesting to investigate in several fields, such as Open Source Intelligence or Business Intelligence. This work was developed on a distributed infrastructure able to perform these tasks efficiently.
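As a minimal illustration of the idea described above, the following Python sketch builds a weighted interaction graph from a list of tweet records. It is not the thesis implementation: the record fields (`user`, `mentions`, `reply_to`, `retweet_of`), the function name, and the weight values are all assumptions made for this example, since the abstract does not specify the data model or the actual weighting scheme.

```python
from collections import defaultdict

# Hypothetical weights: the abstract does not state the real values.
WEIGHTS = {
    "reply": 3.0,
    "mention": 2.0,
    "retweet": 1.0,
    "co_mention": 0.5,   # indirect: two users mentioned in the same tweet
    "co_retweet": 0.5,   # indirect: two users retweeting the same tweet
}


def build_interaction_graph(tweets):
    """Return (direct, indirect) weighted edge maps.

    direct   maps a directed pair (source, target) to a weight.
    indirect maps an undirected pair (sorted tuple) to a weight.
    """
    direct = defaultdict(float)
    indirect = defaultdict(float)
    retweeters = defaultdict(list)  # original tweet id -> users seen retweeting it

    for tweet in tweets:
        author = tweet["user"]

        # Direct interaction: reply to another user.
        target = tweet.get("reply_to")
        if target and target != author:
            direct[(author, target)] += WEIGHTS["reply"]

        # Direct interaction: mentions made by the author.
        mentions = [m for m in tweet.get("mentions", []) if m != author]
        for mentioned in mentions:
            direct[(author, mentioned)] += WEIGHTS["mention"]

        # Indirect interaction: users mentioned together in one tweet.
        for i, first in enumerate(mentions):
            for second in mentions[i + 1:]:
                if first != second:
                    pair = tuple(sorted((first, second)))
                    indirect[pair] += WEIGHTS["co_mention"]

        # Retweets: direct edge to the original author, indirect edges
        # to every other user who already retweeted the same tweet.
        original = tweet.get("retweet_of")
        if original:
            if original["user"] != author:
                direct[(author, original["user"])] += WEIGHTS["retweet"]
            for other in retweeters[original["id"]]:
                if other != author:
                    pair = tuple(sorted((author, other)))
                    indirect[pair] += WEIGHTS["co_retweet"]
            retweeters[original["id"]].append(author)

    return dict(direct), dict(indirect)
```

Keeping direct edges directed and indirect edges undirected is a design choice of this sketch; a single merged graph would work equally well if direction is not needed.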
The network analysis leads to some considerations. First, it is necessary to identify all meaningful interactions among users, which typically depend on the social network and the activities performed. Second, many nodes (profiles) with high in-degree are associated with mass media and famous people, so a filtering phase is a crucial step. Finally, it is worth noting that experiments carried out at different times can lead to very different results, since similar topics may not involve the same users at different times.
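The filtering step mentioned above could, for instance, be sketched as a simple in-degree threshold. The abstract does not say how the filtering is actually done, so the function name, the edge representation (a dictionary keyed by directed `(source, target)` pairs), and the threshold criterion are assumptions of this example.

```python
from collections import Counter


def filter_high_indegree(edges, max_indegree):
    """Drop every edge that touches a node whose in-degree exceeds max_indegree.

    edges maps directed (source, target) pairs to weights, so the in-degree
    of a node is the number of distinct sources pointing at it.
    """
    indegree = Counter(target for _, target in edges)
    hubs = {node for node, degree in indegree.items() if degree > max_indegree}
    return {
        (source, target): weight
        for (source, target), weight in edges.items()
        if source not in hubs and target not in hubs
    }
```

For example, with a threshold of 2, a profile receiving interactions from three distinct users would be treated as a hub (such as a mass-media account) and removed together with its edges.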
This work describes the state of the art in network analysis and introduces the architectural design of the system, as well as the analysis performed and the challenges encountered. The results of our analysis lead us to the conclusion that, although this work is still in its preliminary stages, focusing on social interactions is important because it may reveal connections among particular users willing to perform actual activities, which may be of interest to intelligence organizations.

File