
ETD

Digital archive of theses defended at the University of Pisa

Thesis etd-11162016-030435


Thesis type
Master's thesis
Author
PAUCAR SEDANO, FLORENCIO
URN
etd-11162016-030435
Title
FORECASTING PERUVIAN PRESIDENTIAL ELECTION USING SUPERVISED AGGREGATED SENTIMENT ANALYSIS
Department
COMPUTER SCIENCE
Degree programme
INFORMATICA PER L'ECONOMIA E PER L'AZIENDA (BUSINESS INFORMATICS)
Supervisors
supervisor Prof. Pedreschi, Dino
co-supervisor Prof. Venturini, Rossano
tutor Dr. Cresci, Stefano
Keywords
  • Peruvian election
  • Polls analysis
  • Prediction elections
  • Readme Method
  • SASA method
  • Sentiment Analysis
  • Survey analysis
  • Twitter analysis
Defence session start date
02/12/2016
Availability
Full
Abstract
This thesis analyzes the political sentiment of Twitter data to forecast the results of the Peruvian presidential election using the Supervised Aggregated Sentiment Analysis (SASA) method. Its objectives are to analyze the behaviour of the polls, predict the results, and verify the accuracy of the SASA method in the second round of the presidential election in Peru, held on 5 June 2016 between two candidates: Ms. Keiko Fujimori of Fuerza Popular and Mr. Pedro Pablo Kuczynski of the Peruanos por el Kambio party. The SASA method accounts for the humor, double meanings, and sarcasm present in Twitter data; it can also filter out noise because it focuses on estimating the aggregate distribution of opinions rather than on classifying each individual text.
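The idea of estimating an aggregate distribution instead of classifying each text can be illustrated with a minimal sketch. This is not the thesis's SASA implementation: the function name `estimate_share`, the three feature profiles, and all frequencies are hypothetical. The sketch assumes that the frequency of each feature profile in the unlabeled corpus is a mixture of its frequencies in two hand-labeled categories, and it recovers the mixing share by least squares.

```python
def estimate_share(freq_all, freq_a, freq_b):
    """Least-squares estimate of the share of category A in an unlabeled corpus.

    Each argument lists relative frequencies of the same feature profiles:
    in the unlabeled corpus, in hand-labeled texts of category A, and in
    hand-labeled texts of category B. Assumes freq_all is approximately
    share * freq_a + (1 - share) * freq_b.
    """
    numerator = 0.0
    denominator = 0.0
    for y, a, b in zip(freq_all, freq_a, freq_b):
        numerator += (y - b) * (a - b)
        denominator += (a - b) ** 2
    if denominator == 0.0:
        raise ValueError("the two categories have identical profiles")
    # Clip to [0, 1] so the result is a valid proportion.
    return min(1.0, max(0.0, numerator / denominator))


# Hypothetical toy data: three feature profiles, true share of A set to 0.5.
freq_a = [0.6, 0.3, 0.1]
freq_b = [0.2, 0.3, 0.5]
true_share = 0.5
freq_all = [true_share * a + (1 - true_share) * b for a, b in zip(freq_a, freq_b)]

share_a = estimate_share(freq_all, freq_a, freq_b)
print(f"estimated share of category A: {share_a:.2f}")
```

No individual text receives a label here; only the overall proportion is estimated, which is why a few ambiguous or sarcastic texts have limited effect on the result.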
The SASA forecast obtained on 27 May (Peruvian law forbids publishing electoral polls during the week before election day) and published before election day on bigdatatales.com gave 50.04% to Mr. Pedro Pablo Kuczynski and 49.96% to Ms. Keiko Fujimori. The official result published by the Oficina Nacional de Procesos Electorales was 50.12% for Kuczynski and 49.88% for Fujimori. In addition, a post-election SASA analysis using Twitter data up to 5 June gave 50.09% for Kuczynski and 49.91% for Fujimori. The forecast differed from the official result by 0.08 percentage points, and the post-election analysis differed by 0.03 percentage points.
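The reported differences can be checked directly from the figures quoted above. The short sketch below only reproduces that arithmetic; the helper name `abs_error` and the dictionary layout are illustrative choices.

```python
# Vote shares in percent, as quoted in the abstract.
official = {"Kuczynski": 50.12, "Fujimori": 49.88}
forecast = {"Kuczynski": 50.04, "Fujimori": 49.96}
post_election = {"Kuczynski": 50.09, "Fujimori": 49.91}


def abs_error(estimate, reference):
    """Absolute difference per candidate, in percentage points."""
    return {
        name: round(abs(estimate[name] - reference[name]), 2)
        for name in reference
    }


forecast_error = abs_error(forecast, official)       # 0.08 for both candidates
post_error = abs_error(post_election, official)      # 0.03 for both candidates
print(forecast_error, post_error)
```

Because there are only two candidates and the shares sum to 100%, the error is the same for both.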
This thesis verifies the accuracy of the SASA method in emerging Twitter communities such as Peru's, with approximately four million Twitter users and twenty-two million voters. The difference of 0.08 percentage points between the SASA forecast and the official results supports the correctness of the Supervised Aggregated Sentiment Analysis method. Moreover, this analysis sheds light on the influence of one of the most widely used social networks on the Peruvian electoral campaign. The methodology is described in the remainder of this thesis.
File