Thesis etd-01122017-120621

Thesis type

PhD thesis

Author

PASSARO, LUCIA

URN

etd-01122017-120621

Title

Distributional Models of Emotions for Sentiment Analysis in Social Media

Scientific disciplinary sector

ING-INF/05

Degree programme

INGEGNERIA DELL'INFORMAZIONE (Information Engineering)

Supervisors

tutor Prof. Lenci, Alessandro
tutor Prof. Marcelloni, Francesco
tutor Prof.ssa Vaglini, Gigliola

Keywords

Emotion Detection
Sentiment Analysis
Emotive Lexicons
Distributional Semantics
Computational Linguistics
NLP

Defense session start date

21/01/2017

Availability

Full

Abstract

With the proliferation of social media, textual emotion analysis is becoming increasingly important. Sentiment Analysis and Emotion Detection are useful in several applications. They can be used, for instance, in Customer Relationship Management to track sentiments towards companies and their services, or in Government Intelligence to collect people's emotions and points of view about government decisions.
It is clear that tracking reputation and opinions without appropriate text mining tools is simply infeasible. Most of these tools are based on sentiment and emotion lexicons, in which lemmas are associated with the sentiment and/or emotions they evoke. However, almost all languages other than English lack high-coverage inventories of this sort.

This thesis presents several sentiment analysis tasks to illustrate the challenges and opportunities in this research area. We review state-of-the-art methods for sentiment analysis and emotion detection and describe the framework we designed to build emotive resources that can be effectively exploited for affective computing on text. One of the main outcomes of this work is ItEM, a high-coverage Italian EMotive lexicon created by exploiting distributional methods. It was built with a three-stage process: the collection of a set of highly emotive words, their distributional expansion, and the validation of the system. Since corpus-based methods reflect the type of corpus from which they are built, we collected a new Italian corpus, FB-NEWS15, in order to create a reliable lexicon. This collection was created by crawling the Facebook pages of the most important Italian newspapers, which typically include a small number of posts written by journalists and a very large number of comments arising from long discussions among readers about the news.
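
The distributional expansion step can be illustrated with a minimal sketch. It is not the thesis implementation: it assumes that each emotion is represented by the centroid of its seed-word vectors and that every word in the vocabulary is scored by cosine similarity to each centroid. The function names, the emotion labels, and the two-dimensional toy vectors are all hypothetical and chosen only for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def expand_lexicon(word_vectors, seeds):
    """Score every word against each emotion centroid.

    word_vectors: dict mapping word -> distributional vector
    seeds: dict mapping emotion label -> list of seed words
    Returns: dict mapping word -> {emotion: cosine score}
    """
    centroids = {}
    for emotion, words in seeds.items():
        seed_vectors = [word_vectors[w] for w in words if w in word_vectors]
        if seed_vectors:  # skip emotions whose seeds are all out of vocabulary
            centroids[emotion] = centroid(seed_vectors)
    return {
        word: {emotion: cosine(vec, c) for emotion, c in centroids.items()}
        for word, vec in word_vectors.items()
    }


# Toy data (assumption): real vectors would be estimated from a corpus.
toy_vectors = {
    "gioia": [1.0, 0.0],
    "felice": [0.9, 0.1],
    "paura": [0.0, 1.0],
    "terrore": [0.1, 0.9],
    "festa": [0.8, 0.2],
}
toy_seeds = {"JOY": ["gioia", "felice"], "FEAR": ["paura", "terrore"]}

lexicon = expand_lexicon(toy_vectors, toy_seeds)
```

In this toy setting, a non-seed word such as "festa" receives a higher score for the hypothetical JOY centroid than for FEAR, which is the kind of graded word-emotion association an expanded lexicon is meant to provide.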

Finally, we describe some experiments on the sentiment polarity classification of tweets. We started from a supervised learning system originally developed for the Evalita 2014 SENTIment POLarity Classification task (Basile et al., 2014) and then explored the possibility of enriching this system with lexical emotive features derived from social media texts.

Files

File name	Size
PASSARO_...eport.pdf	90.38 Kb
PhD_thes...ssaro.pdf	1.85 Mb

ETD

Digital archive of the theses defended at the Università di Pisa
