
ETD

Digital archive of theses discussed at the University of Pisa

 

Thesis etd-08292022-182213


Thesis type
Master's thesis
Author
SERRA, ALESSIO
URN
etd-08292022-182213
Thesis title
Cross-modal learning for sentiment analysis of social media images
Department
INGEGNERIA DELL'INFORMAZIONE
Course of study
ARTIFICIAL INTELLIGENCE AND DATA ENGINEERING
Supervisors
supervisor Dott. Falchi, Fabrizio
supervisor Dott. Tesconi, Maurizio
supervisor Prof. Avvenuti, Marco
supervisor Dott. Carrara, Fabio
supervisor Dott.ssa Gambini, Margherita
Keywords
  • Computer Vision
  • Cross-media learning
  • Natural Language Processing
  • Transformer models
  • Twitter Sentiment Analysis
  • Visual Sentiment Analysis
Graduation session start date
23/09/2022
Availability
Full
Summary
This thesis presents a broad overview of the Sentiment Analysis problem in machine learning, for both textual and visual media, with a focus on the techniques and methods used in the experimental part.
In the experimental part, a large Twitter Visual Sentiment Analysis dataset was built by
crawling ∼3.5 images from the social media platform over three months. This was achieved
without human annotators, which minimized the effort required and allowed the creation of a
very large dataset.
The cross-modal learning approach confirmed that, even though the textual information
associated with images is often noisy and ambiguous, it can still be used to build a reliable
dataset whose size is limited only by the number of images available.
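The idea of transferring labels from text to images can be illustrated with a minimal Python sketch. It is not the thesis's actual pipeline: the `Tweet` record, the `pseudo_label_images` function, the `min_confidence` threshold, and the keyword-based `toy_text_classifier` are all hypothetical names and choices introduced here for illustration. The sketch assumes that a text sentiment classifier returns class probabilities and that only confident predictions are passed on to the attached images.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Tweet:
    """Hypothetical record: one tweet with its text and attached image paths."""
    text: str
    image_paths: List[str]


def pseudo_label_images(
    tweets: List[Tweet],
    text_classifier: Callable[[str], Dict[str, float]],
    min_confidence: float = 0.9,  # assumed threshold, not taken from the thesis
) -> List[Tuple[str, str]]:
    """Assign each image the sentiment predicted for its tweet's text.

    Tweets without images, or whose text prediction is below the
    confidence threshold, are skipped to limit label noise.
    """
    labeled: List[Tuple[str, str]] = []
    for tweet in tweets:
        if not tweet.image_paths:
            continue
        probabilities = text_classifier(tweet.text)
        label, confidence = max(probabilities.items(), key=lambda kv: kv[1])
        if confidence < min_confidence:
            continue
        for path in tweet.image_paths:
            labeled.append((path, label))
    return labeled


def toy_text_classifier(text: str) -> Dict[str, float]:
    """Keyword-based stand-in for a real text sentiment model."""
    lowered = text.lower()
    if "love" in lowered:
        return {"positive": 0.95, "neutral": 0.04, "negative": 0.01}
    if "hate" in lowered:
        return {"positive": 0.02, "neutral": 0.03, "negative": 0.95}
    return {"positive": 0.34, "neutral": 0.40, "negative": 0.26}
```

In this sketch no human annotation is involved: the only supervision comes from the text model, so the number of labeled images grows with the number of crawled tweets that pass the confidence filter.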
This large dataset can help future research train robust visual models. Its size is
particularly advantageous because the number of parameters of current SOTA models is growing
exponentially, along with the amount of data they need to avoid overfitting.
The effectiveness of the dataset, T4SA 2.0, was tested by fine-tuning the Vision Transformer
model, which achieved strong results on other manually annotated visual datasets, even
surpassing the current state of the art.