ETD

Digital archive of theses defended at the Università di Pisa

Thesis etd-06122024-164215


Thesis type
Master's thesis (laurea magistrale)
Author
ANIKEEVA, EVGENIIA
URN
etd-06122024-164215
Title
Creation and computational analysis of a corpus of fake news related to the special military operation in Ukraine from February 2022
Department
FILOLOGIA, LETTERATURA E LINGUISTICA
Degree programme
INFORMATICA UMANISTICA
Supervisors
supervisor Prof. Lenci, Alessandro
co-supervisor Dott. Bondielli, Alessandro
Keywords
  • Artificial Intelligence
  • ChatGPT
  • Data Annotation
  • Dataset Collection and annotation
  • Deep Learning
  • Fake News
  • Fake News Detection
  • Language Models
  • Machine Learning
  • Multi-modality
  • Natural Language Processing
  • NLP
Defence session start date
5 July 2024
Availability
Full
Abstract
Automatic fake news recognition is an increasingly widespread research topic across various fields. This thesis proposes a methodology for constructing a corpus of fake news through semi-automatic annotation assisted by artificial intelligence. The project focuses on a single event: the conflict between the Russian Federation and Ukraine that began in February 2022, referred to in Russia as “the Special Military Operation”. Three corpora of fake news were created, in Russian, English and Italian. The thesis analyses how the fake news in the three corpora correlates, that is, which news items about the conflict Western and Russian media each consider fake and which they do not. Microsoft Copilot, an AI assistant based on GPT-4, was also used to analyse the news that Italian and English-language media consider fake, and it labelled certain items as true or fake according to its own assessment. The thesis presents the results of this investigation into the applicability of artificial intelligence to fake news analysis. In particular, it details key aspects of the task, including the creation of a dataset for fake news detection in Italian, Russian and English. The results show that fake news identification remains an open problem. The thesis also discusses and suggests several approaches for addressing it in the future.