
ETD

Digital archive of theses defended at the University of Pisa

Thesis etd-04042024-102451


Thesis type
Doctoral (PhD) thesis
Author
GAMBINI, MARGHERITA
URN
etd-04042024-102451
Title
Digital Sentinels: Unraveling the Societal Implications and Social Media Defence Strategies Against Large Language Models
Scientific disciplinary sector
ING-INF/05
Degree programme
INGEGNERIA DELL'INFORMAZIONE
Supervisors
tutor Prof. Avvenuti, Marco
tutor Dott. Tesconi, Maurizio
tutor Dott. Fagni, Tiziano
Keywords
  • conspiracy theories
  • deepfake text detection
  • large language models
  • social media
  • user stance detection
Defence session start date
09/04/2024
Availability
Full
Abstract
OpenAI's ChatGPT, part of the family of Transformer-based Large Language Models (LLMs), has gained popularity for its advanced text-generation capabilities. However, LLMs also raise concerns, especially on social media, where they can amplify the spread of misinformation and potentially threaten democratic processes. Malicious actors may use LLMs to create fake social media personas or to disseminate harmful content. This thesis explores methods to detect LLM-generated texts and harmful content on social media, focusing on the challenges posed by the short texts typical of these platforms. Initial work involved developing detectors for "deepfake" texts, leading to the creation of the TweepFake dataset, a mix of genuine and machine-generated tweets. The most effective detection method achieved around 93.4% accuracy by fine-tuning a pre-trained LLM topped with a neural-network classifier. Further contributions include techniques for identifying the source model of generated texts, emphasizing the role of linguistic features. Additionally, the thesis introduces an unsupervised framework for detecting users' stances on various topics from their Twitter activity, and identifies linguistic patterns that distinguish conspiracy theorists from ordinary users. Together, these efforts aim to improve the detection of fake content and clarify LLMs' societal impact, offering insights into mitigating the risks associated with these powerful technologies.
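The abstract highlights linguistic features as a signal for telling machine-generated short texts apart from human ones. As a minimal sketch of that idea — the specific features below are illustrative assumptions, not the thesis's actual feature set — one can extract a few simple stylometric measurements from a tweet-length text and feed them to any downstream classifier:

```python
import string

def stylometric_features(text):
    """Compute a few simple stylometric features of a short text.

    Illustrative only: average word length, punctuation rate, and
    type-token ratio are common linguistic signals a deepfake-text
    detector might pass to a classifier.
    """
    words = text.split()
    n_words = len(words)
    avg_word_len = (sum(len(w) for w in words) / n_words) if n_words else 0.0
    punct_rate = sum(c in string.punctuation for c in text) / max(len(text), 1)
    type_token_ratio = (len({w.lower() for w in words}) / n_words) if n_words else 0.0
    return {
        "avg_word_len": avg_word_len,
        "punct_rate": punct_rate,
        "type_token_ratio": type_token_ratio,
    }

if __name__ == "__main__":
    print(stylometric_features("LLMs can write tweets, but can we tell? #AI"))
```

In practice such hand-crafted features would complement, not replace, the fine-tuned LLM detector the abstract describes; they remain useful for the source-model attribution task, where surface linguistic style carries much of the signal.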