Thesis etd-04142020-093303
Thesis type
Master's thesis
Author
PRIMAVERILI, ANDREA
URN
etd-04142020-093303
Thesis title
Development of a Data Collection and Analysis System for Fake News Detection
Department
INFORMATION ENGINEERING
Course of study
COMPUTER ENGINEERING
Supervisors
supervisor Marcelloni, Francesco
Keywords
- Fake News Detection, Clustering, BERT, Streaming
Graduation session start date
05/05/2020
Availability
Withheld
Release date
05/05/2090
Summary
The diffusion of social media has greatly increased the spread of fake news, and interest in automatic fake news detection tools has grown rapidly over the years. This study addresses the problem by analysing the phenomenon in a streaming context. This scenario limits both the tools that can be used and the data available for building the model. Solutions published in the literature require data that are not available in a live context, such as the number of reactions to social posts or data extracted from the network of users sharing the news. The problem is frequently treated with a supervised approach, which applies poorly in this context: in the literature, datasets are usually built using fact-checking sites or the intervention of a supervisor. Moreover, building a classifier that categorizes events as true or fake news requires knowledge of a ground truth, which is not available in a streaming context, where a news story may not yet have emerged. A data collection library is proposed to gather data from Twitter, extracting the text of the tweets and the content of the articles they share via URL. The features composing the dataset are extracted directly from the text through word embedding techniques, and the resulting data are analysed with techniques for building an unsupervised model that groups the news through density-based clustering methods.
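As an illustration of the density-based grouping mentioned in the summary, the following is a minimal sketch only. The summary does not name the clustering algorithm, so DBSCAN is an assumption chosen here because it is a widely known density-based method. The three-dimensional vectors are hand-made stand-ins for text embeddings, and the names `cosine_distance`, `dbscan`, `eps` and `min_pts` are hypothetical and not taken from the thesis.

```python
import math

NOISE = -1  # label for points that belong to no dense region


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or NOISE.

    A point is a core point when at least min_pts points (itself
    included) lie within distance eps of it.
    """
    labels = [None] * len(points)

    def neighbours(i):
        return [j for j in range(len(points))
                if cosine_distance(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue  # already assigned or marked as noise
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins, does not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_pts:
                queue.extend(reachable)  # core point: keep expanding
    return labels


# Toy vectors standing in for embeddings of seven short texts (assumed data).
vectors = [
    [1.0, 0.1, 0.0], [0.9, 0.2, 0.0], [1.0, 0.0, 0.1],  # first topic
    [0.0, 1.0, 0.1], [0.1, 0.9, 0.0], [0.0, 1.0, 0.0],  # second topic
    [0.0, 0.0, 1.0],                                    # isolated text
]

labels = dbscan(vectors, eps=0.05, min_pts=3)
print(labels)  # → [0, 0, 0, 1, 1, 1, -1]
```

A density-based method of this kind does not need the number of clusters in advance and leaves isolated items unassigned, which fits a setting where no ground truth is available. In a real pipeline the toy vectors would be replaced by embeddings computed from tweet and article text.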
File
| File name | Size |
|---|---|
| Thesis not available for consultation. | |