Thesis etd-10312023-160905
Thesis type
Master's thesis
Author
GOMEZ GOMEZ, MARSHA
Email address
m.gomezgomez@studenti.unipi.it, marsha_gomez0609@hotmail.com
URN
etd-10312023-160905
Title
Leveraging financial reports for improving credit score analysis via pre-trained language models
Department
INGEGNERIA DELL'INFORMAZIONE
Degree programme
ARTIFICIAL INTELLIGENCE AND DATA ENGINEERING
Supervisors
supervisor Prof. Cimino, Mario Giovanni Cosimo Antonio
supervisor Dr. Parola, Marco
tutor Dr. Cristantielli, Andrea
Keywords
- AIDA-dataset
- BERT
- Electra
- financial reports
- fintech
- HuggingFace
- long document
- machine learning
- natural language processing
- NLP
- Shap
- text classification
- text explainability
Exam session start date
17/11/2023
Availability
Not available for consultation
Release date
17/11/2093
Abstract
This thesis investigates the feasibility and effectiveness of using the public data on the AIDA portal (Bureau van Dijk) as a source of 10-K reports, specifically the "Chamber of Commerce Record" for businesses. In business credit score analysis, machine learning techniques have gained significant traction for their potential to improve accuracy and efficiency. Building an initial collection of raw financial documents allows credit score analysis to be approached more comprehensively.
The primary objective of this study is to parse the collected financial reports and extract the bankruptcy procedures that companies have undergone within the previous fiscal year. This makes it possible to build a labeled dataset of active (non-default) and bankruptcy (default) cases. Natural Language Processing (NLP) tools play a crucial role in parsing the financial reports and extracting relevant features.
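The labeling step described above can be illustrated with a minimal Python sketch. It is not the thesis's method: the keyword list, the function names `label_report` and `build_dataset`, and the plain substring matching are all assumptions made for illustration. Each report text is marked 1 when a bankruptcy procedure is mentioned and 0 when the company appears active.

```python
import re

# Hypothetical keyword list: Italian insolvency-procedure terms assumed
# for illustration only; the terms actually used in the thesis are not given.
BANKRUPTCY_TERMS = ("fallimento", "concordato preventivo", "liquidazione giudiziale")

# One case-insensitive pattern matching any of the terms above.
_PATTERN = re.compile("|".join(re.escape(t) for t in BANKRUPTCY_TERMS), re.IGNORECASE)


def label_report(text: str) -> int:
    """Return 1 if a bankruptcy procedure is mentioned, else 0 (active company).

    Naive substring matching: a negated mention would still be labeled 1.
    """
    return 1 if _PATTERN.search(text) else 0


def build_dataset(reports):
    """Pair each report text with its label."""
    return [(text, label_report(text)) for text in reports]
```

A real parser would need to handle negations, dates (to restrict matches to the previous fiscal year), and the structure of the source documents, none of which this sketch attempts.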
The extracted features are then used to train a classifier that predicts a company's probability of default. By applying machine learning algorithms, we aim to improve the accuracy and precision of credit score predictions, giving financial institutions and businesses more reliable risk assessment.
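To make the idea of a classifier that outputs a probability concrete, here is a small self-contained Python sketch. It deliberately swaps in a bag-of-words logistic regression trained by stochastic gradient descent, not the pre-trained language models named in the title, and the four toy texts, the labels, and all function names are invented for illustration.

```python
import math
from collections import Counter


def featurize(text, vocab):
    """Bag-of-words counts over a fixed vocabulary."""
    counts = Counter(text.lower().split())
    return [float(counts[w]) for w in vocab]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logreg(X, y, lr=0.5, epochs=200):
    """Fit logistic regression weights with plain per-sample gradient descent."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = p - yi  # gradient of the log loss with respect to the logit
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b


def predict_proba(text, vocab, w, b):
    """Estimated probability that the text belongs to class 1 (bankruptcy)."""
    x = featurize(text, vocab)
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


# Toy data, invented for illustration: 1 = bankruptcy, 0 = active.
TEXTS = [
    "company active regular filings",
    "bankruptcy procedure opened",
    "active company",
    "liquidation bankruptcy court",
]
LABELS = [0, 1, 0, 1]

VOCAB = sorted({word for t in TEXTS for word in t.lower().split()})
W, B = train_logreg([featurize(t, VOCAB) for t in TEXTS], LABELS)
```

Calling `predict_proba("bankruptcy court", VOCAB, W, B)` returns a value above 0.5 on this toy data, while `predict_proba("active company", VOCAB, W, B)` falls below it. With a transformer-based model the feature extraction and the classifier head would be replaced, but the output would still be read as a probability in the same way.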
File
File name | Size |
---|---|
The thesis is not available for consultation. |