
ETD

Digital archive of theses defended at the Università di Pisa

Thesis etd-09232022-104721


Thesis type
Master's thesis
Author
RAZZAGHNOORI, MOHAMMAD
URN
etd-09232022-104721
Title
Extracting Knowledge from Biomedical Data Using Natural language processing
Department
COMPUTER SCIENCE
Degree programme
COMPUTER SCIENCE
Supervisors
supervisor Prof. Sirbu, Alina
Keywords
  • bert
  • biobert
  • biomedical data
  • knowledge graph
  • named entity recognition
  • natural language processing
  • nlp
Session start date
7 October 2022
Availability
Complete
Abstract
Natural Language Processing (NLP) has helped uncover knowledge that was once obscured and has turned the resulting insights into advances that were previously impossible. Advances in NLP have in turn led to discoveries in many fields, of which medicine is arguably the most important: more medical knowledge means more lives saved or made easier. It is therefore important to apply cutting-edge NLP algorithms to the medical field. An interconnected network of biomedical entities such as genes, chemicals, and diseases would give valuable insight into how closely entities are related, and could in turn help discover related items that bear on the questions a user has in mind. The open question of whether such a knowledge graph was available, together with its clear practical value, motivated us to design and then build a complete pipeline of tools that makes the information stored in it easy to discover. Specifically, we implemented a Named Entity Recognition (NER) system using transfer learning with BioBERT, created a knowledge graph of biomedical entities, and built a web platform for interacting with both. Because NER systems have already been implemented and improved upon many times, we consider our contributions to be the generated knowledge graph, the platform with its search capabilities, and a database of biomedical summaries produced as a by-product of graph generation. Building on this work, one could implement a better search engine, possibly one intelligent enough to accept queries in natural language, answer questions, or even suggest insightful articles. The possibilities are endless.