ETD

Digital archive of theses discussed at the University of Pisa

 

Thesis etd-08302021-121926


Thesis type
Master's thesis (laurea magistrale)
Author
MINUTELLA, FILIPPO
URN
etd-08302021-121926
Thesis title
Design and implementation of a deep learning system for knowledge graph analysis
Department
INGEGNERIA DELL'INFORMAZIONE
Course of study
ARTIFICIAL INTELLIGENCE AND DATA ENGINEERING
Supervisors
supervisor Prof. Falchi, Fabrizio
supervisor Dott. Manghi, Paolo
supervisor Dott. De Bonis, Michele
supervisor Dott. Messina, Nicola
Keywords
  • knowledge graph
  • graph
  • machine learning
  • scholarly communication
  • deep learning
  • open science
  • graph neural network
  • graph machine learning
Graduation session start date
24/09/2021
Availability
Full
Summary
A great deal of data today takes the form of Knowledge Graphs, i.e. sets of nodes and the relationships between them. Many companies discard these relationships, or do not use them to their full potential, converting naturally graph-like data into tabular data so that it can be stored in conventional databases and analyzed with simple, familiar processes.
This conversion simplifies the data, but it causes a loss of information that cannot always be ignored.
After a review of techniques for performing various tasks on graph data, some of them were applied to the analysis of the data provided by OpenAIRE.
OpenAIRE is a platform that supports Open Science in Europe. It provides a Research Graph, a graph of scientific resources linked to their authors, to the venues where they were published, and to the keywords they contain.
The Research Graph was analyzed with a metapath approach, which makes it possible to analyze a heterogeneous graph by transforming it into a series of homogeneous graphs.
Such graphs are simpler to analyze, and they allow the analysis to focus on a single type of graph element.
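As an illustration only, the metapath idea can be sketched in a few lines of Python. The sketch assumes a toy heterogeneous graph of papers and authors and projects a Paper-Author-Paper metapath onto a homogeneous paper graph. The node names, the `project_metapath` helper, and the shared-neighbor edge weighting are hypothetical and are not taken from the thesis or from the OpenAIRE data.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy data: (paper, author) edges of a heterogeneous graph.
paper_author_edges = [
    ("paper_1", "author_a"),
    ("paper_2", "author_a"),
    ("paper_2", "author_b"),
    ("paper_3", "author_b"),
    ("paper_4", "author_c"),
]


def project_metapath(edges):
    """Project a Source-Intermediate-Source metapath onto a homogeneous graph.

    Two source nodes are linked when they share at least one intermediate
    node; the edge weight counts how many intermediates they share.
    This weighting is an assumption made for the example.
    """
    # Group source nodes (papers) by the intermediate node (author) they touch.
    sources_by_intermediate = defaultdict(set)
    for source, intermediate in edges:
        sources_by_intermediate[intermediate].add(source)

    # Every pair of sources sharing an intermediate gets an edge.
    weights = defaultdict(int)
    for sources in sources_by_intermediate.values():
        for u, v in combinations(sorted(sources), 2):
            weights[(u, v)] += 1
    return dict(weights)


paper_graph = project_metapath(paper_author_edges)
for (u, v), weight in sorted(paper_graph.items()):
    print(u, v, weight)
```

In this sketch the resulting graph contains only paper nodes, so it can be handled by methods designed for homogeneous graphs. A different metapath, for example through keywords instead of authors, would yield a different homogeneous view of the same data.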
A framework was developed to analyze the Research Graph and to highlight anomalies in the dataset.
The framework integrates the metapath approach with a neural network to perform Node Classification and Node Embedding, and its results were compared with those of Graph Neural Network methods from the literature.
The result of our work is a method that leverages node attributes and graph metapaths to perform Node Classification or Node Embedding by identifying the most significant information.
The framework presented in this thesis is scalable, easy to understand, and fast. Moreover, it performs better than other unsupervised methods available in the literature.