Thesis etd-09042023-093538
Thesis type
Master's thesis
Author
TEMPESTI, PIETRO
URN
etd-09042023-093538
Title
Development of a GDPR-Compliant Multi-Purpose Data Lake
Department
INGEGNERIA DELL'INFORMAZIONE
Degree programme
ARTIFICIAL INTELLIGENCE AND DATA ENGINEERING
Supervisors
supervisor Prof. Cimino, Mario Giovanni Cosimo Antonio
supervisor Ing. Galatolo, Federico Andrea
supervisor Dott. Frongillo, Dario
Keywords
- cloud
- data lake
- design
- GDPR
Exam session start date
22/09/2023
Availability
Not available
Release date
22/09/2093
Abstract
One of the greatest challenges of the Big Data era is managing the huge amounts of data that companies produce, often stored in different formats and in different locations, in a manner consistent with personal data protection regulations.
The General Data Protection Regulation (GDPR) imposes strict rules on the processing of personal data of European citizens and enterprises, focusing on the data owner's rights rather than on technical measures.
Data Lake systems are emerging as the most widely used solution for handling Big Data flexibly and at lower cost, allowing different data transformation strategies that are applied only at query time.
This thesis aims to develop a modern Data Lake system, hosted in the cloud, that can efficiently and effectively integrate several Big Data sources while complying with the General Data Protection Regulation.
A High-Level Design architecture will be presented and, starting from it, a Low-Level Design architecture will be built on top of the cloud services provided by Google Cloud Platform. Each architecture will be tested against the GDPR principles relevant to this kind of system and compared with the state of the art in the field, in order to verify the system's compliance with the regulation and the impact of the introduced security measures on its overall performance.
File
File name | Size |
---|---|
The thesis is not available. |