
ETD

Digital archive of theses defended at the University of Pisa

Thesis etd-01122014-182737


Thesis type
Master's thesis
Author
MURATORE, LUCA
URN
etd-01122014-182737
Title
Large Scale Data Streaming
Department
INFORMATION ENGINEERING
Degree programme
COMPUTER ENGINEERING
Supervisors
supervisor Prof. Avvenuti, Marco
co-supervisor Dr. Cimino, Mario Giovanni Cosimo Antonio
Keywords
  • Data Streaming
  • Real-time stream processing
  • Storm
Defence session start date
27/02/2014
Availability
Full
Abstract
Over the last few years, applications that require real-time processing of huge
amounts of data have been pushing the limits of traditional data processing infrastructure.
Many applications in domains such as telecommunications, large-scale sensor
networks, finance, online applications, computer network management, and security
require real-time processing of continuous data flows. Systems that perform this kind of computation are usually called Data Stream Management
Systems (DSMSs) or Stream Processing Engines (SPEs).
Traditional Database Management Systems (DBMSs) implement the store-then-process paradigm: data must be stored persistently and indexed before they can be
processed, and processing is asynchronous with respect to data arrival.
In DSMSs, data streams are not stored but are instead processed on the fly using
continuous queries: the query stands permanently over the streaming
data, and results are output continuously.
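The difference between the two paradigms can be illustrated with a minimal Python sketch. It is not taken from the thesis: the tuple layout, the function names, and the threshold filter are all assumptions chosen for illustration.

```python
from typing import Dict, Iterable, Iterator, List

# Hypothetical tuple format for this sketch: {"sensor": str, "value": float}
Reading = Dict[str, object]


def store_then_process(stream: Iterable[Reading], threshold: float) -> List[Reading]:
    """DBMS style: store every tuple first, then run the query once."""
    table = list(stream)  # all data is stored before any processing starts
    return [r for r in table if r["value"] > threshold]


def continuous_query(stream: Iterable[Reading], threshold: float) -> Iterator[Reading]:
    """DSMS style: a standing query that emits a result as each tuple arrives."""
    for reading in stream:  # nothing is stored; each tuple is seen exactly once
        if reading["value"] > threshold:
            yield reading


readings = [
    {"sensor": "a", "value": 1.0},
    {"sensor": "b", "value": 7.5},
    {"sensor": "a", "value": 9.2},
]

batch_result = store_then_process(readings, threshold=5.0)
stream_result = list(continuous_query(iter(readings), threshold=5.0))
```

Both functions return the same tuples for a finite input, but only the generator version can run over an unbounded stream, because it never needs to see the end of the data.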
One of the best-known and most widely used DSMSs is Storm.
Storm is a powerful tool with a simple programming model, but it does not
provide a built-in implementation of stream-oriented operators. This is a strong
limitation, because the user is forced to write a case-specific implementation every
time.
The goal of the work described in this thesis is to build a distributed real-time
computation system on top of Storm, called Enhanced Storm, that provides
the user with built-in relational algebra and database-specific operators for streaming computation.
Enhanced Storm retains Storm's fault tolerance and scalability, so the user gets a generic, high-performance, and easy-to-use system.
Enhanced Storm was developed at the Distributed Systems Laboratory (LSD) of the
Universidad Politecnica de Madrid (UPM) [UPM].
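To give an idea of what reusable relational operators over a stream might look like, the following Python sketch composes a selection, a projection, and a windowed aggregate. It does not reproduce the Enhanced Storm API, which is not described in this abstract; the operator names, the dictionary tuple format, and the count-based tumbling window are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Iterator, List

# Hypothetical tuple format for this sketch: a dict of field name -> value
StreamTuple = Dict[str, object]


def select(stream: Iterable[StreamTuple],
           predicate: Callable[[StreamTuple], bool]) -> Iterator[StreamTuple]:
    """Relational selection: forward only the tuples that satisfy the predicate."""
    for t in stream:
        if predicate(t):
            yield t


def project(stream: Iterable[StreamTuple], fields: List[str]) -> Iterator[StreamTuple]:
    """Relational projection: keep only the named fields of each tuple."""
    for t in stream:
        yield {f: t[f] for f in fields}


def window_sum(stream: Iterable[StreamTuple], key: str, value: str,
               size: int) -> Iterator[Dict[object, float]]:
    """Tumbling count-based window: every `size` tuples, emit the per-key sum.

    A trailing window with fewer than `size` tuples is dropped.
    """
    totals: Dict[object, float] = defaultdict(float)
    seen = 0
    for t in stream:
        totals[t[key]] += t[value]
        seen += 1
        if seen == size:
            yield dict(totals)  # copy before resetting the window state
            totals.clear()
            seen = 0


trades = [
    {"symbol": "X", "price": 10.0, "qty": 1},
    {"symbol": "Y", "price": 20.0, "qty": 2},
    {"symbol": "X", "price": 30.0, "qty": 3},
    {"symbol": "Y", "price": 5.0, "qty": 4},
    {"symbol": "X", "price": 50.0, "qty": 5},
]

# Operators are chained lazily; tuples flow through one at a time.
pipeline = window_sum(
    project(select(trades, lambda t: t["price"] >= 10.0), ["symbol", "price"]),
    key="symbol", value="price", size=2,
)
results = list(pipeline)
```

Because each operator is a generator, the user composes a query from generic building blocks instead of writing a case-specific implementation for every application.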
File