ETD

Digital archive of theses defended at the University of Pisa

Thesis etd-06142022-154002


Thesis type
Master's thesis
Author
COSENZA, EMANUELE
URN
etd-06142022-154002
Title
Deep Graph Representations for Polyphonic Multitrack Music Generation
Department
COMPUTER SCIENCE
Degree programme
COMPUTER SCIENCE
Supervisors
supervisor Bacciu, Davide
supervisor Valenti, Andrea
Keywords
  • deep graph networks
  • music generation
  • Variational Autoencoder
Defense session start date
1 July 2022
Availability
Full
Abstract
Graphs, data structures in which entities are connected by relations, represent a reasonable way to model polyphonic multitrack symbolic music data, where notes, chords and
entire musical sections may be linked at different levels of hierarchy by tonal and rhythmic relationships. Nonetheless, few works in the literature consider graph
representations in the context of Deep Learning systems for music generation. This thesis
attempts to bridge this gap while trying to avoid typical problems of generative models
for graphs, leveraging the intrinsic grid-like, hierarchical structure of music data. To do
this, the thesis introduces a new musical representation, named chord-level graph, where
single nodes correspond to the activation of multiple notes in a single timestep for a given
instrument and edges represent temporal relationships between track activations. Paired
with it, the thesis presents a deep Variational Autoencoder that generates the structure
and the content of chord-level graphs separately, one after the other, with a hierarchical
architecture that matches the structural priors of music. With the model trained on data
taken from an existing MIDI dataset, the experiments show that it is able to
reconstruct existing pieces, to generate appealing short and long musical sequences and to
realistically interpolate between them, producing music that is tonally and rhythmically
consistent. Furthermore, the visualization of the embeddings shows that the model is able
to learn known musical concepts on its own. Finally, experiments on user-generated
structures confirm the model's potential for structure-conditioned
generation of music.
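
As an illustration of the chord-level graph described above, the following is a minimal Python sketch and not the thesis implementation. It assumes the input is a mapping from (track, timestep) to the MIDI pitches active there. The function name `build_chord_level_graph` and the two edge types ("next" within a track, "onset" across tracks at the same timestep) are hypothetical choices made only for this example; the abstract states only that edges represent temporal relationships between track activations.

```python
from itertools import combinations


def build_chord_level_graph(activations):
    """Build a toy chord-level graph.

    activations: dict mapping (track, timestep) -> iterable of MIDI pitches.
    Returns (nodes, edges); the edge types are assumptions for illustration.
    """
    # One node per (track, timestep) that has at least one active note.
    nodes = {key: frozenset(p) for key, p in activations.items() if p}
    edges = set()

    # Assumed edge type 1: consecutive activations within the same track.
    by_track = {}
    for track, step in sorted(nodes):
        by_track.setdefault(track, []).append(step)
    for track, steps in by_track.items():
        for a, b in zip(steps, steps[1:]):
            edges.add(((track, a), (track, b), "next"))

    # Assumed edge type 2: different tracks active at the same timestep.
    by_step = {}
    for track, step in sorted(nodes):
        by_step.setdefault(step, []).append(track)
    for step, tracks in by_step.items():
        for t1, t2 in combinations(tracks, 2):
            edges.add(((t1, step), (t2, step), "onset"))

    return nodes, edges


example = {
    ("piano", 0): [60, 64, 67],  # C major triad
    ("bass", 0): [36],
    ("piano", 2): [62, 65, 69],  # D minor triad
    ("bass", 1): [],             # silent timestep: no node is created
}
nodes, edges = build_chord_level_graph(example)
```

In this sketch each node stores the whole set of pitches sounding in one track at one timestep, so the graph structure (which tracks are active when) is kept separate from the content (which notes are played), mirroring the structure/content split mentioned in the abstract.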