Thesis etd-04182024-175520
Thesis type
PhD thesis
Author
TOSONI, FRANCESCO
URN
etd-04182024-175520
Thesis title
Computation-friendly compression of matrices and tries
Academic discipline
INF/01
Course of study
COMPUTER SCIENCE
Supervisors
supervisor Prof. Ferragina, Paolo
co-supervisor Prof. Manzini, Giovanni
Keywords
- green computing
- key-value stores
- lossless compression
- matrix-vector multiplications
- repetitive data compression
- string dictionaries
- tries
Graduation session start date
06/05/2024
Availability
Withheld
Release date
06/05/2027
Summary
In this thesis, we continue the research on repetitive data compression by investigating novel general compression schemes that are data-independent. Although we specifically focus on machine learning and key-value systems, we believe that our methods provide insights applicable to a wider range of application domains.
Our proposed methods adapt one-dimensional general-purpose compression tools to handle complex data structures such as matrices, graphs and tries. These schemes effectively capture redundancies and interdependencies among the data, enabling compression that surpasses what can be achieved through sparsity alone, without compromising quality metrics, such as precision or recall, of the resulting models. Following the “computation-friendly” paradigm, our compressed representations allow operations to be performed directly on the compressed data, in time comparable to that of the same operations on uncompressed data.
File
File name | Size |
---|---|
The thesis is not available. |