## Tesi etd-02082012-012143 |

Thesis type

Tesi di laurea magistrale

Author

FARRUGGIA, ANDREA

URN

etd-02082012-012143

Title

Multi-Objective Optimization Design for LZSS

Struttura

SCIENZE MATEMATICHE, FISICHE E NATURALI

Corso di studi

INFORMATICA E NETWORKING

Supervisors

**relatore**Prof. Ferragina, Paolo

**relatore**Prof. Frangioni, Antonio

**controrelatore**Prof. Prencipe, Giuseppe

Parole chiave

- data compression
- lzss
- multiobjective compression

Data inizio appello

24/02/2012;

Consultabilità

Parziale

Data di rilascio

24/02/2052

Riassunto analitico

In this thesis we explore the idea of controlling the LZSS compression ratio/decompression speed trade-off in a principled way.

Our approach lies on two ideas.

The first one is to model a LZSS parsing of a text T as a path on a (weighted) graph G(T), which has a vertex for each character in T and an edge for each phrase in the dictionary.

The second idea is to model explicitly the amount of compression space and decompression time taken up by a parsing through resource models.

Both ideas are not new: the notion of transposing LZ77 parsings to paths in a suitably defined graph G has been originally illustrated by Schuegraf, while Ferragina et al. labeled each edge of G with the cost in bits of the associated codeword.

In this thesis, we further extend G by labeling each edge with the encoding size in bits and the decoding time of the codeword, as given by the resource models.

This model allows us to define, in a precise way, the time-constrained LZSS parsing problem, namely, the problem of determining the parsing which minimizes the compressed size given a bound on the decompression time, as a constrained single source shortest path problem on G.

The proposed strategy to attack the problem is an heuristic based on the computation of the Lagrangian dual, i.e., the computation of the “best” Lagrangian relaxation of the time constraint.

In this way we reduce the problem to computing several times the bit-optimal LZSS parsing problem, so we can take advantage of its efficient resolution algorithms.

Next, we apply these ideas in the implementation of a compressor which employs the time-constrained LZSS strategy.

The treatment includes the description of a fast encoder and the derivation of an accurate time model for the target processor, a Core 2 Duo P8600.

The experimental results show that the idea is promising. In fact, the compressor exhibits remarkable performances in its capacity of controlling the time / space trade-off, mainly due to the high accuracy of the time model and the excellent performances of the Lagrangian Dual Heuristic, whose solutions are very close to the optimal one.

Moreover, results are impressive even on an absolute scale, since the compressor exhibits equal-or-better compression ratios than gzip and decompression times comparable with Snappy, the state-of-the-art of fast LZSS compressors.

Our approach lies on two ideas.

The first one is to model a LZSS parsing of a text T as a path on a (weighted) graph G(T), which has a vertex for each character in T and an edge for each phrase in the dictionary.

The second idea is to model explicitly the amount of compression space and decompression time taken up by a parsing through resource models.

Both ideas are not new: the notion of transposing LZ77 parsings to paths in a suitably defined graph G has been originally illustrated by Schuegraf, while Ferragina et al. labeled each edge of G with the cost in bits of the associated codeword.

In this thesis, we further extend G by labeling each edge with the encoding size in bits and the decoding time of the codeword, as given by the resource models.

This model allows us to define, in a precise way, the time-constrained LZSS parsing problem, namely, the problem of determining the parsing which minimizes the compressed size given a bound on the decompression time, as a constrained single source shortest path problem on G.

The proposed strategy to attack the problem is an heuristic based on the computation of the Lagrangian dual, i.e., the computation of the “best” Lagrangian relaxation of the time constraint.

In this way we reduce the problem to computing several times the bit-optimal LZSS parsing problem, so we can take advantage of its efficient resolution algorithms.

Next, we apply these ideas in the implementation of a compressor which employs the time-constrained LZSS strategy.

The treatment includes the description of a fast encoder and the derivation of an accurate time model for the target processor, a Core 2 Duo P8600.

The experimental results show that the idea is promising. In fact, the compressor exhibits remarkable performances in its capacity of controlling the time / space trade-off, mainly due to the high accuracy of the time model and the excellent performances of the Lagrangian Dual Heuristic, whose solutions are very close to the optimal one.

Moreover, results are impressive even on an absolute scale, since the compressor exhibits equal-or-better compression ratios than gzip and decompression times comparable with Snappy, the state-of-the-art of fast LZSS compressors.

File

Nome file | Dimensione |
---|---|

1 file non consultabili su richiesta dell'autore. |