Weighted Burrows-Wheeler Compression
Authors:
Aharon Fruchtman,
Yoav Gross,
Shmuel T. Klein,
Dana Shapira
Abstract:
A weight-based dynamic compression method has recently been proposed, which is especially suitable for encoding files with locally skewed distributions. Its main idea is to assign larger weights to symbols closer to the one about to be encoded, by means of an increasing weight function, rather than treating every position in the text equally. A well-known transformation that tends to convert input files into files with a more skewed distribution is the Burrows-Wheeler Transform. This paper applies the weighted approach to Burrows-Wheeler transformed files and provides empirical evidence of the efficiency of this combination.
Submitted 21 May, 2021;
originally announced May 2021.
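The two ingredients named in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the naive rotation-sort transform, the `$` sentinel, the helper names `bwt` and `weighted_counts`, and the example weight function `g(j) = 2**j` are all assumptions chosen for illustration. The sketch only shows how an increasing weight function makes recent symbols dominate the statistics a coder would use.

```python
from collections import defaultdict


def bwt(text, sentinel="$"):
    """Naive Burrows-Wheeler Transform: sort all rotations, keep the last column.

    The sentinel (an assumption of this sketch) must not occur in the text
    and must sort before every other symbol.
    """
    assert sentinel not in text
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def weighted_counts(seq, pos, g):
    """Weighted frequency of each symbol seen before position pos.

    Each earlier position j contributes g(j) instead of 1, so with an
    increasing g the positions closest to pos weigh the most.
    """
    counts = defaultdict(float)
    for j in range(pos):
        counts[seq[j]] += g(j)
    return dict(counts)


transformed = bwt("banana")
# Hypothetical weight function: weights double with every position.
stats = weighted_counts(transformed, len(transformed), lambda j: 2 ** j)
```

With a constant `g`, `weighted_counts` reduces to the plain symbol counts of ordinary adaptive coding, which is why the uniform case can be read as a special case of the weighted one.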
Weighted Adaptive Coding
Authors:
Aharon Fruchtman,
Yoav Gross,
Shmuel T. Klein,
Dana Shapira
Abstract:
Huffman coding is known to be optimal, yet its dynamic version may be even more efficient in practice. A new variant of Huffman encoding has recently been proposed that provably always performs better than static Huffman coding by at least $m-1$ bits, where $m$ denotes the size of the alphabet, and has a better worst case than standard dynamic Huffman coding. This paper introduces a new generic coding method that extends the known static and dynamic variants and includes them as special cases. In fact, the generalization is applicable to all statistical methods, including arithmetic coding. This then leads to the formalization of a new adaptive coding method, which is provably always at least as good as the best dynamic variant known to date. Moreover, we present empirical results showing that the proposed method improves on static and dynamic Huffman and arithmetic coding, even when the encoded file includes the model description.
Submitted 17 May, 2020;
originally announced May 2020.