Showing 1–2 of 2 results for author: Kida, T
-
MR-RePair: Grammar Compression based on Maximal Repeats
Authors:
Isamu Furuya,
Takuya Takagi,
Yuto Nakashima,
Shunsuke Inenaga,
Hideo Bannai,
Takuya Kida
Abstract:
We analyze the grammar generation algorithm of the RePair compression algorithm and show the relation between a grammar generated by RePair and maximal repeats. We reveal that RePair replaces, step by step, the most frequent pairs within the corresponding most frequent maximal repeats. Then, we design a novel variant of RePair, called MR-RePair, which substitutes the most frequent maximal repeats at once instead of substituting the most frequent pairs consecutively. We implemented MR-RePair and compared the size of the grammar generated by MR-RePair to that generated by RePair on several text corpora. Our experiments show that MR-RePair generates more compact grammars than RePair does, especially for highly repetitive texts.
Submitted 18 February, 2019; v1 submitted 12 November, 2018;
originally announced November 2018.
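For readers unfamiliar with the baseline, the pair-replacement step that the abstract attributes to RePair can be sketched as follows. This is a naive quadratic-time illustration, not the authors' implementation and not MR-RePair; the function names `repair` and `expand` and the nonterminal names `X0`, `X1`, ... are assumptions made for this example. It also counts overlapping pairs (as in `aaa`) naively, which a careful implementation would not.

```python
from collections import Counter


def repair(text):
    """Naive RePair sketch: returns (final sequence, rules).

    Repeatedly replaces the most frequent adjacent pair with a fresh
    nonterminal until no pair occurs at least twice.
    """
    seq = list(text)
    rules = {}    # nonterminal name -> (left symbol, right symbol)
    next_id = 0
    while True:
        pairs = Counter(zip(seq, seq[1:]))
        if not pairs:
            break
        pair, freq = pairs.most_common(1)[0]
        if freq < 2:
            break
        # Hypothetical naming scheme; multi-character names cannot
        # collide with the single-character terminals of the input.
        name = f"X{next_id}"
        next_id += 1
        rules[name] = pair
        # Replace non-overlapping occurrences, scanning left to right.
        out, i = [], 0
        while i < len(seq):
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                out.append(name)
                i += 2
            else:
                out.append(seq[i])
                i += 1
        seq = out
    return seq, rules


def expand(symbol, rules):
    """Expand a symbol back into the substring it derives."""
    if symbol not in rules:
        return symbol
    left, right = rules[symbol]
    return expand(left, rules) + expand(right, rules)
```

Decompressing is a matter of expanding every symbol of the final sequence, for example `"".join(expand(s, rules) for s in seq)`. In this sketch a long repeated substring is rebuilt one pair at a time, which is the behavior the abstract says MR-RePair avoids by substituting a maximal repeat at once.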
-
Compaction of Church Numerals for Higher-Order Compression
Authors:
Isamu Furuya,
Takuya Kida
Abstract:
In this study, we address the problem of compacting Church numerals. Church numerals appear as a representation of the repetitive part of data in higher-order compression. We propose a novel decomposition scheme for a natural number using tetration, which leads to a compact representation of $λ$-terms equivalent to the original Church numerals. For a natural number $n$, we prove that the size of the $λ$-term obtained by the proposed method is $O(({\rm slog}_{2}n)^{\log n/ \log \log n})$. Moreover, we confirmed experimentally that the proposed method outperforms a binary expression of Church numerals when $n$ is less than approximately 10000.
Submitted 10 November, 2017; v1 submitted 30 June, 2017;
originally announced June 2017.
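As background for the abstract above, the following sketch shows Church numerals encoded as Python closures, together with the standard Church encodings of multiplication and exponentiation. It is only an illustration of why power towers can denote large numbers with small terms; it is not the decomposition scheme proposed in the paper, and the names `church`, `to_int`, `mul`, and `exp` are assumptions made for this example.

```python
def church(n):
    """Church numeral for n: maps f to the n-fold composition of f."""
    def numeral(f):
        def apply(x):
            for _ in range(n):
                x = f(x)
            return x
        return apply
    return numeral


def to_int(c):
    """Decode a Church numeral by counting applications of a successor."""
    return c(lambda k: k + 1)(0)


def mul(m, n):
    """Church multiplication: applying f (m * n) times."""
    return lambda f: m(n(f))


def exp(m, n):
    """Church exponentiation m ** n, encoded as n applied to m."""
    return n(m)


two = church(2)
three = church(3)

# A small power tower, 2 ** (2 ** 2), written with three copies of `two`.
tower = exp(two, exp(two, two))
```

Here `to_int(exp(two, three))` decodes to 8 and `to_int(tower)` decodes to 16, so the term for 16 reuses the numeral for 2 rather than spelling out sixteen applications of `f`.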