Showing 101–122 of 122 results for author: Gagie, T

  1. Competitive Boolean Function Evaluation: Beyond Monotonicity, and the Symmetric Case

    Authors: Ferdinando Cicalese, Travis Gagie, Eduardo Laber, Martin Milanic

    Abstract: We study the extremal competitive ratio of Boolean function evaluation. We provide the first non-trivial lower and upper bounds for classes of Boolean functions which are not included in the class of monotone Boolean functions. For the particular case of symmetric functions our bounds are matching and we exactly characterize the best possible competitiveness achievable by a deterministic algorithm…

    Submitted 21 June, 2010; originally announced June 2010.

    Comments: 15 pages, 1 figure, to appear in Discrete Applied Mathematics

    Journal ref: Discrete Applied Mathematics 159 (2011) 1070–1078

  2. arXiv:0912.5079  [pdf, ps, other]

    cs.IT

    A Lower Bound on the Complexity of Approximating the Entropy of a Markov Source

    Authors: Travis Gagie

    Abstract: Suppose that, for any $k \geq 1$, $ε > 0$ and sufficiently large $σ$, we are given a black box that allows us to sample characters from a $k$th-order Markov source over the alphabet $\{0, \ldots, σ - 1\}$. Even if we know the source has entropy either 0 or at least $\log (σ - k)$, there is still no algorithm that, with probability bounded away from $1 / 2$, guesses the entropy correctly after sampling…

    Submitted 27 December, 2009; originally announced December 2009.

  3. arXiv:0912.0850  [pdf, ps, other]

    cs.DS

    Grammar-Based Compression in a Streaming Model

    Authors: Travis Gagie, Pawel Gawrychowski

    Abstract: We show that, given a string $s$ of length $n$, with constant memory and logarithmic passes over a constant number of streams we can build a context-free grammar that generates $s$ and only $s$ and whose size is within an $O(\min (g \log g, \sqrt{n \log g}))$-factor of the minimum $g$. This stands in contrast to our previous result that, with polylogarithmic memory and polylogarithmic passes o…

    Submitted 5 February, 2010; v1 submitted 4 December, 2009; originally announced December 2009.

    Comments: Section on recent work added, sketching how to improve bounds and support random access

  4. arXiv:0911.4981  [pdf, other]

    cs.DS

    Efficient Fully-Compressed Sequence Representations

    Authors: Jeremy Barbay, Francisco Claude, Travis Gagie, Gonzalo Navarro, Yakov Nekrich

    Abstract: We present a data structure that stores a sequence $s[1..n]$ over alphabet $[1..σ]$ in $n H_0(s) + o(n)(H_0(s) + 1)$ bits, where $H_0(s)$ is the zero-order entropy of $s$. This structure supports the queries access, rank and select, which are fundamental building blocks for many other compressed data structures, in worst-case time $O(\lg \lg σ)$ and average time $O(\lg H_0(s))$. The worst-cas…

    Submitted 1 April, 2012; v1 submitted 25 November, 2009; originally announced November 2009.
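
For readers unfamiliar with the three queries named in this abstract, the following is a minimal reference sketch of their semantics only. It uses plain linear scans, not the compressed structure or the time bounds the paper describes; the function signatures and the convention of returning 0 when an occurrence does not exist are assumptions made for illustration.

```python
from typing import Sequence


def access(s: Sequence[int], i: int) -> int:
    """Return s[i], using 1-based positions as in the abstract's s[1..n]."""
    return s[i - 1]


def rank(s: Sequence[int], c: int, i: int) -> int:
    """Count the occurrences of symbol c in the prefix s[1..i]."""
    return sum(1 for x in s[:i] if x == c)


def select(s: Sequence[int], c: int, j: int) -> int:
    """Return the 1-based position of the j-th occurrence of c.

    Returns 0 if c occurs fewer than j times (an assumed convention).
    """
    seen = 0
    for pos, x in enumerate(s, start=1):
        if x == c:
            seen += 1
            if seen == j:
                return pos
    return 0
```

The two counting queries are inverses of each other in the sense that `rank(s, c, select(s, c, j)) == j` whenever the j-th occurrence exists.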

  5. arXiv:0909.4341  [pdf, ps, other]

    cs.DS

    Lightweight Data Indexing and Compression in External Memory

    Authors: Paolo Ferragina, Travis Gagie, Giovanni Manzini

    Abstract: In this paper we describe algorithms for computing the BWT and for building (compressed) indexes in external memory. The innovative feature of our algorithms is that they are lightweight in the sense that, for an input of size $n$, they use only $n$ bits of disk working space while all previous approaches use $Θ(n \log n)$ bits of disk working space. Moreover, our algorithms access disk data…

    Submitted 24 September, 2009; originally announced September 2009.

  6. arXiv:0907.0741  [pdf, other]

    cs.DS

    Tight Bounds for Online Stable Sorting

    Authors: Travis Gagie, Yakov Nekrich

    Abstract: Although many authors have considered how many ternary comparisons it takes to sort a multiset $S$ of size $n$, the best known upper and lower bounds still differ by a term linear in $n$. In this paper we restrict our attention to online stable sorting and prove upper and lower bounds that are within $o(n)$ not only of each other but also of the best known upper bound for offline sorting. Speci…

    Submitted 4 July, 2009; originally announced July 2009.

  7. arXiv:0905.3107  [pdf, other]

    cs.DS

    Fast and Compact Prefix Codes

    Authors: Travis Gagie, Gonzalo Navarro, Yakov Nekrich

    Abstract: It is well-known that, given a probability distribution over $n$ characters, in the worst case it takes $Θ(n \log n)$ bits to store a prefix code with minimum expected codeword length. However, in this paper we first show that, for any $0 < ε < 1/2$ with $1 / ε = O(\mathrm{polylog}\, n)$, it takes $O(n \log \log (1 / ε))$ bits to store a prefix code with expected codeword length within $ε$ of the minimum.…

    Submitted 19 May, 2009; originally announced May 2009.

  8. Range Quantile Queries: Another Virtue of Wavelet Trees

    Authors: Travis Gagie, Simon J. Puglisi, Andrew Turpin

    Abstract: We show how to use a balanced wavelet tree as a data structure that stores a list of numbers and supports efficient range quantile queries. A range quantile query takes a rank and the endpoints of a sublist and returns the number with that rank in that sublist. For example, if the rank is half the sublist's length, then the query returns the sublist's median. We also show how these queries…

    Submitted 7 April, 2010; v1 submitted 26 March, 2009; originally announced March 2009.

    Comments: Added note about generalization to any constant number of dimensions.
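
The range quantile query described in this abstract can be illustrated with a small sketch. This is a plain pointer-based wavelet tree that splits on the midpoint of the value range and stores prefix counts in ordinary lists, so it shows the query logic only and makes no attempt at the space bounds of a succinct implementation. The class name `WaveletTree`, the method name `quantile`, and the 0-based half-open range convention are assumptions for illustration, not taken from the paper.

```python
class WaveletTree:
    """Uncompressed wavelet tree over a non-empty list of integers (illustrative)."""

    def __init__(self, values, lo=None, hi=None):
        if lo is None:
            lo, hi = min(values), max(values)
        self.lo, self.hi = lo, hi
        self.left = self.right = None
        if lo == hi or not values:
            return  # leaf, or an empty subtree that queries never reach
        mid = (lo + hi) // 2
        # prefix[i] = how many of the first i values go left (value <= mid).
        self.prefix = [0]
        for v in values:
            self.prefix.append(self.prefix[-1] + (v <= mid))
        self.left = WaveletTree([v for v in values if v <= mid], lo, mid)
        self.right = WaveletTree([v for v in values if v > mid], mid + 1, hi)

    def quantile(self, k, i, j):
        """Return the k-th smallest (1-based) value in values[i:j]."""
        if not 1 <= k <= j - i:
            raise ValueError("rank k must satisfy 1 <= k <= j - i")
        node = self
        while node.lo != node.hi:
            left_before = node.prefix[i]                  # left-going values before i
            left_in_range = node.prefix[j] - left_before  # left-going values in [i, j)
            if k <= left_in_range:
                # The answer is among the values sent to the left child.
                i, j = left_before, node.prefix[j]
                node = node.left
            else:
                # Skip the left-going values and continue in the right child.
                k -= left_in_range
                i, j = i - left_before, j - node.prefix[j]
                node = node.right
        return node.lo
```

For example, `WaveletTree([5, 2, 9, 2, 7, 1, 8]).quantile(3, 1, 6)` asks for the median of the sublist `[2, 9, 2, 7, 1]` and returns 2. Each query walks one root-to-leaf path, so its cost is proportional to the tree depth.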

  9. arXiv:0902.0133  [pdf, other]

    cs.IT

    New Algorithms and Lower Bounds for Sequential-Access Data Compression

    Authors: Travis Gagie

    Abstract: This thesis concerns sequential-access data compression, i.e., compression by algorithms that read the input one or more times from beginning to end. In one chapter we consider adaptive prefix coding, for which we must read the input character by character, outputting each character's self-delimiting codeword before reading the next one. We show how to encode and decode each character in constant worst-case…

    Submitted 1 February, 2009; originally announced February 2009.

    Comments: draft of PhD thesis

  10. arXiv:0812.3306  [pdf, ps, other]

    cs.IT

    Worst-Case Optimal Adaptive Prefix Coding

    Authors: Travis Gagie, Yakov Nekrich

    Abstract: A common complaint about adaptive prefix coding is that it is much slower than static prefix coding. Karpinski and Nekrich recently took an important step towards resolving this: they gave an adaptive Shannon coding algorithm that encodes each character in $O(1)$ amortized time and decodes it in $O(\log H)$ amortized time, where $H$ is the empirical entropy of the input string $s$. For compari…

    Submitted 17 December, 2008; originally announced December 2008.

  11. arXiv:0812.2868  [pdf, ps, other]

    cs.DS

    Minimax Trees in Linear Time

    Authors: Pawel Gawrychowski, Travis Gagie

    Abstract: A minimax tree is similar to a Huffman tree except that, instead of minimizing the weighted average of the leaves' depths, it minimizes the maximum of any leaf's weight plus its depth. Golumbic (1976) introduced minimax trees and gave a Huffman-like, $O(n \log n)$-time algorithm for building them. Drmota and Szpankowski (2002) gave another $O(n \log n)$-time algorithm, which checks the Kraft…

    Submitted 28 January, 2009; v1 submitted 15 December, 2008; originally announced December 2008.
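
As a point of reference for the Huffman-like, $O(n \log n)$-time approach mentioned in this abstract, here is a minimal sketch of the classic greedy rule for binary minimax trees: repeatedly merge the two smallest weights $a$ and $b$ into a node of weight $\max(a, b) + 1$. It computes only the optimal cost, not the tree, and it is not the linear-time algorithm of the paper. The function name `minimax_cost` is an assumption for illustration.

```python
import heapq


def minimax_cost(weights):
    """Return the minimum over binary trees of max(weight + depth) over the leaves.

    Greedy, Huffman-like sketch: merging two subtrees pushes every leaf in
    them one level deeper, so the merged node's weight is max(a, b) + 1.
    """
    if not weights:
        raise ValueError("need at least one weight")
    heap = list(weights)
    heapq.heapify(heap)
    while len(heap) > 1:
        a = heapq.heappop(heap)  # smallest remaining weight
        b = heapq.heappop(heap)  # second smallest
        heapq.heappush(heap, max(a, b) + 1)
    return heap[0]
```

For instance, four leaves of weight 0 give a cost of 2 (a complete tree of depth 2), while weights `[3, 0, 0]` give 4, because the heavy leaf must sit at depth at least 1.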

  12. arXiv:0811.3602  [pdf, ps, other]

    cs.DS

    Low-Memory Adaptive Prefix Coding

    Authors: Travis Gagie, Marek Karpinski, Yakov Nekrich

    Abstract: In this paper we study the adaptive prefix coding problem in cases where the size of the input alphabet is large. We present an online prefix coding algorithm that uses $O(σ^{1 / λ + ε})$ bits of space for any constants $ε > 0$, $λ > 1$, and encodes the string of symbols in $O(\log \log σ)$ time per symbol in the worst case, where $σ$ is the size of the alphabet. The upper bound on the enc…

    Submitted 21 November, 2008; originally announced November 2008.

    Comments: 10 pages

  13. arXiv:0810.5064  [pdf, ps, other]

    cs.IT cs.DS

    A New Algorithm for Building Alphabetic Minimax Trees

    Authors: Travis Gagie

    Abstract: We show how to build an alphabetic minimax tree for a sequence $W = w_1, \ldots, w_n$ of real weights in $O(n d \log \log n)$ time, where $d$ is the number of distinct integers $\lceil w_i \rceil$. We apply this algorithm to building an alphabetic prefix code given a sample.

    Submitted 28 October, 2008; originally announced October 2008.

    Comments: in preparation

  14. arXiv:0711.3338  [pdf, ps, other]

    cs.IT

    Bounds for Compression in Streaming Models

    Authors: Travis Gagie

    Abstract: Compression algorithms and streaming algorithms are both powerful tools for dealing with massive data sets, but many of the best compression algorithms -- e.g., those based on the Burrows-Wheeler Transform -- at first seem incompatible with streaming. In this paper we consider several popular streaming models and ask in which, if any, we can compress as well as we can with the BWT. We first prov…

    Submitted 19 April, 2008; v1 submitted 21 November, 2007; originally announced November 2007.

    Comments: added reduction from sorting to the Burrows-Wheeler Transform; thus, Grohe and Schweikardt's lower bound for short-sorting implies the same lower bound for the BWT

  15. arXiv:0708.2084  [pdf, ps, other]

    cs.IT

    Empirical entropy in context

    Authors: Travis Gagie

    Abstract: We trace the history of empirical entropy, touching briefly on its relation to Markov processes, normal numbers, Shannon entropy, the Chomsky hierarchy, Kolmogorov complexity, Ziv-Lempel compression, de Bruijn sequences and stochastic complexity.

    Submitted 15 August, 2007; originally announced August 2007.

    Comments: A survey of some results related to empirical entropy, written in the spring of 2007 as part of an introduction to a PhD thesis

  16. arXiv:0708.1877  [pdf, ps, other]

    cs.IT

    A nearly tight memory-redundancy trade-off for one-pass compression

    Authors: Travis Gagie

    Abstract: Let $s$ be a string of length $n$ over an alphabet of constant size $σ$ and let $c$ and $ε$ be constants with $1 \geq c \geq 0$ and $ε > 0$. Using $O(n)$ time, $O(n^c)$ bits of memory and one pass we can always encode $s$ in $n H_k(s) + O(σ^k n^{1 - c + ε})$ bits for all integers $k \geq 0$ simultaneously. On the other hand, even with unlimited time, using $O(n^c)$ bits of memory and one pas…

    Submitted 14 August, 2007; originally announced August 2007.

  17. arXiv:cs/0611099  [pdf, ps, other]

    cs.IT

    On the space complexity of one-pass compression

    Authors: Travis Gagie

    Abstract: We study how much memory one-pass compression algorithms need to compete with the best multi-pass algorithms. We call a one-pass algorithm an $f(n, \ell)$-footprint compressor if, given $n$, $\ell$ and an $n$-ary string $S$, it stores $S$ in $(O(H_\ell(S)) + o(\log n)) |S| + O(n^{\ell + 1} \log n)$ bits -- where $H_\ell(S)$ is the $\ell$th-order empirical entropy of $S$ --…

    Submitted 20 November, 2006; originally announced November 2006.

    ACM Class: H.1.1

  18. Large Alphabets and Incompressibility

    Authors: Travis Gagie

    Abstract: We briefly survey some concepts related to empirical entropy -- normal numbers, de Bruijn sequences and Markov processes -- and investigate how well it approximates Kolmogorov complexity. Our results suggest $\ell$th-order empirical entropy stops being a reasonable complexity metric for almost all strings of length $m$ over alphabets of size $n$ about when $n^\ell$ surpasses $m$.

    Submitted 9 March, 2006; v1 submitted 13 June, 2005; originally announced June 2005.

    ACM Class: E.4

  19. arXiv:cs/0506027  [pdf, ps, other]

    cs.DS

    Sorting a Low-Entropy Sequence

    Authors: Travis Gagie

    Abstract: We give the first sorting algorithm with bounds in terms of higher-order entropies: let $S$ be a sequence of length $m$ containing $n$ distinct elements and let $H_\ell(S)$ be the $\ell$th-order empirical entropy of $S$, with $n^{\ell + 1} \log n \in O(m)$; our algorithm sorts $S$ using $(H_\ell(S) + O(1)) m$ comparisons.

    Submitted 8 June, 2005; originally announced June 2005.

    ACM Class: E.4; E.5
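
The quantity $H_\ell(S)$ in this abstract can be made concrete with a short sketch. It assumes the commonly used definition of $\ell$th-order empirical entropy: group each character by the $\ell$ characters preceding it, take the zero-order entropy of each group, and average weighted by group size, normalising by the full sequence length. The paper's exact normalisation may differ; the function names `h0` and `hk` are assumptions for illustration.

```python
from collections import Counter, defaultdict
from math import log2


def h0(s):
    """Zero-order empirical entropy of s, in bits per character."""
    n = len(s)
    if n == 0:
        return 0.0
    return sum(c / n * log2(n / c) for c in Counter(s).values())


def hk(s, k):
    """k-th order empirical entropy of s, in bits per character (assumed definition)."""
    if k == 0 or len(s) == 0:
        return h0(s)
    followers = defaultdict(list)
    for i in range(k, len(s)):
        # Group each character by the k characters that precede it.
        followers[tuple(s[i - k:i])].append(s[i])
    return sum(len(f) * h0(f) for f in followers.values()) / len(s)
```

For the sequence `"abababab"`, `h0` is 1 bit per character, but `hk(s, 1)` is 0 because each character is fully determined by its predecessor. This gap is what makes a bound in terms of higher-order entropy stronger on such inputs.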

  20. Dynamic Asymmetric Communication

    Authors: Travis Gagie

    Abstract: We show how any dynamic instantaneous compression algorithm can be converted to an asymmetric communication protocol, with which a server with high bandwidth can help clients with low bandwidth send it messages. Unlike previous authors, we do not assume the server knows the messages' distribution, and our protocols are the first to use only one round of communication for each message.

    Submitted 20 November, 2006; v1 submitted 8 June, 2005; originally announced June 2005.

    Comments: Previous versions appeared at DCC 06 and SIROCCO 06; current version is preliminary journal version

    ACM Class: E.4

  21. arXiv:cs/0506016  [pdf, ps, other]

    cs.IT

    Compressing Probability Distributions

    Authors: Travis Gagie

    Abstract: We show how to store good approximations of probability distributions in small space.

    Submitted 6 June, 2005; originally announced June 2005.

    ACM Class: E.4

    Journal ref: 10.1016/j.ipl.2005.10.006

  22. arXiv:cs/0503085  [pdf, ps, other]

    cs.IT

    Dynamic Shannon Coding

    Authors: Travis Gagie

    Abstract: We present a new algorithm for dynamic prefix-free coding, based on Shannon coding. We give a simple analysis and prove a better upper bound on the length of the encoding produced than the corresponding bound for dynamic Huffman coding. We show how our algorithm can be modified for efficient length-restricted coding, alphabetic coding and coding with unequal letter costs.

    Submitted 30 March, 2005; originally announced March 2005.

    Comments: 6 pages; conference version presented at ESA 2004; journal version submitted to IEEE Transactions on Information Theory

    ACM Class: E.4
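
As background for this entry, the sketch below shows static Shannon coding, the building block the abstract refers to: a symbol occurring $c$ times in a text of length $m$ gets a codeword of length $\lceil \log_2 (m / c) \rceil$, and codewords are then assigned in canonical order. This is not the dynamic algorithm of the paper. The function name `shannon_code`, the minimum codeword length of 1, and the tie-breaking by symbol order are assumptions for illustration.

```python
from collections import Counter


def shannon_code(text):
    """Build a static Shannon prefix code for the symbols of text (illustrative)."""
    counts = Counter(text)
    total = len(text)
    lengths = {}
    for sym, c in counts.items():
        # Smallest length >= 1 with c * 2**length >= total,
        # i.e. ceil(log2(total / c)) computed in exact integer arithmetic.
        length = 1
        while (c << length) < total:
            length += 1
        lengths[sym] = length
    # Canonical assignment: shorter codewords first, consecutive binary values.
    table, code, prev = {}, 0, 0
    for sym in sorted(lengths, key=lambda x: (lengths[x], x)):
        code <<= lengths[sym] - prev
        table[sym] = format(code, "0{}b".format(lengths[sym]))
        code += 1
        prev = lengths[sym]
    return table
```

On the text `"aaaabbcd"` this yields `{'a': '0', 'b': '10', 'c': '110', 'd': '111'}`. Because the chosen lengths satisfy the Kraft inequality, the canonical assignment always produces a prefix-free code.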