Parikh's Theorem Made Symbolic

Authors: Matthew Hague, Artur Jeż, Anthony W. Lin

Abstract: Parikh's Theorem is a fundamental result in automata theory with numerous applications in computer science: software verification (e.g. infinite-state verification, string constraints, and theory of arrays), verification of cryptographic protocols (e.g. using Horn clauses modulo equational theories) and database querying (e.g. evaluating path-queries in graph databases). Parikh's Theorem states that the letter-counting abstraction of a language recognized by finite automata or context-free grammars is definable in Presburger Arithmetic. Unfortunately, real-world applications typically require large alphabets, which are well known not to be amenable to explicit treatment. Symbolic automata have proven in the last decade to be an effective algorithmic framework for handling large finite or even infinite alphabets. A symbolic automaton employs an effective boolean algebra, which offers a symbolic representation of character sets and often lends itself to an exponentially more succinct representation of a language. Instead of letter-counting, Parikh's Theorem for symbolic automata amounts to counting the number of times different predicates are satisfied by an input sequence. Unfortunately, naively applying Parikh's Theorem from classical automata theory to symbolic automata yields existential Presburger formulas of exponential size.
We provide a new construction for Parikh's Theorem for symbolic automata and grammars, which avoids this exponential blowup: our algorithm computes an existential formula in polynomial-time over (quantifier-free) Presburger and the base theory. In fact, our algorithm extends to the model of parametric symbolic grammars, which are one of the most expressive models of languages over infinite alphabets. We have implemented our algorithm and show it can be used to solve string constraints that are difficult to solve by existing solvers.

Submitted 7 November, 2023; originally announced November 2023.

Comments: Accepted to POPL '24
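
To make the two counting abstractions named in the abstract above concrete, here is a minimal sketch. It only counts letters and predicate hits on one given sequence; it is not the paper's construction, and the function names and sample predicates are hypothetical, chosen for illustration.

```python
from collections import Counter

def parikh_image(word, alphabet):
    """Letter-counting abstraction: one count per letter, order forgotten."""
    counts = Counter(word)
    return tuple(counts[a] for a in alphabet)

def predicate_counts(seq, predicates):
    """Symbolic analogue: how many positions satisfy each predicate.

    `predicates` maps a (hypothetical) name to a Python callable that
    stands in for a predicate of the alphabet theory.
    """
    return {name: sum(1 for x in seq if pred(x))
            for name, pred in predicates.items()}

# Two words with the same letters in a different order share a Parikh image.
same = parikh_image("ab", "ab") == parikh_image("ba", "ab")

# Over an infinite alphabet (integers) we count predicate hits instead.
hits = predicate_counts(
    [1, -2, 4, 7],
    {"even": lambda x: x % 2 == 0, "positive": lambda x: x > 0},
)
```

Predicates may overlap (4 is both even and positive), which is one reason the symbolic setting cannot simply reuse per-letter counts.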

arXiv:2308.00175 [pdf, ps, other]

Decision Procedures for Sequence Theories (Technical Report)

Authors: Artur Jeż, Anthony W. Lin, Oliver Markgraf, Philipp Rümmer

Abstract: Sequence theories are an extension of theories of strings with an infinite alphabet of letters, together with a corresponding alphabet theory (e.g. linear integer arithmetic). Sequences are natural abstractions of extendable arrays, which permit a wealth of operations including append, map, split, and concatenation. In spite of the growing amount of tool support for theories of sequences by leading SMT-solvers, little is known about the decidability of sequence theories, which is in stark contrast to the state of the theories of strings. We show that the decidable theory of strings with concatenation and regular constraints can be extended to the world of sequences over an alphabet theory that forms a Boolean algebra, while preserving decidability. In particular, decidability holds when regular constraints are interpreted as parametric automata (which extend both symbolic automata and variable automata), but fails when interpreted as register automata (even over the alphabet theory of equality). When length constraints are added, the problem is Turing-equivalent to word equations with length (and regular) constraints. Similar investigations are conducted in the presence of symbolic transducers, which naturally model sequence functions like map, split, filter, etc. We have developed a new sequence solver, SeCo, based on parametric automata, and show its efficacy on two classes of benchmarks: (i) invariant checking on array-manipulating programs and parameterized systems, and (ii) benchmarks on symbolic register automata.

Submitted 31 July, 2023; originally announced August 2023.

arXiv:2212.02327 [pdf, ps, other]

Space-efficient conversions from SLPs

Authors: Travis Gagie, Adrián Goga, Artur Jeż, Gonzalo Navarro

Abstract: We give algorithms that, given a straight-line program (SLP) with $g$ rules that generates (only) a text $T [1..n]$, build within $O(g)$ space the Lempel-Ziv (LZ) parse of $T$ (of $z$ phrases) in time $O(n\log^2 n)$ or in time $O(gz\log^2(n/z))$. We also show how to build a locally consistent grammar (LCG) of optimal size $g_{lc} = O(δ\log\frac{n}δ)$ from the SLP within $O(g+g_{lc})$ space and in $O(n\log g)$ time, where $δ$ is the substring complexity measure of $T$. Finally, we show how to build the LZ parse of $T$ from such an LCG within $O(g_{lc})$ space and in time $O(z\log^2 n \log^2(n/z))$. All our results hold with high probability.

Submitted 10 October, 2023; v1 submitted 5 December, 2022; originally announced December 2022.

arXiv:2101.06201 [pdf, other]

Solving one variable word equations in the free group in cubic time

Authors: Robert Ferens, Artur Jeż

Abstract: A word equation with one variable in a free group is given as $U = V$, where both $U$ and $V$ are words over the alphabet of generators of the free group and $X, X^{-1}$, for a fixed variable $X$. An element of the free group is a solution when substituting it for $X$ yields a true equality (interpreted in the free group) of left- and right-hand sides. It is known that the set of all solutions of a given word equation with one variable is a finite union of sets of the form $\{αw^i β\: : \: i \in \mathbb Z \}$, where $α, w, β$ are reduced words over the alphabet of generators, and a polynomial-time algorithm (of a high degree) computing this set is known. We provide a cubic time algorithm for this problem, which also shows that the set of solutions consists of at most a quadratic number of the above-mentioned sets. The algorithm uses only simple tools of word combinatorics and group theory and is simple to state. Its analysis is involved and focuses on the combinatorics of occurrences of powers of a word within a larger word.

Submitted 15 January, 2021; originally announced January 2021.

Comments: 52 pages, accepted to STACS 2021

arXiv:1908.06428 [pdf, ps, other]

The smallest grammar problem revisited

Authors: Hideo Bannai, Momoko Hirayama, Danny Hucke, Shunsuke Inenaga, Artur Jez, Markus Lohrey, Carl Philipp Reh

Abstract: In a seminal paper of Charikar et al. on the smallest grammar problem, the authors derive upper and lower bounds on the approximation ratios for several grammar-based compressors, but in all cases there is a gap between the lower and upper bound. Here the gaps for $\mathsf{LZ78}$ and $\mathsf{BISECTION}$ are closed by showing that the approximation ratio of $\mathsf{LZ78}$ is $Θ( (n/\log n)^{2/3})$, whereas the approximation ratio of $\mathsf{BISECTION}$ is $Θ(\sqrt{n/\log n})$. In addition, the lower bound for $\mathsf{RePair}$ is improved from $Ω(\sqrt{\log n})$ to $Ω(\log n/\log\log n)$. Finally, results of Arpe and Reischuk relating grammar-based compression for arbitrary alphabets and binary alphabets are improved.

Submitted 18 August, 2019; originally announced August 2019.

Comments: A short version of this paper appeared in the Proceedings of SPIRE 2016. This work has been supported by the DFG research project LO 748/10-1 (QUANT-KOMP)

arXiv:1902.03568 [pdf, ps, other]

Balancing Straight-Line Programs

Authors: Moses Ganardi, Artur Jeż, Markus Lohrey

Abstract: It is shown that a context-free grammar of size $m$ that produces a single string $w$ (such a grammar is also called a string straight-line program) can be transformed in linear time into a context-free grammar for $w$ of size $\mathcal{O}(m)$, whose unique derivation tree has depth $\mathcal{O}(\log |w|)$. This solves an open problem in the area of grammar-based compression. Similar results are shown for two formalisms for grammar-based tree compression: top dags and forest straight-line programs. These balancing results are all deduced from a single meta theorem stating that the depth of an algebraic circuit over an algebra with a certain finite base property can be reduced to $\mathcal{O}(\log n)$ with the cost of a constant multiplicative size increase. Here, $n$ refers to the size of the unfolding (or unravelling) of the circuit.

Submitted 1 July, 2020; v1 submitted 10 February, 2019; originally announced February 2019.

Comments: An extended abstract of this paper appears in the Proceedings of FOCS 2019

MSC Class: 68P05; 68W32
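
Several entries in this listing work with straight-line programs, i.e. context-free grammars that produce exactly one string. The sketch below shows what such a grammar looks like as data and how the length of the generated string can be computed without expanding it. The rule encoding (a dict from nonterminal to a pair of symbols) and the function names are assumptions made for illustration, not taken from any of the papers.

```python
def slp_length(rules, start):
    """Length of the string derived from `start`, without expanding it.

    `rules` maps each nonterminal to a pair (left, right); any symbol
    that is not a key of `rules` is treated as a terminal letter.
    """
    memo = {}

    def length(sym):
        if sym not in rules:          # terminal letter
            return 1
        if sym not in memo:
            left, right = rules[sym]
            memo[sym] = length(left) + length(right)
        return memo[sym]

    return length(start)

def slp_expand(rules, start):
    """Fully expand the SLP (may be exponentially longer than the grammar)."""
    if start not in rules:
        return start
    left, right = rules[start]
    return slp_expand(rules, left) + slp_expand(rules, right)

# Three rules generate a string of length 8: each rule doubles the previous one.
rules = {"X1": ("a", "b"), "X2": ("X1", "X1"), "X3": ("X2", "X2")}
text = slp_expand(rules, "X3")
n = slp_length(rules, "X3")
```

The derivation tree of this toy grammar is already balanced; the balancing result in the abstract above concerns grammars whose derivation tree may be much deeper than logarithmic in the length of the string.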

arXiv:1703.06061 [pdf, ps, other]

Approximation ratio of RePair

Authors: Danny Hucke, Artur Jez, Markus Lohrey

Abstract: In a seminal paper of Charikar et al. on the smallest grammar problem, the authors derive upper and lower bounds on the approximation ratios for several grammar-based compressors. Here we improve the lower bound for the famous RePair algorithm from $Ω(\sqrt{\log n})$ to $Ω(\log n/\log\log n)$. The family of words used in our proof is defined over a binary alphabet, while the lower bound from Charikar et al. needs an alphabet of logarithmic size in the length of the provided words.

Submitted 17 March, 2017; originally announced March 2017.

MSC Class: F.2.2; E.4

arXiv:1702.00736 [pdf, ps, other]

Word equations in linear space

Authors: Artur Jeż

Abstract: Satisfiability of word equations is an important problem in the intersection of formal languages and algebra: Given two sequences consisting of letters and variables we are to decide whether there is a substitution for the variables that turns this equation into true equality of strings. The exact computational complexity of this problem remains unknown, with the best lower and upper bounds being, respectively, NP and PSPACE. Recently, the novel technique of recompression was applied to this problem, simplifying the known proofs and lowering the space complexity to (nondeterministic) O(n log n). In this paper we show that satisfiability of word equations is in nondeterministic linear space, thus the language of satisfiable word equations is context-sensitive, and by the famous Immerman-Szelepcsenyi theorem: the language of unsatisfiable word equations is also context-sensitive. We use the known recompression-based algorithm and additionally employ Huffman coding for letters. The proof, however, uses analysis of how the fragments of the equation depend on each other as well as a new strategy for nondeterministic choices of the algorithm, which uses several new ideas to limit the space occupied by the letters.

Submitted 16 October, 2020; v1 submitted 2 February, 2017; originally announced February 2017.

Comments: Presented at ICALP 2017, submitted to a journal. The second version includes a simplified construction and clearer notation, and fixes some small errors from the first version

ACM Class: F.4.3; F.4.2; F.2.2

arXiv:1603.02966 [pdf, ps, other]

Solutions of Word Equations over Partially Commutative Structures

Authors: Volker Diekert, Artur Jeż, Manfred Kufleitner

Abstract: We give NSPACE(n log n) algorithms solving the following decision problems. Satisfiability: Is the given equation over a free partially commutative monoid with involution (resp. a free partially commutative group) solvable? Finiteness: Are there only finitely many solutions of such an equation? PSPACE algorithms with worse complexities for the first problem are known, but so far, a PSPACE algorithm for the second problem was out of reach. Our results are much stronger: Given such an equation, its solutions form an EDT0L language effectively representable in NSPACE(n log n). In particular, we give an effective description of the set of all solutions for equations with constraints in free partially commutative monoids and groups.

Submitted 9 March, 2016; originally announced March 2016.

ACM Class: F.2.2; F.4.2; F.4.3

arXiv:1407.4286 [pdf, ps, other]

Constructing small tree grammars and small circuits for formulas

Authors: Moses Ganardi, Danny Hucke, Artur Jez, Markus Lohrey, Eric Noeth

Abstract: It is shown that every tree of size $n$ over a fixed set of $σ$ different ranked symbols can be decomposed (in linear time as well as in logspace) into $O\big(\frac{n}{\log_σn}\big) = O\big(\frac{n \log σ}{\log n}\big)$ many hierarchically defined pieces. Formally, such a hierarchical decomposition has the form of a straight-line linear context-free tree grammar of size $O\big(\frac{n}{\log_σn}\big)$, which can be used as a compressed representation of the input tree. This generalizes an analogous result for strings. Previous grammar-based tree compressors were not analyzed for the worst-case size of the computed grammar, except for the top dag of Bille et al., for which only the weaker upper bound of $O\big(\frac{n}{\log_σ^{0.19} n}\big)$ (which was very recently improved to $O\big(\frac{n \cdot \log \log_σn}{\log_σn}\big)$ by Hübschle-Schneider and Raman) for unranked and unlabelled trees has been derived. The main result is used to show that every arithmetical formula of size $n$, in which only $m \leq n$ different variables occur, can be transformed (in linear time as well as in logspace) into an arithmetical circuit of size $O\big(\frac{n \cdot \log m}{\log n}\big)$ and depth $O(\log n)$. This refines a classical result of Brent from 1974, according to which an arithmetical formula of size $n$ can be transformed into a logarithmic depth circuit of size $O(n)$.

Submitted 21 September, 2015; v1 submitted 16 July, 2014; originally announced July 2014.

Comments: A short version of this paper appeared in the Proceedings of FSTTCS 2014

MSC Class: 68P30; 68Q42

arXiv:1405.5133 [pdf, other]

Finding All Solutions of Equations in Free Groups and Monoids with Involution

Authors: Volker Diekert, Artur Jeż, Wojciech Plandowski

Abstract: The aim of this paper is to present a PSPACE algorithm which yields a finite graph of exponential size and which describes the set of all solutions of equations in free groups as well as the set of all solutions of equations in free monoids with involution in the presence of rational constraints. This became possible due to the recently invented recompression technique of the second author. He successfully applied the recompression technique for pure word equations without involution or rational constraints. In particular, his method could not be used as a black box for free groups (even without rational constraints). Actually, the presence of an involution (inverse elements) and rational constraints complicates the situation and some additional analysis is necessary. Still, the recompression technique is general enough to accommodate both extensions. In the end, it simplifies proofs that solving word equations is in PSPACE (Plandowski 1999) and the corresponding result for equations in free groups with rational constraints (Diekert, Hagenah and Gutierrez 2001). As a byproduct we obtain a direct proof that it is decidable in PSPACE whether or not the solution set is finite.

Submitted 21 May, 2014; v1 submitted 20 May, 2014; originally announced May 2014.

Comments: A preliminary version of this paper was presented as an invited talk at CSR 2014 in Moscow, June 7 - 11, 2014

ACM Class: F.4; F.2; F.2.2

arXiv:1403.4445 [pdf, other]

A really simple approximation of smallest grammar

Authors: Artur Jeż

Abstract: In this paper we present a really simple linear-time algorithm constructing a context-free grammar of size O(g log (N/g)) for the input string, where N is the size of the input string and g the size of the optimal grammar generating this string. The algorithm works for arbitrary size alphabets, but the running time is linear assuming that the alphabet Σ of the input string can be identified with numbers from {1, ..., N^c} for some constant c. Algorithms with such an approximation guarantee and running time are known, however all of them were non-trivial and their analyses were involved. The algorithm presented here computes the LZ77 factorisation and transforms it in phases to a grammar. In each phase it maintains an LZ77-like factorisation of the word with at most l factors as well as additional O(l) letters, where l was the size of the original LZ77 factorisation. In one phase in a greedy way (by a left-to-right sweep and with the help of the factorisation) we choose a set of pairs of consecutive letters to be replaced with new symbols, i.e. nonterminals of the constructed grammar. We choose at least 2/3 of the letters in the word and there are O(l) many different pairs among them. Hence there are O(log N) phases, each of them introduces O(l) nonterminals to a grammar. A more precise analysis yields a bound O(l log(N/l)). As l ≤ g, this yields the desired bound O(g log(N/g)).

Submitted 18 March, 2014; originally announced March 2014.

Comments: Accepted for CPM 2014

ACM Class: E.4; F.4.2; F.2.2

arXiv:1310.4367 [pdf, ps, other]

Context unification is in PSPACE

Authors: Artur Jeż

Abstract: Contexts are terms with one 'hole', i.e. a place in which we can substitute an argument. In context unification we are given an equation over terms with variables representing contexts and ask about the satisfiability of this equation. Context unification is a natural subvariant of second-order unification, which is undecidable, and a generalization of word equations, which are decidable, at the same time. It is the unique problem between those two whose decidability is uncertain (for already almost two decades). In this paper we show that the context unification is in PSPACE. The result holds under a (usual) assumption that the first-order signature is finite. This result is obtained by an extension of the recompression technique, recently developed by the author and used in particular to obtain a new PSPACE algorithm for satisfiability of word equations, to context unification. The recompression is based on performing simple compression rules (replacing pairs of neighbouring function symbols), which are (conceptually) applied on the solution of the context equation and modifying the equation in a way so that such compression steps can be in fact performed directly on the equation, without the knowledge of the actual solution.

Submitted 8 November, 2013; v1 submitted 16 October, 2013; originally announced October 2013.

Comments: 27 pages, submitted, small notation changes and small improvements over the previous text

ACM Class: F.4.1; F.4.2

arXiv:1309.4958 [pdf, other]

Approximation of smallest linear tree grammar

Authors: Artur Jeż, Markus Lohrey

Abstract: A simple linear-time algorithm for constructing a linear context-free tree grammar of size O(rg + r g log (n/r g)) for a given input tree T of size n is presented, where g is the size of a minimal linear context-free tree grammar for T, and r is the maximal rank of symbols in T (which is a constant in many applications). This is the first example of a grammar-based tree compression algorithm with a good, i.e. logarithmic in terms of the size of the input tree, approximation ratio. The analysis of the algorithm uses an extension of the recompression technique from strings to trees.

Submitted 6 October, 2018; v1 submitted 19 September, 2013; originally announced September 2013.

Comments: 45 pages, published in Information and Computation. Approximation ratio improved since the first version, figures improved, some examples added. A small calculation error corrected since the previous version (all claims hold as previously)

ACM Class: F.4.2; F.2.2; E.4

arXiv:1302.3481 [pdf, other]

One-variable word equations in linear time

Authors: Artur Jeż

Abstract: In this paper we consider word equations with one variable (and arbitrarily many appearances of it). A recent technique of recompression, which is applicable to general word equations, is shown to be suitable also in this case. While in the general case it is non-deterministic, it determinises in the case of one variable and the obtained running time is O(n + #_X log n), where #_X is the number of appearances of the variable in the equation. This matches the previously-best algorithm due to Dąbrowski and Plandowski. Then, using a couple of heuristics as well as a more detailed time analysis, the running time is lowered to O(n) in the RAM model. Unfortunately no new properties of solutions are shown.

Submitted 22 January, 2014; v1 submitted 14 February, 2013; originally announced February 2013.

Comments: submitted to a journal, general overhaul over the previous version

ACM Class: F.2.2; F.4.3

arXiv:1301.5842 [pdf, ps, other]

Approximation of grammar-based compression via recompression

Authors: Artur Jeż

Abstract: In this paper we present a simple linear-time algorithm constructing a context-free grammar of size O(g log(N/g)) for the input string, where N is the size of the input string and g the size of the optimal grammar generating this string. The algorithm works for arbitrary size alphabets, but the running time is linear assuming that the alphabet Σ of the input string can be identified with numbers from {1, ..., N^c} for some constant c. Otherwise, additional cost of O(n log|Σ|) is needed. Algorithms with such approximation guarantees and running time are known; the novelty of this paper is a particular simplicity of the algorithm as well as the analysis of the algorithm, which uses a general technique of recompression recently introduced by the author. Furthermore, contrary to the previous results, this work does not use the LZ representation of the input string in the construction, nor in the analysis.

Submitted 7 November, 2013; v1 submitted 24 January, 2013; originally announced January 2013.

Comments: 22 pages, many small improvements, to be submitted to a journal

ACM Class: E.4; F.4.2; F.2.2

arXiv:1203.3705 [pdf, ps, other]

Recompression: a simple and powerful technique for word equations

Authors: Artur Jeż

Abstract: In this paper we present an application of a simple technique of local recompression, previously developed by the author in the context of compressed membership problems and compressed pattern matching, to word equations. The technique is based on local modification of variables (replacing X by aX or Xa) and iterative replacement of pairs of letters appearing in the equation by a 'fresh' letter, which can be seen as a bottom-up compression of the solution of the given word equation, to be more specific, building an SLP (Straight-Line Programme) for the solution of the word equation. Using this technique we give new, independent and self-contained proofs of most of the known results for word equations. To be more specific, the presented (nondeterministic) algorithm runs in O(n log n) space and in time polynomial in log N, where N is the size of the length-minimal solution of the word equation. The presented algorithm can be easily generalised to a generator of all solutions of the given word equation (without increasing the space usage). Furthermore, a further analysis of the algorithm yields a doubly exponential upper bound on the size of the length-minimal solution. The presented algorithm does not use the exponential bound on the exponent of periodicity. Conversely, the analysis of the algorithm yields an independent proof of the exponential bound on the exponent of periodicity. We believe that the presented algorithm, its idea and analysis are far simpler than all previously applied.
Furthermore, thanks to it we can obtain a unified and simple approach to most of the known results for word equations. As a small additional result we show that for O(1) variables (with arbitrarily many appearances in the equation) word equations can be solved in linear space, i.e. they are context-sensitive.

Submitted 18 March, 2014; v1 submitted 16 March, 2012; originally announced March 2012.

Comments: Submitted to a journal. Since previous version the proofs were simplified, overall presentation improved

ACM Class: F.2.2; F.4.3
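
The pair-replacement step mentioned in the abstract above can be sketched on a plain, already known word. This is only the elementary compression step, under the assumption that the two letters of the pair differ; the actual technique works on an equation whose solution is unknown and also modifies variables, none of which is modelled here. The function name and the fresh symbols "X" and "Y" are hypothetical.

```python
def compress_pair(word, a, b, fresh):
    """Replace every occurrence of the pair a b (with a != b) by `fresh`.

    `word` is a list of symbols. Because a != b, occurrences of the pair
    cannot overlap, so a single left-to-right sweep replaces all of them.
    """
    if a == b:
        raise ValueError("this sketch only handles pairs of distinct letters")
    out = []
    i = 0
    while i < len(word):
        if i + 1 < len(word) and word[i] == a and word[i + 1] == b:
            out.append(fresh)
            i += 2
        else:
            out.append(word[i])
            i += 1
    return out

# Two rounds shorten "abcab" from five symbols to two.
round1 = compress_pair(list("abcab"), "a", "b", "X")   # X stands for ab
round2 = compress_pair(round1, "X", "c", "Y")          # Y stands for Xc = abc
```

Recording each replacement as a rule (X -> a b, Y -> X c) gives a grammar for the original word, which matches the abstract's description of the process as building an SLP bottom-up.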

arXiv:1111.3244 [pdf, ps, other]

Faster fully compressed pattern matching by recompression

Authors: Artur Jeż

Abstract: In this paper, a fully compressed pattern matching problem is studied. The compression is represented by straight-line programs (SLPs), i.e. context-free grammars generating exactly one string; the term fully means that both the pattern and the text are given in the compressed form. The problem is approached using a recently developed technique of local recompression: the SLPs are refactored, so that substrings of the pattern and text are encoded in both SLPs in the same way. To this end, the SLPs are locally decompressed and then recompressed in a uniform way. This technique yields an O((n+m)log M) algorithm for compressed pattern matching, assuming that M fits in O(1) machine words, where n (m) is the size of the compressed representation of the text (pattern, respectively), while M is the size of the decompressed pattern. If only m+n fits in O(1) machine words, the running time increases to O((n+m)log M log(n+m)). The previous best algorithm due to Lifshits had O(n^2m) running time.

Submitted 25 June, 2013; v1 submitted 14 November, 2011; originally announced November 2011.

Comments: Full version, submitted to a journal as is. Overall improvements over the previous version

ACM Class: F.2.2

arXiv:1110.2318 [pdf, ps, other]

doi 10.1007/s00224-013-9443-6

Compressed Membership for NFA (DFA) with Compressed Labels is in NP (P)

Authors: Artur Jeż

Abstract: In this paper, a compressed membership problem for finite automata, both deterministic and non-deterministic, with compressed transition labels is studied. The compression is represented by straight-line programs (SLPs), i.e. context-free grammars generating exactly one string. A novel technique of dealing with SLPs is introduced: the SLPs are recompressed, so that substrings of the input text are encoded in the SLPs labelling the transitions of the NFA (DFA) in the same way as in the SLP representing the input text. To this end, the SLPs are locally decompressed and then recompressed in a uniform way. Furthermore, such recompression induces only small changes in the automaton; in particular, the size of the automaton remains polynomial. Using this technique it is shown that the compressed membership for NFA with compressed labels is in NP, thus confirming the conjecture of Plandowski and Rytter and extending the partial result of Lohrey and Mathissen; as it is already known that this problem is NP-hard, we settle its exact computational complexity. Moreover, the same technique applied to the compressed membership for DFA with compressed labels yields that this problem is in P; for this problem, only the trivial PSPACE upper bound was known.

Submitted 11 October, 2011; originally announced October 2011.

ACM Class: F.4.3; F.4.2; F.2.2; F.1.1

arXiv:1102.5682 [pdf, ps, other]

On minimising automata with errors

Authors: Paweł Gawrychowski, Artur Jeż, Andreas Maletti

Abstract: The problem of k-minimisation for a DFA M is the computation of a smallest DFA N (where the size |M| of a DFA M is the size of the domain of the transition function) such that their recognized languages differ only on words of length less than k. The previously best algorithm, which runs in time O(|M| log^2 n) where n is the number of states, is extended to DFAs with partial transition functions. Moreover, a faster O(|M| log n) algorithm for DFAs that recognise finite languages is presented. In comparison to the previous algorithm for total DFAs, the new algorithm is much simpler and allows the calculation of a k-minimal DFA for each k in parallel. Secondly, it is demonstrated that calculating the least number of introduced errors is hard: Given a DFA M and numbers k and m, it is NP-hard to decide whether there exists a k-minimal DFA N differing from DFA M on at most m words. A similar result holds for hyper-minimisation of DFAs in general: Given a DFA M and numbers s and m, it is NP-hard to decide whether there exists a DFA N with at most s states such that DFA M and N differ on at most m words.

Submitted 28 February, 2011; originally announced February 2011.

Comments: 12 pages plus 19-page appendix, submitted to a conference

arXiv:1001.2932 [pdf, ps, other]

doi 10.1007/s00224-011-9352-5

On equations over sets of integers

Authors: Artur Jeż, Alexander Okhotin

Abstract: Systems of equations with sets of integers as unknowns are considered. It is shown that the class of sets representable by unique solutions of equations using the operations of union and addition $S+T=\{m+n \mid m \in S, \: n \in T\}$ and with ultimately periodic constants is exactly the class of hyper-arithmetical sets. Equations using addition only can represent every hyper-arithmetical set under a simple encoding. All hyper-arithmetical sets can also be represented by equations over sets of natural numbers equipped with union, addition and subtraction $S \mathbin{\dot{-}} T=\{m-n \mid m \in S, \: n \in T, \: m \geqslant n\}$. Testing whether a given system has a solution is $\Sigma^1_1$-complete for each model. These results, in particular, settle the expressive power of the most general types of language equations, as well as equations over subsets of free groups.

Submitted 3 February, 2010; v1 submitted 17 January, 2010; originally announced January 2010.

Comments: 12 pages, 0 figures

ACM Class: F.4.3; F.4.1

Journal ref: Theory of Computing Systems 51:2, 2012, pages 196-228

arXiv:0901.2897 [pdf, other]

Online validation of the pi and pi' failure functions

Authors: Pawel Gawrychowski, Artur Jez, Lukasz Jez

Abstract: Let pi_w denote the failure function of the Morris-Pratt algorithm for a word w. In this paper we study the following problem: given an integer array A[1..n], is there a word w over an arbitrary alphabet such that A[i]=pi_w[i] for all i? Moreover, what is the minimum required cardinality of the alphabet? We give a real-time linear algorithm for this problem in the unit-cost RAM model with Θ(log n) bits word size. Our algorithm also returns a word w over a minimal alphabet such that pi_w = A, and uses just o(n) words of memory. Then we consider the function pi' instead of pi and give an online O(n log n) algorithm for this case. This is the first polynomial algorithm for the online version of this problem.

Submitted 13 April, 2009; v1 submitted 19 January, 2009; originally announced January 2009.

Comments: submitted

arXiv:0802.1685 [pdf, ps, other]

Generalized Whac-a-Mole

Authors: Marcin Bienkowski, Marek Chrobak, Christoph Durr, Mathilde Hurand, Artur Jez, Lukasz Jez, Jakub Lopuszanski, Grzegorz Stachowiak

Abstract: We consider online competitive algorithms for the problem of collecting weighted items from a dynamic set S, when items are added to or deleted from S over time. The objective is to maximize the total weight of collected items. We study the general version, as well as variants with various restrictions, including the following: the uniform case, when all items have the same weight, the decremental sets, when all items are present at the beginning and only deletion operations are allowed, and dynamic queues, where the dynamic set is ordered and only its prefixes can be deleted (with no restriction on insertions). The dynamic queue case is a generalization of bounded-delay packet scheduling (also referred to as buffer management). We present several upper and lower bounds on the competitive ratio for these variants.

Submitted 16 February, 2008; v1 submitted 12 February, 2008; originally announced February 2008.

Showing 1–23 of 23 results for author: Jeż, A