-
State Complexity of Pattern Matching in Regular Languages
Authors:
Janusz A. Brzozowski,
Sylvie Davies,
Abhishek Madan
Abstract:
In a simple pattern matching problem one has a pattern $w$ and a text $t$, which are words over a finite alphabet $Σ$. One may ask whether $w$ occurs in $t$, and if so, where? More generally, we may have a set $P$ of patterns and a set $T$ of texts, where $P$ and $T$ are regular languages. We are interested in whether any word of $T$ begins with a word of $P$, ends with a word of $P$, has a word of $P$ as a factor, or has a word of $P$ as a subsequence. Thus we are interested in the languages $(PΣ^*)\cap T$, $(Σ^*P)\cap T$, $(Σ^* PΣ^*)\cap T$, and $(Σ^* \mathbin{\operatorname{shu}} P)\cap T$, where $\operatorname{shu}$ is the shuffle operation. The state complexity $κ(L)$ of a regular language $L$ is the number of states in the minimal deterministic finite automaton recognizing $L$. We derive the following upper bounds on the state complexities of our pattern-matching languages, where $κ(P)\le m$, and $κ(T)\le n$: $κ((PΣ^*)\cap T) \le mn$; $κ((Σ^*P)\cap T) \le 2^{m-1}n$; $κ((Σ^*PΣ^*)\cap T) \le (2^{m-2}+1)n$; and $κ((Σ^*\mathbin{\operatorname{shu}} P)\cap T) \le (2^{m-2}+1)n$. We prove that these bounds are tight, and that to meet them, the alphabet must have at least two letters in the first three cases, and at least $m-1$ letters in the last case. We also consider the special case where $P$ is a single word $w$, and obtain the following tight upper bounds: $κ((wΣ^*)\cap T_n) \le m+n-1$; $κ((Σ^*w)\cap T_n) \le (m-1)n-(m-2)$; $κ((Σ^*wΣ^*)\cap T_n) \le (m-1)n$; and $κ((Σ^*\mathbin{\operatorname{shu}} w)\cap T_n) \le (m-1)n$. For unary languages, we have a tight upper bound of $m+n-2$ in all eight of the aforementioned cases.
Submitted 4 November, 2018; v1 submitted 12 June, 2018;
originally announced June 2018.
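The prefix-matching language $(PΣ^*)\cap T$ from the abstract above can be explored on small examples. The following is a minimal sketch, not the authors' construction: it assumes complete DFAs given as dictionaries keyed by (state, letter), builds a product automaton in which final states of the DFA for $P$ become absorbing, and counts the states left after a Moore-style minimization. The function names and the two example languages are hypothetical choices made for illustration.

```python
from itertools import product


def minimal_state_count(alphabet, delta, start, finals):
    """Count the states of the minimal complete DFA for the given complete DFA."""
    # Keep only the states reachable from the start state.
    reachable, stack = {start}, [start]
    while stack:
        q = stack.pop()
        for a in alphabet:
            r = delta[q, a]
            if r not in reachable:
                reachable.add(r)
                stack.append(r)
    # Moore-style refinement: split blocks until the partition is stable.
    block = {q: int(q in finals) for q in reachable}
    while True:
        signature = {
            q: (block[q],) + tuple(block[delta[q, a]] for a in alphabet)
            for q in reachable
        }
        ids = {}
        refined = {q: ids.setdefault(signature[q], len(ids)) for q in reachable}
        if len(ids) == len(set(block.values())):
            return len(ids)
        block = refined


def prefix_match_dfa(alphabet, dP, sP, fP, dT, sT, fT):
    """Product DFA for (P Sigma^*) intersected with T."""
    statesP = {q for (q, _) in dP}
    statesT = {q for (q, _) in dT}
    delta = {}
    for p, t in product(statesP, statesT):
        for a in alphabet:
            # Once a prefix in P has been read, the P-component stays final.
            p2 = p if p in fP else dP[p, a]
            delta[(p, t), a] = (p2, dT[t, a])
    finals = {(p, t) for p in fP for t in fT}
    return delta, (sP, sT), finals


SIGMA = "ab"
# Hypothetical pattern language P = {ab}: states 0, 1, 2 (final), 3 (sink).
dP = {(0, "a"): 1, (0, "b"): 3, (1, "a"): 3, (1, "b"): 2,
      (2, "a"): 3, (2, "b"): 3, (3, "a"): 3, (3, "b"): 3}
# Hypothetical text language T: words with an even number of a's.
dT = {(0, "a"): 1, (0, "b"): 0, (1, "a"): 0, (1, "b"): 1}

delta, start, finals = prefix_match_dfa(SIGMA, dP, 0, {2}, dT, 0, {0})
kappa = minimal_state_count(SIGMA, delta, start, finals)
print(kappa)
```

In this example $m=4$ and $n=2$, and the computed value is 5, which equals the single-word bound $m+n-1$ quoted in the abstract and stays below the general bound $mn=8$.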
-
Most Complex Deterministic Union-Free Regular Languages
Authors:
Janusz A. Brzozowski,
Sylvie Davies
Abstract:
A regular language $L$ is union-free if it can be represented by a regular expression without the union operation. A union-free language is deterministic if it can be accepted by a deterministic one-cycle-free-path finite automaton; this is an automaton which has one final state and exactly one cycle-free path from any state to the final state. Jirásková and Masopust proved that the state complexities of the basic operations reversal, star, product, and boolean operations in deterministic union-free languages are exactly the same as those in the class of all regular languages. To prove that the bounds are met they used five types of automata, involving eight types of transformations of the set of states of the automata. We show that for each $n\ge 3$ there exists one ternary witness of state complexity $n$ that meets the bound for reversal and product. Moreover, the restrictions of this witness to binary alphabets meet the bounds for star and boolean operations. We also show that the tight upper bounds on the state complexity of binary operations that take arguments over different alphabets are the same as those for arbitrary regular languages. Furthermore, we prove that the maximal syntactic semigroup of a union-free language has $n^n$ elements, as in the case of regular languages, and that the maximal state complexities of atoms of union-free languages are the same as those for regular languages. Finally, we prove that there exists a most complex union-free language that meets the bounds for all these complexity measures. Altogether this proves that the complexity measures above cannot distinguish union-free languages from regular languages.
Submitted 2 January, 2018; v1 submitted 24 November, 2017;
originally announced November 2017.
-
State Complexity of Overlap Assembly
Authors:
Janusz Brzozowski,
Lila Kari,
Bai Li,
Marek Szykuła
Abstract:
The \emph{state complexity} of a regular language $L_m$ is the number $m$ of states in a minimal deterministic finite automaton (DFA) accepting $L_m$. The state complexity of a regularity-preserving binary operation on regular languages is defined as the maximal state complexity of the result of the operation where the two operands range over all languages of state complexities $\le m$ and $\le n$, respectively. We find a tight upper bound on the state complexity of the binary operation \emph{overlap assembly} on regular languages. This operation was introduced by Csuhaj-Varjú, Petre, and Vaszil to model the process of self-assembly of two linear DNA strands into a longer DNA strand, provided that their ends "overlap". We prove that the state complexity of the overlap assembly of languages $L_m$ and $L_n$, where $m\ge 2$ and $n\ge1$, is at most $2 (m-1) 3^{n-1} + 2^n$. Moreover, for $m \ge 2$ and $n \ge 3$ there exist languages $L_m$ and $L_n$ over an alphabet of size $n$ whose overlap assembly meets the upper bound and this bound cannot be met with smaller alphabets. Finally, we prove that $m+n$ is a tight upper bound on the overlap assembly of unary languages, and that there are binary languages whose overlap assembly has exponential state complexity at least $m(2^{n-1}-2)+2$.
Submitted 11 December, 2018; v1 submitted 16 October, 2017;
originally announced October 2017.
-
Towards a Theory of Complexity of Regular Languages
Authors:
Janusz A. Brzozowski
Abstract:
We survey recent results concerning the complexity of regular languages represented by their minimal deterministic finite automata. In addition to the quotient complexity of the language -- which is the number of its (left) quotients, and is the same as its state complexity -- we also consider the size of its syntactic semigroup and the quotient complexity of its atoms -- basic components of every regular language. We then turn to the study of the quotient/state complexity of common operations on regular languages: reversal, (Kleene) star, product (concatenation) and boolean operations. We examine relations among these complexity measures. We discuss several subclasses of regular languages defined by convexity. In many, but not all, cases there exist "most complex" languages, that is, languages that meet the upper bounds for all these complexity measures.
Submitted 16 February, 2017;
originally announced February 2017.
-
Most Complex Non-Returning Regular Languages
Authors:
Janusz A. Brzozowski,
Sylvie Davies
Abstract:
A regular language $L$ is non-returning if in the minimal deterministic finite automaton accepting it there are no transitions into the initial state. Eom, Han and Jirásková derived upper bounds on the state complexity of boolean operations and Kleene star, and proved that these bounds are tight using two different binary witnesses. They derived upper bounds for concatenation and reversal using three different ternary witnesses. These five witnesses use a total of six different transformations. We show that for each $n\ge 4$ there exists a ternary witness of state complexity $n$ that meets the bound for reversal and that at least three letters are needed to meet this bound. Moreover, the restrictions of this witness to binary alphabets meet the bounds for product, star, and boolean operations. We also derive tight upper bounds on the state complexity of binary operations that take arguments with different alphabets. We prove that the maximal syntactic semigroup of a non-returning language has $(n-1)^n$ elements and requires at least $\binom{n}{2}$ generators. We find the maximal state complexities of atoms of non-returning languages. Finally, we show that there exists a most complex non-returning language that meets the bounds for all these complexity measures.
Submitted 14 January, 2017;
originally announced January 2017.
-
Complexity of Left-Ideal, Suffix-Closed and Suffix-Free Regular Languages
Authors:
Janusz Brzozowski,
Corwin Sinnamon
Abstract:
A language $L$ over an alphabet $Σ$ is suffix-convex if, for any words $x,y,z\inΣ^*$, whenever $z$ and $xyz$ are in $L$, then so is $yz$. Suffix-convex languages include three special cases: left-ideal, suffix-closed, and suffix-free languages. We examine complexity properties of these three special classes of suffix-convex regular languages. In particular, we study the quotient/state complexity of boolean operations, product (concatenation), star, and reversal on these languages, as well as the size of their syntactic semigroups, and the quotient complexity of their atoms.
Submitted 3 October, 2016;
originally announced October 2016.
-
Unrestricted State Complexity of Binary Operations on Regular and Ideal Languages
Authors:
Janusz Brzozowski,
Corwin Sinnamon
Abstract:
We study the state complexity of binary operations on regular languages over different alphabets. It is known that if $L'_m$ and $L_n$ are languages of state complexities $m$ and $n$, respectively, and restricted to the same alphabet, the state complexity of any binary boolean operation on $L'_m$ and $L_n$ is $mn$, and that of product (concatenation) is $m 2^n - 2^{n-1}$. In contrast to this, we show that if $L'_m$ and $L_n$ are over different alphabets, the state complexity of union and symmetric difference is $(m+1)(n+1)$, that of difference is $mn+m$, that of intersection is $mn$, and that of product is $m2^n+2^{n-1}$. We also study unrestricted complexity of binary operations in the classes of regular right, left, and two-sided ideals, and derive tight upper bounds. The bounds for product of the unrestricted cases (with the bounds for the restricted cases in parentheses) are as follows: right ideals $m+2^{n-2}+2^{n-1}$ ($m+2^{n-2}$); left ideals $mn+m+n$ ($m+n-1$); two-sided ideals $m+2n$ ($m+n-1$). The state complexities of boolean operations on all three types of ideals are the same as those of arbitrary regular languages, whereas that is not the case if the alphabets of the arguments are the same. Finally, we update the known results about most complex regular, right-ideal, left-ideal, and two-sided-ideal languages to include the unrestricted cases.
Submitted 20 December, 2017; v1 submitted 14 September, 2016;
originally announced September 2016.
-
Complexity of Prefix-Convex Regular Languages
Authors:
Janusz Brzozowski,
Corwin Sinnamon
Abstract:
A language $L$ over an alphabet $Σ$ is prefix-convex if, for any words $x,y,z\inΣ^*$, whenever $x$ and $xyz$ are in $L$, then so is $xy$. Prefix-convex languages include right-ideal, prefix-closed, and prefix-free languages. We study complexity properties of prefix-convex regular languages. In particular, we find the quotient/state complexity of boolean operations, product (concatenation), star, and reversal, the size of the syntactic semigroup, and the quotient complexity of atoms. For binary operations we use arguments with different alphabets when appropriate; this leads to higher tight upper bounds than those obtained with equal alphabets. We exhibit most complex prefix-convex languages that meet the complexity bounds for all the measures listed above.
Submitted 24 June, 2016; v1 submitted 21 May, 2016;
originally announced May 2016.
-
Unrestricted State Complexity of Binary Operations on Regular Languages
Authors:
Janusz Brzozowski
Abstract:
I study the state complexity of binary operations on regular languages over different alphabets. It is well known that if $L'_m$ and $L_n$ are languages restricted to be over the same alphabet, with $m$ and $n$ quotients, respectively, the state complexity of any binary boolean operation on $L'_m$ and $L_n$ is $mn$, and that of the product (concatenation) is $(m-1)2^n +2^{n-1}$. In contrast to this, I show that if $L'_m$ and $L_n$ are over their own different alphabets, the state complexity of union and symmetric difference is $mn+m+n+1$, that of intersection is $mn$, that of difference is $mn+m$, and that of the product is $m2^n+2^{n-1}$.
Submitted 10 June, 2016; v1 submitted 3 February, 2016;
originally announced February 2016.
-
On the State Complexity of the Shuffle of Regular Languages
Authors:
Janusz Brzozowski,
Galina Jirásková,
Bo Liu,
Aayush Rajasekaran,
Marek Szykuła
Abstract:
We investigate the shuffle operation on regular languages represented by complete deterministic finite automata. We prove that $f(m,n)=2^{mn-1} + 2^{(m-1)(n-1)}(2^{m-1}-1)(2^{n-1}-1)$ is an upper bound on the state complexity of the shuffle of two regular languages having state complexities $m$ and $n$, respectively. We also state partial results about the tightness of this bound. We show that there exist witness languages meeting the bound if $2\le m\le 5$ and $n\ge2$, and also if $m=n=6$. Moreover, we prove that in the subset automaton of the NFA accepting the shuffle, all $2^{mn}$ states can be distinguishable, and an alphabet of size three suffices for that. It follows that the bound can be met if all $f(m,n)$ states are reachable. We know that an alphabet of size at least $mn$ is required provided that $m,n \ge 2$. The question of reachability, and hence also of the tightness of the bound $f(m,n)$ in general, remains open.
Submitted 15 July, 2016; v1 submitted 3 December, 2015;
originally announced December 2015.
-
Most Complex Regular Ideal Languages
Authors:
Janusz Brzozowski,
Sylvie Davies,
Bo Yang Victor Liu
Abstract:
A right ideal (left ideal, two-sided ideal) is a non-empty language $L$ over an alphabet $Σ$ such that $L=LΣ^*$ ($L=Σ^*L$, $L=Σ^*LΣ^*$). Let $k=3$ for right ideals, 4 for left ideals and 5 for two-sided ideals. We show that there exist sequences ($L_n \mid n \ge k $) of right, left, and two-sided regular ideals, where $L_n$ has quotient complexity (state complexity) $n$, such that $L_n$ is most complex in its class under the following measures of complexity: the size of the syntactic semigroup, the quotient complexities of the left quotients of $L_n$, the number of atoms (intersections of complemented and uncomplemented left quotients), the quotient complexities of the atoms, and the quotient complexities of reversal, star, product (concatenation), and all binary boolean operations. In that sense, these ideals are "most complex" languages in their classes, or "universal witnesses" to the complexity of the various operations.
Submitted 13 October, 2016; v1 submitted 31 October, 2015;
originally announced November 2015.
-
Syntactic complexity of regular ideals
Authors:
Janusz A. Brzozowski,
Marek Szykuła,
Yuli Ye
Abstract:
The state complexity of a regular language is the number of states in a minimal deterministic finite automaton accepting the language. The syntactic complexity of a regular language is the cardinality of its syntactic semigroup. The syntactic complexity of a subclass of regular languages is the worst-case syntactic complexity taken as a function of the state complexity $n$ of languages in that class. We prove that $n^{n-1}$, $n^{n-1}+n-1$, and $n^{n-2}+(n-2)2^{n-2}+1$ are tight upper bounds on the syntactic complexities of right ideals and prefix-closed languages, left ideals and suffix-closed languages, and two-sided ideals and factor-closed languages, respectively. Moreover, we show that the transition semigroups meeting the upper bounds for all three types of ideals are unique, and the numbers of generators (4, 5, and 6, respectively) cannot be reduced.
Submitted 13 January, 2017; v1 submitted 20 September, 2015;
originally announced September 2015.
-
Complexity of Suffix-Free Regular Languages
Authors:
Janusz Brzozowski,
Marek Szykuła
Abstract:
We study various complexity properties of suffix-free regular languages. The quotient complexity of a regular language $L$ is the number of left quotients of $L$; this is the same as the state complexity of $L$. A regular language $L'$ is a dialect of a regular language $L$ if it differs only slightly from $L$. The quotient complexity of an operation on regular languages is the maximal quotient complexity of the result of the operation expressed as a function of the quotient complexities of the operands. A sequence $(L_k,L_{k+1},\dots)$ of regular languages in some class ${\mathcal C}$, where $n$ is the quotient complexity of $L_n$, is called a stream. A stream is most complex in class ${\mathcal C}$ if its languages $L_n$ meet the complexity upper bounds for all basic measures. It is known that there exist such most complex streams in the class of regular languages, in the class of prefix-free languages, and also in the classes of right, left, and two-sided ideals. In contrast to this, we prove that there does not exist a most complex stream in the class of suffix-free regular languages. However, we do exhibit one ternary suffix-free stream that meets the bound for product and whose restrictions to binary alphabets meet the bounds for star and boolean operations. We also exhibit a quinary stream that meets the bounds for boolean operations, reversal, size of syntactic semigroup, and atom complexities. Moreover, we solve an open problem about the bound for the product of two languages of quotient complexities $m$ and $n$ in the binary case by showing that it can be met for infinitely many $m$ and $n$.
Submitted 12 December, 2016; v1 submitted 20 April, 2015;
originally announced April 2015.
-
Quotient Complexities of Atoms in Regular Ideal Languages
Authors:
Janusz Brzozowski,
Sylvie Davies
Abstract:
A (left) quotient of a language $L$ by a word $w$ is the language $w^{-1}L=\{x\mid wx\in L\}$. The quotient complexity of a regular language $L$ is the number of quotients of $L$; it is equal to the state complexity of $L$, which is the number of states in a minimal deterministic finite automaton accepting $L$. An atom of $L$ is an equivalence class of the relation in which two words are equivalent if for each quotient, they either are both in the quotient or both not in it; hence it is a non-empty intersection of complemented and uncomplemented quotients of $L$. A right (respectively, left and two-sided) ideal is a language $L$ over an alphabet $Σ$ that satisfies $L=LΣ^*$ (respectively, $L=Σ^*L$ and $L=Σ^*LΣ^*$). We compute the maximal number of atoms and the maximal quotient complexities of atoms of right, left and two-sided regular ideals.
Submitted 23 May, 2015; v1 submitted 7 March, 2015;
originally announced March 2015.
-
Syntactic Complexity of Suffix-Free Languages
Authors:
Janusz Brzozowski,
Marek Szykuła
Abstract:
We solve an open problem concerning syntactic complexity: We prove that the cardinality of the syntactic semigroup of a suffix-free language with $n$ left quotients (that is, with state complexity $n$) is at most $(n-1)^{n-2}+n-2$ for $n\ge 6$. Since this bound is known to be reachable, this settles the problem. We also reduce the alphabet of the witness languages reaching this bound to five letters instead of $n+2$, and show that it cannot be any smaller. Finally, we prove that the transition semigroup of a minimal deterministic automaton accepting a witness language is unique for each $n$.
Submitted 15 October, 2015; v1 submitted 6 December, 2014;
originally announced December 2014.
-
Upper Bounds on Syntactic Complexity of Left and Two-Sided Ideals
Authors:
Janusz Brzozowski,
Marek Szykuła
Abstract:
We solve two open problems concerning syntactic complexity: We prove that the cardinality of the syntactic semigroup of a left ideal or a suffix-closed language with $n$ left quotients (that is, with state complexity $n$) is at most $n^{n-1}+n-1$, and that of a two-sided ideal or a factor-closed language is at most $n^{n-2}+(n-2)2^{n-2}+1$. Since these bounds are known to be reachable, this settles the problems.
Submitted 3 July, 2014; v1 submitted 9 March, 2014;
originally announced March 2014.
-
Large Aperiodic Semigroups
Authors:
Janusz Brzozowski,
Marek Szykuła
Abstract:
The syntactic complexity of a regular language is the size of its syntactic semigroup. This semigroup is isomorphic to the transition semigroup of the minimal deterministic finite automaton accepting the language, that is, to the semigroup generated by transformations induced by non-empty words on the set of states of the automaton. In this paper we search for the largest syntactic semigroup of a star-free language having $n$ left quotients; equivalently, we look for the largest transition semigroup of an aperiodic finite automaton with $n$ states.
We introduce two new aperiodic transition semigroups. The first is generated by transformations that change only one state; we call such transformations and resulting semigroups unitary. In particular, we study complete unitary semigroups which have a special structure, and we show that each maximal unitary semigroup is complete. For $n \ge 4$ there exists a complete unitary semigroup that is larger than any aperiodic semigroup known to date.
We then present even larger aperiodic semigroups, generated by transformations that map a non-empty subset of states to a single state; we call such transformations and semigroups semiconstant. In particular, we examine semiconstant tree semigroups which have a structure based on full binary trees. The semiconstant tree semigroups are at present the best candidates for largest aperiodic semigroups.
We also prove that $2^n-1$ is an upper bound on the state complexity of reversal of star-free languages, and resolve an open problem about a special case of state complexity of concatenation of star-free languages.
Submitted 18 June, 2014; v1 submitted 31 December, 2013;
originally announced January 2014.
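The abstract above describes the transition semigroup as the semigroup generated by the transformations that non-empty words induce on the states. A minimal sketch of that closure computation follows, assuming each letter is given as a tuple `t` in which `t[q]` is the image of state `q`. The function name and the three generators are hypothetical and are not the witnesses from the paper.

```python
def transition_semigroup_size(generators):
    """Size of the semigroup generated by the given state transformations."""
    gens = [tuple(g) for g in generators]
    seen = set(gens)
    frontier = list(gens)
    while frontier:
        t = frontier.pop()
        for g in gens:
            # Transformation of the word "t followed by letter g":
            # apply t first, then g.
            u = tuple(g[t[q]] for q in range(len(t)))
            if u not in seen:
                seen.add(u)
                frontier.append(u)
    return len(seen)


# Hypothetical 3-state example: a cycle, a transposition, and a map of rank 2.
cycle = (1, 2, 0)
swap = (1, 0, 2)
collapse = (0, 1, 0)
size = transition_semigroup_size([cycle, swap, collapse])
print(size)
```

For these three generators the closure contains all $3^3 = 27$ transformations of a 3-element set, since a cycle and a transposition generate every permutation and adding one transformation of rank $n-1$ yields the full transformation semigroup. Aperiodic semigroups such as those studied in the paper are necessarily smaller.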
-
Most Complex Regular Right-Ideal Languages
Authors:
Janusz Brzozowski,
Gareth Davies
Abstract:
A right ideal is a language L over an alphabet A that satisfies L = LA*. We show that there exists a stream (sequence) (R_n : n \ge 3) of regular right ideal languages, where R_n has n left quotients and is most complex under the following measures of complexity: the state complexities of the left quotients, the number of atoms (intersections of complemented and uncomplemented left quotients), the state complexities of the atoms, the size of the syntactic semigroup, the state complexities of the operations of reversal, star, and product, and the state complexities of all binary boolean operations. In that sense, this stream of right ideals is a universal witness.
Submitted 18 November, 2013;
originally announced November 2013.
-
Symmetric Groups and Quotient Complexity of Boolean Operations
Authors:
Jason Bell,
Janusz Brzozowski,
Nelma Moreira,
Rogério Reis
Abstract:
The quotient complexity of a regular language L is the number of left quotients of L, which is the same as the state complexity of L. Suppose that L and L' are binary regular languages with quotient complexities m and n, and that the transition semigroups of the minimal deterministic automata accepting L and L' are the symmetric groups S_m and S_n of degrees m and n, respectively. Denote by o any binary boolean operation that is not a constant and not a function of one argument only. For m,n >= 2 with (m,n) not in {(2,2),(3,4),(4,3),(4,4)} we prove that the quotient complexity of LoL' is mn if and only if either (a) m is not equal to n or (b) m=n and the bases (ordered pairs of generators) of S_m and S_n are not conjugate. For (m,n)\in {(2,2),(3,4),(4,3),(4,4)} we give examples to show that this need not hold. In proving these results we generalize the notion of uniform minimality to direct products of automata. We also establish a non-trivial connection between complexity of boolean operations and group theory.
Submitted 7 October, 2013;
originally announced October 2013.
-
Maximally Atomic Languages
Authors:
Janusz Brzozowski,
Gareth Davies
Abstract:
The atoms of a regular language are non-empty intersections of complemented and uncomplemented quotients of the language. Tight upper bounds on the number of atoms of a language and on the quotient complexities of atoms are known. We introduce a new class of regular languages, called the maximally atomic languages, consisting of all languages meeting these bounds. We prove the following result: If L is a regular language of quotient complexity n and G is the subgroup of permutations in the transition semigroup T of the minimal DFA of L, then L is maximally atomic if and only if G is transitive on k-subsets of 1,...,n for 0 <= k <= n and T contains a transformation of rank n-1.
Submitted 21 May, 2014; v1 submitted 20 August, 2013;
originally announced August 2013.
-
Maximal Syntactic Complexity of Regular Languages Implies Maximal Quotient Complexities of Atoms
Authors:
Janusz Brzozowski,
Gareth Davies
Abstract:
We relate two measures of complexity of regular languages. The first is syntactic complexity, that is, the cardinality of the syntactic semigroup of the language. That semigroup is isomorphic to the semigroup of transformations of states induced by non-empty words in the minimal deterministic finite automaton accepting the language. If the language has n left quotients (its minimal automaton has n states), then its syntactic complexity is at most n^n and this bound is tight. The second measure consists of the quotient (state) complexities of the atoms of the language, where atoms are non-empty intersections of complemented and uncomplemented quotients. A regular language has at most 2^n atoms and this bound is tight. The maximal quotient complexity of any atom with r complemented quotients is 2^n-1, if r=0 or r=n, and 1+\sum_{k=1}^{r} \sum_{h=k+1}^{k+n-r} \binom{n}{h} \binom{h}{k}, otherwise. We prove that if a language has maximal syntactic complexity, then it has 2^n atoms and each atom has maximal quotient complexity, but the converse is false.
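The first measure can be illustrated with a minimal sketch that is not taken from the paper. It assumes the DFA states are numbered 0..n-1 and that each letter is given as a tuple `t`, where `t[q]` is the state reached from `q`; the function name `transition_semigroup` is hypothetical. The sketch closes the letter transformations under composition, which gives the transformations induced by non-empty words.

```python
def transition_semigroup(generators):
    """Return the set of state transformations induced by non-empty words.

    Each generator is a tuple t over states 0..n-1, with t[q] the successor
    of state q under one letter (an encoding assumed for this example).
    """
    gens = [tuple(g) for g in generators]
    seen = set(gens)
    todo = list(gens)
    while todo:
        s = todo.pop()
        for g in gens:
            # Apply s first, then the letter g (words act left to right).
            t = tuple(g[s[q]] for q in range(len(s)))
            if t not in seen:
                seen.add(t)
                todo.append(t)
    return seen


# Three letters on n = 3 states: a 3-cycle, a transposition, and a
# transformation of rank n-1 that sends state 2 to state 0.
a = (1, 2, 0)
b = (1, 0, 2)
c = (0, 1, 0)
print(len(transition_semigroup([a, b, c])))
```

With these three letters the count reaches n^n = 27, the bound stated in the abstract; with the cycle alone it drops to 3. If the DFA is minimal, this count is the syntactic complexity of its language.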
Submitted 22 May, 2013; v1 submitted 15 February, 2013;
originally announced February 2013.
-
Minimal Nondeterministic Finite Automata and Atoms of Regular Languages
Authors:
Janusz Brzozowski,
Hellis Tamm
Abstract:
We examine the NFA minimization problem in terms of atomic NFA's, that is, NFA's in which the right language of every state is a union of atoms, where the atoms of a regular language are non-empty intersections of complemented and uncomplemented left quotients of the language. We characterize all reduced atomic NFA's of a given language, that is, those NFA's that have no equivalent states. Using atomic NFA's, we formalize Sengoku's approach to NFA minimization and prove that his method fails to find all minimal NFA's. We also formulate the Kameda-Weiner NFA minimization in terms of quotients and atoms.
Submitted 23 January, 2013;
originally announced January 2013.
-
Syntactic Complexity of R- and J-Trivial Regular Languages
Authors:
Janusz Brzozowski,
Baiyu Li
Abstract:
The syntactic complexity of a regular language is the cardinality of its syntactic semigroup. The syntactic complexity of a subclass of the class of regular languages is the maximal syntactic complexity of languages in that class, taken as a function of the state complexity n of these languages. We study the syntactic complexity of R- and J-trivial regular languages, and prove that n! and \lfloor e(n-1)! \rfloor are tight upper bounds for these languages, respectively. We also prove that 2^{n-1} is the tight upper bound on the state complexity of reversal of J-trivial regular languages.
Submitted 3 February, 2013; v1 submitted 22 August, 2012;
originally announced August 2012.
-
Universal Witnesses for State Complexity of Boolean Operations and Concatenation Combined with Star
Authors:
Janusz Brzozowski,
David Liu
Abstract:
We study the state complexity of boolean operations and product (concatenation, catenation) combined with star. We derive tight upper bounds for the symmetric differences and differences of two languages, one or both of which are starred, and for the product of two starred languages. We prove that the previously discovered bounds for the union and the intersection of languages with one or two starred arguments, for the product of two languages one of which is starred, and for the star of the product of two languages can all be met by the recently introduced universal witnesses and their variants.
Submitted 9 July, 2012;
originally announced July 2012.
-
Universal Witnesses for State Complexity of Basic Operations Combined with Reversal
Authors:
Janusz Brzozowski,
David Liu
Abstract:
We study the state complexity of boolean operations, concatenation and star with one or two of the argument languages reversed. We derive tight upper bounds for the symmetric differences and differences of such languages. We prove that the previously discovered bounds for union, intersection, concatenation and star of such languages can all be met by the recently introduced universal witnesses and their variants.
Submitted 2 July, 2012;
originally announced July 2012.
-
Syntactic Complexity of Finite/Cofinite, Definite, and Reverse Definite Languages
Authors:
Janusz Brzozowski,
David Liu
Abstract:
We study the syntactic complexity of finite/cofinite, definite and reverse definite languages. The syntactic complexity of a class of languages is defined as the maximal size of syntactic semigroups of languages from the class, taken as a function of the state complexity n of the languages. We prove that (n-1)! is a tight upper bound for finite/cofinite languages and that it can be reached only if the alphabet size is greater than or equal to (n-1)!-(n-2)!. We prove that the bound is also (n-1)! for reverse definite languages, but the minimal alphabet size is (n-1)!-2(n-2)!. We show that \lfloor e\cdot (n-1)!\rfloor is a lower bound on the syntactic complexity of definite languages, and conjecture that this is also an upper bound, and that the alphabet size required to meet this bound is \lfloor e \cdot (n-1)!\rfloor - \lfloor e \cdot (n-2)!\rfloor. We prove the conjecture for n\le 4.
Submitted 21 June, 2012; v1 submitted 13 March, 2012;
originally announced March 2012.
-
Quotient Complexities of Atoms of Regular Languages
Authors:
Janusz Brzozowski,
Hellis Tamm
Abstract:
An atom of a regular language L with n (left) quotients is a non-empty intersection of uncomplemented or complemented quotients of L, where each of the n quotients appears in a term of the intersection. The quotient complexity of L, which is the same as the state complexity of L, is the number of quotients of L. We prove that, for any language L with quotient complexity n, the quotient complexity of any atom of L with r complemented quotients has an upper bound of 2^n-1 if r=0 or r=n, and 1+\sum_{k=1}^{r} \sum_{h=k+1}^{k+n-r} C_{h}^{n} \cdot C_{k}^{h} otherwise, where C_j^i is the binomial coefficient. For each n\ge 1, we exhibit a language whose atoms meet these bounds.
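The bound above is easy to tabulate. The following is a small sketch, not part of the paper, that evaluates it directly, reading C_h^n as the binomial coefficient "n choose h" as defined in the abstract. The function name `atom_bound` is hypothetical.

```python
from math import comb


def atom_bound(n, r):
    """Evaluate the stated upper bound on the quotient complexity of an atom
    with r complemented quotients, for a language with n quotients."""
    if not 0 <= r <= n:
        raise ValueError("r must satisfy 0 <= r <= n")
    if r == 0 or r == n:
        return 2 ** n - 1
    total = 1
    for k in range(1, r + 1):
        for h in range(k + 1, k + n - r + 1):
            # C_h^n * C_k^h, i.e. (n choose h) * (h choose k)
            total += comb(n, h) * comb(h, k)
    return total


# Tabulate the bound for n = 3 and every admissible r.
for r in range(4):
    print(r, atom_bound(3, r))
```

For n = 3 this evaluates to 7 for r = 0 and r = 3, and to 10 for r = 1 and r = 2.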
Submitted 8 March, 2012; v1 submitted 31 December, 2011;
originally announced January 2012.
-
Syntactic Complexity of Star-Free Languages
Authors:
Janusz Brzozowski,
Baiyu Li
Abstract:
The syntactic complexity of a regular language is the cardinality of its syntactic semigroup. The syntactic complexity of a subclass of regular languages is the maximal syntactic complexity of languages in that subclass, taken as a function of the state complexity of these languages. We study the syntactic complexity of star-free regular languages, that is, languages that can be constructed from finite languages using union, complement and concatenation. We find tight upper bounds on the syntactic complexity of languages accepted by monotonic and partially monotonic automata. We introduce "nearly monotonic" automata, which accept star-free languages, and find a tight upper bound on the syntactic complexity of languages accepted by such automata. We conjecture that this bound is also an upper bound on the syntactic complexity of star-free languages.
Submitted 15 September, 2011;
originally announced September 2011.
-
Syntactic Complexity of Prefix-, Suffix-, Bifix-, and Factor-Free Regular Languages
Authors:
Janusz Brzozowski,
Baiyu Li,
Yuli Ye
Abstract:
The syntactic complexity of a regular language is the cardinality of its syntactic semigroup. The syntactic complexity of a subclass of the class of regular languages is the maximal syntactic complexity of languages in that class, taken as a function of the state complexity $n$ of these languages. We study the syntactic complexity of prefix-, suffix-, bifix-, and factor-free regular languages. We prove that $n^{n-2}$ is a tight upper bound for prefix-free regular languages. We present properties of the syntactic semigroups of suffix-, bifix-, and factor-free regular languages, conjecture tight upper bounds on their size to be $(n-1)^{n-2}+(n-2)$, $(n-1)^{n-3} + (n-2)^{n-3} + (n-3)2^{n-3}$, and $(n-1)^{n-3} + (n-3)2^{n-3} + 1$, respectively, and exhibit languages with these syntactic complexities.
Submitted 18 November, 2011; v1 submitted 15 March, 2011;
originally announced March 2011.
-
Theory of Atomata
Authors:
Janusz Brzozowski,
Hellis Tamm
Abstract:
We show that every regular language defines a unique nondeterministic finite automaton (NFA), which we call "átomaton", whose states are the "atoms" of the language, that is, non-empty intersections of complemented or uncomplemented left quotients of the language. We describe methods of constructing the átomaton, and prove that it is isomorphic to the reverse automaton of the minimal deterministic finite automaton (DFA) of the reverse language. We study "atomic" NFAs in which the right language of every state is a union of atoms. We generalize Brzozowski's double-reversal method for minimizing a deterministic finite automaton (DFA), showing that the result of applying the subset construction to an NFA is a minimal DFA if and only if the reverse of the NFA is atomic. We prove that Sengoku's claim that his method always finds a minimal NFA is false.
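The double-reversal method mentioned above can be sketched as follows. This is a minimal illustration under assumed conventions, not the paper's construction of the átomaton: an automaton is a set of `(source, letter, target)` triples with sets of initial and final states, and the names `reverse`, `determinize`, and `brzozowski_minimize` are hypothetical.

```python
def reverse(trans, starts, finals):
    """Flip every transition and swap initial and final states."""
    return {(q, a, p) for (p, a, q) in trans}, set(finals), set(starts)


def determinize(alphabet, trans, starts, finals):
    """Subset construction, keeping only reachable subsets (the empty
    subset, if reached, acts as the sink state)."""
    finals = set(finals)
    start = frozenset(starts)
    states, todo, dtrans = {start}, [start], set()
    while todo:
        subset = todo.pop()
        for a in alphabet:
            target = frozenset(q for (p, b, q) in trans
                               if b == a and p in subset)
            dtrans.add((subset, a, target))
            if target not in states:
                states.add(target)
                todo.append(target)
    dfinals = {s for s in states if s & finals}
    return states, dtrans, {start}, dfinals


def brzozowski_minimize(alphabet, trans, starts, finals):
    """Reverse and determinize twice; the result is a minimal complete DFA."""
    states = None
    for _ in range(2):
        trans, starts, finals = reverse(trans, starts, finals)
        states, trans, starts, finals = determinize(
            alphabet, trans, starts, finals)
    return states, trans, starts, finals


# A 3-state DFA for "words over {a, b} ending in a"; states 0 and 2 are
# equivalent, so the minimal DFA should have 2 states.
dfa = {(0, "a", 1), (0, "b", 2), (1, "a", 1),
       (1, "b", 2), (2, "a", 1), (2, "b", 0)}
minimal_states, _, _, _ = brzozowski_minimize("ab", dfa, {0}, {1})
print(len(minimal_states))
```

The sketch scans the whole transition set for every subset and letter, which keeps it short but is slow on large automata.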
Submitted 19 August, 2013; v1 submitted 18 February, 2011;
originally announced February 2011.
-
Quotient Complexity of Star-Free Languages
Authors:
Janusz Brzozowski,
Bo Liu
Abstract:
The quotient complexity, also known as state complexity, of a regular language is the number of distinct left quotients of the language. The quotient complexity of an operation is the maximal quotient complexity of the language resulting from the operation, as a function of the quotient complexities of the operands. The class of star-free languages is the smallest class containing the finite languages and closed under boolean operations and concatenation. We prove that the tight bounds on the quotient complexities of union, intersection, difference, symmetric difference, concatenation, and star for star-free languages are the same as those for regular languages, with some small exceptions, whereas the bound for reversal is 2^n-1.
Submitted 17 December, 2010;
originally announced December 2010.
-
Syntactic Complexity of Ideal and Closed Languages
Authors:
Janusz Brzozowski,
Yuli Ye
Abstract:
The state complexity of a regular language is the number of states in the minimal deterministic automaton accepting the language. The syntactic complexity of a regular language is the cardinality of its syntactic semigroup. The syntactic complexity of a subclass of regular languages is the worst-case syntactic complexity taken as a function of the state complexity $n$ of languages in that class. We study the syntactic complexity of the class of regular ideal languages and their complements, the closed languages. We prove that $n^{n-1}$ is a tight upper bound on the complexity of right ideals and prefix-closed languages, and that there exist left ideals and suffix-closed languages of syntactic complexity $n^{n-1}+n-1$, and two-sided ideals and factor-closed languages of syntactic complexity $n^{n-2}+(n-2)2^{n-2}+1$.
Submitted 15 October, 2010;
originally announced October 2010.
-
On the Complexity of the Evaluation of Transient Extensions of Boolean Functions
Authors:
Janusz Brzozowski,
Baiyu Li,
Yuli Ye
Abstract:
Transient algebra is a multi-valued algebra for hazard detection in gate circuits. Sequences of alternating 0's and 1's, called transients, represent signal values, and gates are modeled by extensions of boolean functions to transients. Formulas for computing the output transient of a gate from the input transients are known for NOT, AND, OR, and XOR gates and their complements, but, in general, even the problem of deciding whether the length of the output transient exceeds a given bound is NP-complete. We propose a method of evaluating extensions of general boolean functions. We introduce and study a class of functions with the following property: Instead of evaluating an extension of a boolean function on a given set of transients, it is possible to get the same value by using transients derived from the given ones, but having length at most 3. We prove that all functions of three variables, as well as certain other functions, have this property, and can be efficiently evaluated.
Submitted 10 August, 2010;
originally announced August 2010.
-
Quotient Complexity of Bifix-, Factor-, and Subword-Free Regular Languages
Authors:
Janusz Brzozowski,
Galina Jirásková,
Baiyu Li,
Joshua Smith
Abstract:
A language L is prefix-free if, whenever words u and v are in L and u is a prefix of v, then u=v. Suffix-, factor-, and subword-free languages are defined similarly, where "subword" means "subsequence". A language is bifix-free if it is both prefix- and suffix-free. We study the quotient complexity, more commonly known as state complexity, of operations in the classes of bifix-, factor-, and subword-free regular languages. We find tight upper bounds on the quotient complexity of intersection, union, difference, symmetric difference, concatenation, star, and reversal in these three classes of languages.
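For a finite language the definitions above can be checked directly. The sketch below is only an illustration on finite sets of words, not the paper's setting of arbitrary regular languages, and the function names are hypothetical. It relies on the fact that, in sorted order, all words having a given prefix u form a contiguous block starting at u, so comparing neighbours suffices.

```python
def is_prefix_free(words):
    """True if no word of the finite set is a proper prefix of another."""
    ws = sorted(set(words))
    # If u is a proper prefix of some word, it is also a prefix of the word
    # that immediately follows u in sorted order.
    return not any(v.startswith(u) for u, v in zip(ws, ws[1:]))


def is_suffix_free(words):
    """Suffix-freeness is prefix-freeness of the reversed words."""
    return is_prefix_free(w[::-1] for w in words)


def is_bifix_free(words):
    """A language is bifix-free if it is both prefix- and suffix-free."""
    words = list(words)
    return is_prefix_free(words) and is_suffix_free(words)


print(is_prefix_free({"ab", "ba"}))   # neither word is a prefix of the other
print(is_bifix_free({"ab", "b"}))     # "b" is a suffix of "ab"
```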
Submitted 11 May, 2011; v1 submitted 24 June, 2010;
originally announced June 2010.
-
Quotient Complexity of Closed Languages
Authors:
J. Brzozowski,
G. Jirásková,
C. Zou
Abstract:
A language L is prefix-closed if, whenever a word w is in L, then every prefix of w is also in L. We define suffix-, factor-, and subword-closed languages in the same way, where by subword we mean subsequence. We study the quotient complexity (usually called state complexity) of operations on prefix-, suffix-, factor-, and subword-closed languages. We find tight upper bounds on the complexity of the prefix-, suffix-, factor-, and subword-closure of arbitrary languages, and on the complexity of boolean operations, concatenation, star and reversal in each of the four classes of closed languages. We show that repeated application of positive closure and complement to a closed language results in at most four distinct languages, while Kleene closure and complement gives at most eight languages.
Submitted 5 December, 2009;
originally announced December 2009.
-
Quotient complexity of ideal languages
Authors:
J. Brzozowski,
G. Jirásková,
B. Li
Abstract:
We study the state complexity of regular operations in the class of ideal languages. A language L over an alphabet Sigma is a right (left) ideal if it satisfies L = L Sigma* (L = Sigma* L). It is a two-sided ideal if L = Sigma* L Sigma*, and an all-sided ideal if it is the shuffle of Sigma* with L. We prefer the term "quotient complexity" instead of "state complexity", and we use derivatives to calculate upper bounds on quotient complexity, whenever it is convenient. We find tight upper bounds on the quotient complexity of each type of ideal language in terms of the complexity of an arbitrary generator and of its minimal generator, the complexity of the minimal generator, and also on the operations union, intersection, set difference, symmetric difference, concatenation, star and reversal of ideal languages.
Submitted 14 August, 2009;
originally announced August 2009.
-
Quotient Complexity of Regular Languages
Authors:
Janusz Brzozowski
Abstract:
The past research on the state complexity of operations on regular languages is examined, and a new approach based on an old method (derivatives of regular expressions) is presented. Since state complexity is a property of a language, it is appropriate to define it in formal-language terms as the number of distinct quotients of the language, and to call it "quotient complexity". The problem of finding the quotient complexity of a language f(K,L) is considered, where K and L are regular languages and f is a regular operation, for example, union or concatenation. Since quotients can be represented by derivatives, one can find a formula for the typical quotient of f(K,L) in terms of the quotients of K and L. To obtain an upper bound on the number of quotients of f(K,L) all one has to do is count how many such quotients are possible, and this makes automaton constructions unnecessary. The advantages of this point of view are illustrated by many examples. Moreover, new general observations are presented to help in the estimation of the upper bounds on quotient complexity of regular operations.
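The number of distinct quotients of a language can be computed from any complete DFA that accepts it, since each reachable state corresponds to a quotient and two states give the same quotient exactly when no word distinguishes them. The sketch below uses this automaton view rather than the derivative-based counting the abstract advocates. The dictionary encoding `delta[state][letter]` and the name `quotient_complexity` are assumptions made for the example.

```python
def quotient_complexity(alphabet, delta, start, finals):
    """Count the distinct quotients of the language of a complete DFA,
    i.e. the number of reachable, pairwise distinguishable states."""
    letters = sorted(alphabet)

    # Step 1: keep only states reachable from the initial state.
    reachable, todo = {start}, [start]
    while todo:
        q = todo.pop()
        for a in letters:
            p = delta[q][a]
            if p not in reachable:
                reachable.add(p)
                todo.append(p)

    # Step 2: refine the partition {final, non-final} until it is stable.
    block = {q: int(q in finals) for q in reachable}
    while True:
        ids = {}
        refined = {}
        for q in reachable:
            signature = (block[q], tuple(block[delta[q][a]] for a in letters))
            refined[q] = ids.setdefault(signature, len(ids))
        if len(ids) == len(set(block.values())):
            return len(ids)
        block = refined


# A 3-state DFA for "words over {a, b} ending in a"; states 0 and 2
# accept the same words, so the language has only 2 quotients.
delta = {0: {"a": 1, "b": 2},
         1: {"a": 1, "b": 2},
         2: {"a": 1, "b": 0}}
print(quotient_complexity("ab", delta, 0, {1}))
```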
Submitted 27 July, 2009;
originally announced July 2009.
-
Closures in Formal Languages: Concatenation, Separation, and Algorithms
Authors:
J. Brzozowski,
E. Grant,
J. Shallit
Abstract:
We continue our study of open and closed languages. We investigate how the properties of being open and closed are preserved under concatenation. We investigate analogues, in formal languages, of the separation axioms in topological spaces; one of our main results is that there is a clopen partition separating two words if and only if the words commute. We show that we can decide in quadratic time if the language specified by a DFA is closed, but if the language is specified by an NFA, the problem is PSPACE-complete.
Submitted 23 January, 2009;
originally announced January 2009.
-
Closures in Formal Languages and Kuratowski's Theorem
Authors:
J. Brzozowski,
E. Grant,
J. Shallit
Abstract:
A famous theorem of Kuratowski states that in a topological space, at most 14 distinct sets can be produced by repeatedly applying the operations of closure and complement to a given set. We re-examine this theorem in the setting of formal languages, where closure is either Kleene closure or positive closure. We classify languages according to the structure of the algebra they generate under iterations of complement and closure. We show that there are precisely 9 such algebras in the case of positive closure, and 12 in the case of Kleene closure.
Submitted 23 January, 2009;
originally announced January 2009.
-
Decision Problems For Convex Languages
Authors:
Janusz Brzozowski,
Jeffrey Shallit,
Zhi Xu
Abstract:
In this paper we examine decision problems associated with various classes of convex languages, studied by Ang and Brzozowski (under the name "continuous languages"). We show that we can decide whether a given language L is prefix-, suffix-, factor-, or subword-convex in polynomial time if L is represented by a DFA, but that the problem is PSPACE-hard if L is represented by an NFA. In the case that a regular language is not convex, we prove tight upper bounds on the length of the shortest words demonstrating this fact, in terms of the number of states of an accepting DFA. Similar results are proved for some subclasses of convex languages: the prefix-, suffix-, factor-, and subword-closed languages, and the prefix-, suffix-, factor-, and subword-free languages.
Submitted 12 December, 2008; v1 submitted 13 August, 2008;
originally announced August 2008.