Modal Separability of Fixpoint Formulae
Abstract
We study modal separability for fixpoint formulae: given two mutually exclusive fixpoint formulae , decide whether there is a modal formula that separates them, that is, that satisfies . This problem has applications for finding simple reasons for inconsistency. Our main contributions are tight complexity bounds for deciding modal separability and optimal ways to compute a separator if it exists. More precisely, it is ExpTime-complete in general and PSpace-complete over words. Separators can be computed in doubly exponential time in general and in exponential time over words, and this is optimal as well. The results for general structures transfer to arbitrary, finitely branching, and finite trees. The word case results hold for finite, infinite, and arbitrary words.
1 Introduction
For given logics , the -separability problem for is to decide given two -formulae whether there is an -formula that separates and in the sense that and . Obviously, a separator can only exist when and are mutually exclusive, and the problem is only meaningful when is less expressive than . Intuitively, a separator formulated in a “simpler” logic explains a given inconsistency in a “complicated” logic . Note that, for logics closed under negation, -separability generalizes the -definability problem for : decide whether a given -formula is equivalent to an -formula. Indeed, is equivalent to an -formula iff and are -separable. Since separability is more general than definability, solving it requires an even better understanding of the expressive power of the logics under consideration.
Example 1.
Consider being the modal logic , also known under the name in the context of description logics. Expressions of the logic (called formulae in terminology and concepts in description logic parlance) describe properties of colored, directed graphs with a distinguished point called the root. As the more expressive take : the extension of with regular modalities (in DL terms: the extension of with regular role expressions). Assume the graphs under consideration have edges labelled with colors , and and consider properties:
-
:
“There is a path from the root whose labeling belongs to .”
-
:
“The labeling of every (finite) path from the root belongs to .”
These (contradictory) properties are expressed by -formulae and and it is easy to see that none of them can be expressed in the weaker . Nonetheless, and are separated by a simple -formula that says: “there is an -labelled edge from the root”. Thus, serves as an easy explanation of the inconsistency of and .
Generalizing the example, in this paper we investigate -separability of formulae in the modal -calculus [19, 11], which extends [6]. is a general framework capturing logics supporting fixpoints that is relevant both for knowledge representation and reasoning and for verification. It describes all bisimulation-invariant properties definable in [8, Theorem 11] and thus encompasses virtually all specification languages such as and [2].
Our results generalize the -definability problem for which was shown decidable by Otto [16, Main Theorem]. The adaptation of the argument to the more general separability is relatively easy. However, Otto’s paper is focussed on deciding the existence of modal definitions. The problem of computing a definition when it exists is not discussed, and it seems that the formula which can be read off from the proof is at least tower-exponential big. This issue was addressed in [13]. Unfortunately, in this case the approach, although constructive, does not easily generalize from definability to separability. Matching lower bounds are also missing. We fill the gap by providing a procedure which is both fully constructive and works for the more general separation case. Both the constructed formulae and the running time are optimal, as illustrated by suitable examples and reductions.
We consider both general models and “word models” which are Kripke structures in which each point has at most one successor. The latter are relevant from a verification perspective and for temporal reasoning. In order to obtain our results we first prove model-theoretic characterizations in terms of bisimulations. We then exploit the close connection of to nondeterministic parity tree automata to give (1) optimal procedures for the separability problem and (2) upper bounds on the modal depth of a separator, if it exists. In (1) we show ExpTime-completeness of modal separability in general and PSpace-completeness over words. The lower bounds are essentially inherited from satisfiability. The upper bounds derived in (2) are then used together with the automata to compute so-called -uniform consequences, that is, modal formulae that have exactly the same modal consequences as a given -formula, up to modal depth . These -uniform consequences are then used as separators. Also here, our procedures are optimal: they compute separators of at most double exponential size, and we show that there are -formulae that are expressible in but any equivalent -formulae must have doubly exponential size. This means that there is a double exponential succinctness gap between and . In the word case, our procedures compute exponentially sized separators and there is only an exponential succinctness gap. All lower bounds (both computational and succinctness) already hold for () in place of , and for definability in place of separability.
It is interesting to note that our results hold over classes of models definable by -formulae. This observation allows us to cover the more general notion of separation in presence of an ontology (i.e. a background theory imposing some conditions on models). As long as the ontology is expressible in , separability and computation of separators reduce to the ontology-free setting. Without much effort the same observation lets us transfer our results to finite words, infinite words, and finite trees.
Missing proofs can be found in the appendix.
Related Work.
Separability has been intensively studied in formal language theory. A seminal result in this area is that separability of regular word languages by a first-order language is decidable in ExpTime [17]. Since over words defines precisely the regular languages and first-order logic captures , this is particularly related to our results over words.
In logic, a recent work investigates the complexity of separating between formulae supporting counting quantifiers by formulae that do not support them [12]. The used techniques exploit compactness, which makes them inapplicable to our case and inherently non-constructive.
Another related problem is the question of interpolant existence. An interpolant of two -formulae and is an -formula with and such that the signature of is contained in the signatures of both and . Thus, the problem resembles separability but the restriction on is in terms of the signature instead of in terms of the logic. Sometimes this question reduces to entailment, as many logics enjoy the Craig interpolation property: an interpolant of and exists whenever . Interpolant existence for logics that lack Craig interpolation has recently been studied in [10, 1]. The used tools, however, are similar in nature to the ones from [12] and therefore inapplicable to our problem.
Finally, a related problem is separability of data examples. There, the task is to separate sets of pointed structures instead of formulae (see [14, 9] and the references therein). Separability of data examples can be cast as an instance of (our logical notion of) separability if is expressive enough to describe the data examples. Conversely, -separability of formulae and is the same as data separability of the (possibly infinite) sets of their models by an -formula.
2 Preliminaries
Assuming familiarity of the reader with modal logic and the modal -calculus, we recall here only the main notions and refer to [3] for more details.
Syntax.
We consider modal logic and its fixpoint extension over a modal signature consisting of two finite sets: actions and propositions . The syntax of is given as:
with and . If is a singleton, we use and in place of and . The syntax of is obtained by extending the above with additional clauses:
where belongs to a fixed set of variables. The restriction to a fixed finite signature is only for the sake of readability. All results in the paper remain true with arbitrary signature.
Semantics.
The models we consider are pointed Kripke structures. That is, a model consists of a set (called its universe) with a distinguished point called the root, an interpretation for every and a valuation . We call the set colors and denote it by . Both and are interpreted in points of models in a standard way. Since models are by definition pointed we write meaning that the root of satisfies . The same symbol denotes entailment: means that every model of is a model of . In the case only models from some fixed class are considered we talk about satisfiability and entailment over and in the latter case write .
A particularly relevant class of models are trees. A model is a tree if the underlying directed graph is a tree with as its root. The branching or outdegree of a point is the number of its children in this underlying graph. The class of all trees is denoted by . We identify words (both finite and infinite) over alphabet with trees over a single action of outdegree at most one. Points of such models are interpreted as positions in the word, the unique accessibility relation represents the successor relation, and the valuation determines the letter at each position. A prefix of a tree is a subset of its universe closed under taking ancestors. When no confusion arises we identify a prefix with the induced subtree of that has as its universe. The depth of a point is the distance from the root. The prefix of depth (or just -prefix) is the set of all points at depth at most and is denoted by (and the corresponding subtree by ).
We define bisimulations and bisimilarity in a standard way except that in the case of trees for convenience we assume that bisimulations only link points at the same depth. An -step bisimulation (or just -bisimulation) between trees and is a bisimulation between their -prefixes. We denote -bisimilarity by .
Size of formulae.
The size of a formula , denoted , is the number of nodes in its syntax tree. Similarly, its depth is the maximal length of paths in the syntax tree. The depth of a formula should not be confused with its modal depth which is the maximal nesting of modal operators; all formulae of modal depth at most are denoted .
When we specify formulae in the paper, we use syntactic sugar , , and nabla , for finite sets of formulae . The first two are self-explanatory and allow for higher branching in the syntax tree. The last one, , intuitively means that “every formula in is true in some child and every child satisfies some formula from ” and is an abbreviation for
(1) |
It is well-known that and can be rewritten into basic syntax under polynomial cost. We also include the colors directly in the syntax: is a shorthand for the formula . Rewriting colors increases the size only by a factor linear in .
Automata.
Throughout the paper we use automata over tree models of both bounded and arbitrary outdegree. A nondeterministic parity tree automaton (NPTA) is a tuple where is a finite set of states, is the initial state, is the alphabet fixed above, and assigns each state a priority. The transition function is of type:
Intuitively, means that in the state upon reading color the automaton (i) chooses a transition and (ii) labels all the children of the current point with states from so that every is assigned to some child. A run of on a tree is an assignment consistent with in such sense and sending the root of the tree to . The run is accepting if for every infinite path in the sequence satisfies the parity condition. We write in case has an accepting run on . An automaton that is identical to except that the original initial state is replaced with is denoted . We refer with NPWA to an NPTA working over words.
In NPTAs over trees of bounded outdegree it might be more common to use a transition function of type , but the difference is not essential: our NPTAs can be represented in this way and conversely, all relevant constructions for such NPTAs can be adapted to our setting. Most importantly, we rely on the following classical result (see for example the discussion in [21] and the well-presented Dealternation Theorem 5.7 in [4]):
Theorem 1.
For every -formula , we can construct an equivalent NPTA , that is, iff , for every tree , with number of states at most exponential in . If we consider models of bounded outdegree then is computed in exponential time, otherwise in doubly exponential time.
3 Foundations of Separability
We start with recalling the notion of separability and discuss some of its basic properties.
Definition 1.
Given , a modal separator of is with and . It is a modal separator over a class if and .
The notion induces the problem of modal separability: given two -formulae , decide whether a modal separator exists. Clearly, -definability of or is a sufficient condition for the existence of a modal separator between . However, Example 1 shows that it is not a necessary one: neither nor are -definable, yet a separator exists. We make some foundational observations.
Inspired by the notion of Craig interpolation, one could also consider the notion of a Craig modal separator, which is a modal separator of which only uses symbols occurring in both and . However, based on the fact that enjoys Craig interpolation, we show in Theorem 2 (proof in Appendix A) that Craig modal separability and modal separability coincide. Since enjoys Craig interpolation over many classes of models [15, Theorem 1], Theorem 2 remains true over all classes of models considered below. We thus focus on modal separability.
Theorem 2.
admit a modal separator iff they admit a Craig modal separator.
Inspired by the notion of uniform interpolation [23, 5], it is natural to ask whether every admits a uniform modal separator, that is, a formula that is a modal separator of for every with . However, substituting for we get that the uniform modal separator for is actually equivalent to . Consequently, a -formula has a uniform modal separator iff it is modally definable. This is contrast with the fact that both [23] and [5] enjoy uniform interpolation.
Since has both the finite model property and the (finitely branching) tree model property, the notions of a modal separator over finite models, arbitrary tree models, and finitely branching tree models all coincide with modal separator (over arbitrary models). Unsurprisingly, this does not apply to the class of all finite trees.
Example 2.
Consider a -formula expressing that there exists an infinite path originating in the root. It is satisfiable, but unsatisfiable over finite trees. Thus is an -definition of over finite trees, but is not -definable (over arbitrary models).
We deal with separability over finite trees as follows. Call a class of models -definable in if there is a -formula such that iff , for all models .
Lemma 1.
Let be -definable in by and let . Then is a modal separator of over iff is a modal separator of and over .
Intuitively, Lemma 1 provides us with a reduction of modal separability over to modal separability over (the larger) . It has a number of interesting consequences. First, observe that the formula from Example 2 defines the class of finite trees in the class of all finitely branching trees. Hence provides a reduction of modal separability over finite trees to modal separability over finitely branching trees, and thus to modal separability. Similarly, and again using , Lemma 1 reduces modal separability over finite words and over infinite words to modal separability over (arbitrary) words. Finally, the lemma can be used to reduce modal separability relative to background knowledge to modal separability. Call a modal separator of relative to if it is a modal separator of over the class of models satisfying in every point. This setting is most relevant for the DL community since plays the role of an ontology. In particular, the question whether two -concepts are separable by an -concept relative to an -ontology is an instance of that problem (recall that every -concept can be expressed as a -formula). Let be the -formula expressing that is satisfied in every point reachable via the accessibility relations. Using Lemma 1 and bisimulation invariance of , it is routine to verify that is a modal separator of relative to iff is a modal separator of and .
In view of what was said so far, we will from now on concentrate on deciding modal separability over general and word models and computing separators if they exist. A main ingredient for both tasks is to show that if there is a modal separator for -formula , then there is one of modal depth at most exponential in the size of and . As a necessary tool for showing this exponential bound on , and for efficiently deciding if a given suffices, in Appendix B we establish the following model-theoretic characterization. Fix for the rest of the paper and denote their size by .
Proposition 1.
Let . The following are equivalent:
-
(i)
There is of modal depth separating and ;
-
(ii)
For all models and bisimilar up to depth : implies ;
-
(iii)
For all trees and identical up to depth : implies ;
-
(iv)
For all trees and identical up to depth and whose branching is bounded by :
implies .
Based on Proposition 1, we show that -separability of -formulae is ExpTime-complete and thus not harder than -definability.
Theorem 3.
Modal separability of -formulae is ExpTime-complete over arbitrary models.
ExpTime-hardness already holds for -definability and is proved by an immediate
reduction from -satisfiability, which is ExpTime-complete
already for its fragment [6, Section
4].
It is not hard to modify the original hardness proof
for -satisfiability to work over finite trees, so
Theorem 3 remains valid over finite trees as
well.
For the upper bound, we mostly follow the technical development
in [16]. Thanks to Proposition 1 separability is equivalent to the existence of for which condition (iv) holds. This can be expressed as an statement about the full -ary tree, and thus decided. However, for optimal complexity and to extract bounds that we use later, in Appendix C we apply a lower-level automata-theoretic analysis.
Over words, we essentially follow the same approach. Since the tree automata used in the proof of Theorem 3 can be replaced by word automata, the complexity drops to PSpace (see Appendix D). A matching lower bound can be derived as above by a reduction from satisfiability in over words [20, Theorem 4.1] (which, in fact, can be rephrased in terms of ).
Theorem 4.
Modal separability of -formulae is PSpace-complete over words.
As announced, an important step in the proofs of the upper bounds, both in the case with arbitrary models and with words, is the following proposition which we will also use later.
Proposition 2.
If are separable then they are separable by a formula of modal depth exponential in their size . The same is true over words.
In the remainder of the paper we will deal with computing separators based on Proposition 2. Before we proceed let note that our approach differs from the treatment of modal definability from [13]. There, the authors rewrite given into modal in such a way that if the initial is modally definable then and are equivalent. In the case when is not modally definable, however, the output is rather random. For example, obtained from the formula from Example 2 is equivalent to which is not even a consequence of . Thus, a different construction is needed to obtain separators. We will actually compute something slightly stronger that might be of independent interest.
Definition 2.
Given and , a formula is an -uniform consequence of if, for all :
An analogous notion relative to a fixed class of models is obtained by replacing with .
In words: is an -uniform consequence of if it has modal depth , is a consequence of , and entails every other consequence of of modal depth . In particular, if and are separable by some modal formula of modal depth and is an -uniform consequence of , then this separates from as well. Observe that -uniform consequences exist for every and . Indeed, given and we can obtain an -uniform consequence of by taking the disjunction of all -types consistent with . Here, by an -type we mean a maximal consistent subset of . Since up to equivalence there are only finitely many formulae in , each -type can be represented as a single -formula and the mentioned disjunction is well-defined.
In view of Proposition 2, it thus suffices to compute -uniform consequences of . Unfortunately, the naive construction given above is nonelementary in the size of the separated formulae and . In the next sections we give better constructions.
4 Optimal Separators: Arbitrary Models
We construct doubly exponentially sized separators and provide matching lower bounds.
4.1 Construction
Theorem 5.
If and are modally separable then a separator of size doubly exponential in exists and can be computed in doubly exponential time.
The above is a consequence of the following lemma.
Lemma 2.
For every and , one can construct an -uniform consequence of with branching doubly exponential in and depth linear in .
We show how Theorem 5 follows from Lemma 2. Proposition 2 guarantees that if a modal separator for and exists then there is one with modal depth exponential in . Since entails this separator it follows that is a separator itself.
The branching of is at most doubly exponential in and thus also in : for some constant . The depth of is linear in and therefore for some . Altogether this means that the size of :
is at most doubly exponential in . It remains to prove Lemma 2.
Proof.
Let be the NPTA equivalent to with exponentially many states, which exists due to Theorem 1. For each and we construct of branching such that:
(2) |
for every structure . Then, is our desired -uniform consequence of .
We proceed by induction on . For the base case we put:
which clearly satisfies the induction goal (2). For the induction step define:
Fix , denote the color of the root by and the set of all children of the root by . If satisfies then there is such that nabla of is satisfied in the root. By invariance under bisimulation we may assume that the root of has sufficiently many children to find a separate witness for each . That is, we assume a surjective assignment that maps every to some formula true in . By induction hypothesis, for each with there is a model -bisimilar to the subtree of rooted in . Define as follows: first take the disjoint union of all the ’s and a fresh point of color ; then for every add an edge from to the root of and set as the new root. It is easy to see that and , as desired.
Conversely, assume and witnessed by an -bisimulation and a run . Denote the children of the root of by . Since is a run, the set of states assigned to belongs to . Every is -bisimilar to some and hence by the induction hypothesis satisfies for . Symmetrically, for every there is accepted by . Since that is -bisimilar to some , by induction hypothesis satisfies . It follows that the root of satisfies and therefore also . ∎
Let us remark that Lemma 2 can be easily adapted to deal with vocabulary restrictions. That is, given we could construct similar to but only using atomic propositions from and only entailing -consequences of whose vocabulary is contained in . To that end, it suffices to project-out atomic propositions not in from the automaton and only then proceed with our construction. The automaton obtained by such a projection accepts a model iff is -bisimilar to some . It follows that for every :
where the first equivalence uses and the latter . Such -uniform consequence of can then be taken as a Craig modal separator, in the same way as serves as a modal separator.
4.2 Lower bounds
For the lower bounds, we show that over arbitrary structures (in fact, already binary trees) is doubly exponentially more succinct than . The example is essentially taken from [7, Section 3.1]. There the authors use game-theoretic tools which are later applied to more complicated cases. Since we are only interested in this example, we provide a straightforward self-contained argument.
Proposition 3.
There is a sequence of -formulae of size polynomial in such that each is equivalent to a -formula but every equivalent to has size at least .
Proof.
We only give a sketch, the details are found in Appendix E. We assume two different actions and . For each consider the property:
-
:
“No path (over all actions) longer than starts in the root.”
This can be enforced by encoding an -bit binary counter into the structure of the model, and requiring that on every path the counter values are strictly increasing. Let be this (technically stronger) property expressing the behavior of the encoded counter. Assuming that the encoding is reasonably efficient, can be easily expressed by a -formula of size polynomial in (in fact, a weak fragment of is already sufficient). Since the lengths of paths are bounded, can be also expressed in .
However, every equivalent to has size at least . The reason is that for every sequence of actions and of length , the syntax tree of must contain a descending sequence of subformulae of length such that the -th subformula begins with a modal operator corresponding to the -th action. This allows to embed a binary tree of height into the syntax tree of . ∎
Note that the presence of two different actions and is essential for the argument. We conjecture that is doubly exponentially more succinct than already in the monomodal setting. Consider the following Property , parameterized by :
-
:
“ and there exists a maximal path on which the number of points satisfying is even.”
where is the same as in Proposition 3. It is not difficult to come up with small, that is, of size polynomial in , -formulae expressing . Unfortunately, proving that no small formula can be equivalent to seems difficult. For instance, consider models where every non-leaf point has a child satisfying and a child satisfying . Then a trick similar to the famous example of Potthoff (showing, roughly, that the language of all binary trees of even depth is first-order definable) [18, Example 1] can be exploited to get a modal formula equivalent to (over such models), but of size only single exponential in . Moreover, the results in the next Section 5 show that looking at words only is not sufficient either.
5 Optimal Separators: Word Case
In this section we show that optimal modal separators (over words) can be computed exponentially faster and are exponentially smaller compared to the case with arbitrary models.
Theorem 6.
If and are modally separable over words, then a separator of size exponential in exists and can be computed in exponential time.
As with arbitrary models, Proposition 2 gives an upper bound on the modal depth of a separator and so it suffices to construct -uniform consequences of of small size.
We illustrate the idea first. Consider the classes EVENn and ODDn, of all word structures of length in which proposition is satisfied in an even and odd, respectively, number of points. Constructing modal formulae and defining EVENn and ODDn in the following, naive way leads to exponential formulae since contains both and and :
This exponential blow-up can be avoided, however, using “divide-and-conquer” as follows:
Although several copies of formulae of smaller index are used as well, but since the index is halved, we end up with formulae of roughly quadradic size. The proof of the following analogue of Lemma 2 relies on this idea.
Lemma 3.
For every and every NPWA with states , one can construct a formula which is ’s -uniform consequence over words and has size polynomial in and . The construction requires polynomial time.
To see that Lemma 3 implies Theorem 6, let and admit a modal separator over words. Let be an NPWA that is equivalent to . By Theorem 1, has exponentially many states and can be computed in exponential time. Proposition 2 implies that there is a modal separator of modal depth at most exponential in . As with arbitrary models, ’s -uniform consequence from Lemma 3 is the sought separator. We now prove the lemma.
Proof.
Let be an NPWA. The main idea is to construct, for every and , a formula such that for every input word :
The key step is the recursive splitting similar to the definitions of EVENn and ODDn above. Intuitively, is the disjunction over all of the conditions “there is a run from in the initial position to in position , and a run from in position to in position .” The latter conditions are recursively expressed using and . The constructed formulas , are then used to describe all possible -prefixes of models of . The details of the construction can be found in Appendix F. ∎
We conclude the section with the comment that Theorem 6 is optimal in the sense that there are modally separable formulae which require a large separator. We actually show the following stronger statement implying that, over words, is exponentially more succinct than .
Proposition 4.
There is a sequence of -formulae of size polynomial in such that each is equivalent to a -formula but every equivalent to has size at least .
The proof is entirely standard. The main idea is that, already in one can stipulate (with a small formula) a finite word of exponential length. Clearly, any -formula expressing this requires exponential size. The only difficulty is doing it with a fixed signature: instead of encoding -bit counters using propositions, we use just two propositions and encode numbers in consecutive points.
6 Conclusion and Open Problems
We have studied the problem of deciding separability of -formulae by fixpoint free formulae from , and computing separators if they exist. Our results cover several interesting classes of models such as trees, finite trees, and words. Due to the great expressivity of the results remain valid in the presence of ontologies.
A notably missing case is the class of trees of fixed outdegree independent from formulae. This is surprisingly different from the classes we studied. The key difficulty here lies in the fact that the implication (iii) (ii) from Proposition 1 is not true over such trees.
An intriguing challenge left for future study is to look at extensions of and/or . Natural extensions are inverse modalities, the universal modality, graded modalities, and constants (corresponding to inverse roles, the universal role, counting quantifiers, and nominals in DL speech). We expect the adaptation to inverse modalities to be only minor. Also graded modalities look innocent if they are allowed both in the larger logic and in the separator logic. If we only extend with graded modalities and ask for separators in (without graded modailites), we would have to combine our techniques with the ones from [12], which is potentially challenging. We expect universal modality and/or constants to pose more technical difficulties as well. Intuitively, adding a universal modality or constants leads to the loss of the strong locality underlying Proposition 1.
References
- [1] Alessandro Artale, Jean Christoph Jung, Andrea Mazzullo, Ana Ozaki, and Frank Wolter. Living without beth and craig: Definitions and interpolants in description and modal logics with nominals and role inclusions. ACM Trans. Comput. Log., 24(4):34:1–34:51, 2023.
- [2] Christel Baier and Joost-Pieter Katoen. Principles of model checking. MIT Press, 2008.
- [3] Patrick Blackburn, J. F. A. K. van Benthem, and Frank Wolter, editors. Handbook of Modal Logic, volume 3 of Studies in logic and practical reasoning. North-Holland, 2007.
- [4] Mikołaj Bojańczyk and Wojciech Czerwiński. Automata Toolbox. Univrsity of Warsaw, 2018.
- [5] Giovanna D’Agostino and Marco Hollenberg. Uniform interpolation, automata and the modal -calculus. In Marcus Kracht, Maarten de Rijke, Heinrich Wansing, and Michael Zakharyaschev, editors, Advances in Modal Logic 1, papers from the first workshop on "Advances in Modal logic," held in Berlin, Germany, 8-10 October 1996, pages 73–84. CSLI Publications, 1996.
- [6] Michael J. Fischer and Richard E. Ladner. Propositional dynamic logic of regular programs. J. Comput. Syst. Sci., 18(2):194–211, 1979.
- [7] Tim French, Wiebe van der Hoek, Petar Iliev, and Barteld P. Kooi. On the succinctness of some modal logics. Artif. Intell., 197:56–85, 2013.
- [8] David Janin and Igor Walukiewicz. On the expressive completeness of the propositional mu-calculus with respect to monadic second order logic. In Ugo Montanari and Vladimiro Sassone, editors, CONCUR ’96: Concurrency Theory, pages 263–277, 1996.
- [9] Jean Christoph Jung, Carsten Lutz, Hadrien Pulcini, and Frank Wolter. Logical separability of labeled data examples under ontologies. Artif. Intell., 313:103785, 2022.
- [10] Jean Christoph Jung and Frank Wolter. Living without beth and craig: Definitions and interpolants in the guarded and two-variable fragments. In 36th Annual ACM/IEEE Symposium on Logic in Computer Science, LICS 2021, Rome, Italy, June 29 - July 2, 2021, pages 1–14. IEEE, 2021.
- [11] Dexter Kozen. Results on the propositional mu-calculus. Theor. Comput. Sci., 27:333–354, 1983.
- [12] Louwe Kuijer, Tony Tan, Frank Wolter, and Michael Zakharyaschev. Separating counting from non-counting in fragments of two-variable first-order logic (extended abstract). In Proc. of DL 2024, 2024.
- [13] Karoliina Lehtinen and Sandra Quickert. Deciding the first levels of the modal mu alternation hierarchy by formula construction. In Stephan Kreutzer, editor, 24th EACSL Annual Conference on Computer Science Logic, CSL 2015, September 7-10, 2015, Berlin, Germany, volume 41 of LIPIcs, pages 457–471. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2015.
- [14] Denis Mayr Lima Martins. Reverse engineering database queries from examples: State-of-the-art, challenges, and research opportunities. Inf. Syst., 83:89–100, 2019.
- [15] Maarten Marx. Interpolation in modal logic. In International Conference on Algebraic Methodology and Software Technology, 1999.
- [16] Martin Otto. Eliminating recursion in the -calculus. In Christoph Meinel and Sophie Tison, editors, STACS 99, 16th Annual Symposium on Theoretical Aspects of Computer Science, Trier, Germany, March 4-6, 1999, Proceedings, volume 1563 of Lecture Notes in Computer Science, pages 531–540. Springer, 1999.
- [17] Thomas Place and Marc Zeitoun. Separating regular languages with first-order logic. Log. Methods Comput. Sci., 12(1), 2016.
- [18] Andreas Potthoff. First-order logic on finite trees. In Theory and Practice of Software Development, 1995.
- [19] Vaughan R. Pratt. A decidable mu-calculus: Preliminary report. In 22nd Annual Symposium on Foundations of Computer Science, Nashville, Tennessee, USA, 28-30 October 1981, pages 421–427. IEEE Computer Society, 1981.
- [20] A. Prasad Sistla and Edmund M. Clarke. The complexity of propositional linear temporal logics. J. ACM, 32(3):733–749, 1985.
- [21] Moshe Y. Vardi. Reasoning about the past with two-way automata. In Kim Guldstrand Larsen, Sven Skyum, and Glynn Winskel, editors, Automata, Languages and Programming, 25th International Colloquium, ICALP’98, Aalborg, Denmark, July 13-17, 1998, Proceedings, volume 1443 of Lecture Notes in Computer Science, pages 628–641. Springer, 1998.
- [22] Yde Venema. Lectures on the modal -calculus, 2020.
- [23] Albert Visser. Uniform interpolation and layered bisimulation. In Petr Hájek, editor, Gödel ’96: Logical Foundations of Mathematics, Computer Science and Physics - Kurt Gödel’s Legacy, Lecture Notes in Logic, page 139–164. Springer, 1996.
Throughout the appendix we use the term tallness. The tallness of a tree is the distance from the root to the closest leaf (or if it the tree has no leafs).
Appendix A Separability Coincides with Craig Separability
We prove Theorem 2 which says that are modally separable iff they are Craig modally separable.
Proof.
Clearly, any Craig modal separator is also a modal separapator.
Conversely, suppose there is a modal separator of , and let be its modal depth. We use the -uniform consequences as defined in Definition 2. Let be the -uniform consequence of and the -uniform consequence of (both exist, see the discussion after Definition 2). Note that . By definition of -uniform consequence, we have and , and thus . Since enjoys Craig interpolation, there is an interpolant for which is also a Craig modal separator of . ∎
Appendix B Model-theoretic Characterization
We prove Proposition 1.
The implication (i) (iv) is straightforward. Separator cannot distinguish models identical up to depth . If then and so which in turn implies .
Our argument for (iv)(iii) uses a classical characterization of the semantics of formulae in terms of parity games. Since we do not want to introduce games we only sketch the (easy) construction; the necessary definitions can be found in [22].
We prove the implication by contrapositive. Assume and identical up to depth . The fact that is equivalent to existence of a winning strategy in an appropriately defined parity game . Positions of the game are pairs: with and subformula of and a move from to is only allowed if . A similar game captures the semantics of .
Take positional winning strategies and for in the games and . We trim : only keep the points that belong to a position chosen by in either a - or -play, and remove all the other ones. After that, also remove points that become inaccessible from the root so that the resulting structure is a tree. The tree obtained this way still satisfies because the strategy remains winning. For every both games together only have positions containing . Thus, by positionality, among the children of only at most many belong to a position chosen by or . Since only such points belong to , it follows that has branching at most . We trim to in the same way. By construction and are identical up to depth which violates (iv).
To prove (iii) (ii) assume towards contradiction that there are models and linked by an -bisimulation . We construct trees and identical up to depth , therefore reaching a contradiction with (iii).
Without loss of generality we assume that and are trees, otherwise they can be unravelled. The tree is as follows. It has universe:
and the pair consisting of the roots of and (which by assumption belongs to ) is taken as the new root. Edges between pairs from are defined pointwise (that is: iff and ). Edges between points in are taken from . On top of that, we add iff in the point is at depth precisely and . The colors are inherited from on and from whichever coordinate on (points linked by always have the same color).
Consider the function defined as the left projection on and identity on :
The graph of is a bisimulation between and . In particular, satisfies . We define satisfying symmetrically. Then and satisfy and and are identical up to depth (the -prefixes of both and are equal ). Technically, and are directed acyclic graphs but not necessarily trees. However, they can be turned into trees by removing inaccessible points and unravelling. Both these operations preserve satisfaction of and and identity of -prefixes.
Appendix C Separability over All Models is in ExpTime
We prove the upper bounds in Theorem 3. By Proposition 1 deciding modal separability boils down to checking if there is for which (iv) holds. Using well-known properties of and (namely, closure under relativization) we may reduce the problem to the special case when models under consideration are full -ary trees, meaning that every point has precisely children. Under this assumption the graphs underlying models are all isomorphic to the (unlabelled) full -ary tree . Hence, we identify models with valuations . Let us call a finite prefix sufficient for separation if for every identical on : implies . Denote the set of such prefixes:
It follows directly from the definition that for every :
The -prefix of is in Condition (iv) is true for . | (3) |
The set can be viewed as a language of finite trees. This language is closed under taking finite supermodels, so it contains for some iff it is nonempty. It follows that separability of and is equivalent to nonemptiness of .
We construct an automaton that accepts finite trees not belonging to . Take automata and equivalent to and of size exponential in . The idea is that given a finite tree the automaton guesses , and that can be extended to a valuation and accepting runs of and of consistent with . has the set as states and is the initial one. Transition function is defined in two steps. First take:
There is , such that the left projection of | ||
This describes guessing a coloring on the full and runs of both and consistent with that coloring. To handle points with less than children we put:
can be obtained from some | ||
by removing some consistent pairs. |
Here consistency of a pair of states means that there exists a model accepted by both and . The (trivial) rank function of assigns a bad rank to every state so that only finite trees are accepted.
To finish the proof let us first prove Proposition 2.
Proof.
Put . The proposition follows directly from (3) and:
is nonempty it contains the -prefix of . | (4) |
Only the left-to-right implication is nontrivial and we prove it by contrapositive. If does not belong to then this is witnessed by a run of on . Since every leaf of is at depth greater then , on the path from the root to some state must repeat. Hence, the run can be pumped to finite prefixes of of arbitrarily big tallness. It follows that is empty. This completes the proof of Proposition 2. ∎
Appendix D Separability over Words is in PSpace
We prove the upper bounds in Theorem 4. We proceed with the same argument as in the case with general models, with two major simplifications. First, over words bisimilarity and identity coincide. Thus the equivalence of items (ii), (iii) and (iv) of Proposition 1 becomes trivial in the word case. Second, bounded tallness (which over words is the same as finiteness) of the language of the appropriate word automaton can be checked more easily. Instead of writing down the automaton we construct it on the fly, and only remember a single state at every moment. This yields a PSpace (and not ExpTime) decision procedure.
Appendix E Lower Bounds over All Models
We construct formulae of for which there exist equivalent formulae in but only doubly exponentially larger.
Consider the nonstandard modal operator which means “there exists a point reachable from the root and satisfying”. It is straightforward to encode such modality (and its dual meaning “for every reachable point”) by a formula of constant size. It therefore suffices to define the sequence in the extension of with such operators.
We construct . For convenience assume atomic propositions (otherwise we encode these without a significant blowup in the size of ). The propositions can be seen as an encoding of an -bit binary counter. Consider the following property of models: the color of the root encodes counter value and for every (reachable) point if the color of encodes number then either and has no children or all ’s children have a color that encodes . The property can be easily expressed by a formula of size polynomial in . Such is equivalent to a modal formula. This follows immediately from Proposition 1 since property implies that there are no paths longer than .
However, no formula smaller than can be equivalent to . To see this assume equivalent to . For every sequence of colors of length , must contain a sequence of subformulae such that:
-
•
is a strict subformula of whenever ;
-
•
each begins with or ;
-
•
the least strict superformula of beginning with a modal operator is .
If did not contain such sequence , it would be indifferent to what happens in points only reachable via paths labelled with . It follows that there is an embedding of the -prefix of the full binary tree to the syntax tree of and so it has size at least .
Note that the use of multiple atomic propositions in the above construction is only for convenience. With any reasonable encoding the proof could be adapted even to the setting with .
Appendix F Optimal Separators over Words
We prove Lemma 3. Recall that is the automaton under consideration. Since we are working over words, contains only singleton sets and possibly the emtpy set. The latter case, , is of particular interest because this means that the automaton in state reading color can “accept” even if it has not finished reading the input; in particular, the automaton can accept finite words as well. Denote with the set of colors with . Further denote with the set of all such that accepts a word starting with color .
We first construct formulae , for such that for every word :
-
iff there is a run of from to on the -prefix of .
The definition is by induction on :
It is routine to verify that satisfies and is of size .
We finish the construction by setting:
It is readily checked that satisfies the required size bounds. To verify that for every with , we show the following equivalence for all words :
(5) |
For , let be a word structure with . If for some , then by , there is a run of from the initial state to some state when reading the -prefix of , and the last color in the prefix is . Since , we can extend the -prefix of to a word accepted by . If , for some and , then is a finite word of depth that is accepted by the automaton. We can take in this case.
For , let be a word such that there is some with . The former condition implies that and thus there is an accepting run of on , and the latter implies that and coincide on their -prefixes. We distinguish cases. If the depth of is greater than , then the -prefix of ending in state witnesses . Otherwise, the depth of is and the run ending in witnesses that .