Modal Separability of Fixpoint Formulae

Jean Christoph Jung    Jędrzej Kołodziejski
(TU Dortmund University)
Abstract

We study modal separability for fixpoint formulae: given two mutually exclusive fixpoint formulae φ,φ𝜑superscript𝜑\varphi,\varphi^{\prime}italic_φ , italic_φ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT, decide whether there is a modal formula ψ𝜓\psiitalic_ψ that separates them, that is, that satisfies φψ¬φmodels𝜑𝜓modelssuperscript𝜑\varphi\models\psi\models\neg\varphi^{\prime}italic_φ ⊧ italic_ψ ⊧ ¬ italic_φ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT. This problem has applications for finding simple reasons for inconsistency. Our main contributions are tight complexity bounds for deciding modal separability and optimal ways to compute a separator if it exists. More precisely, it is ExpTime-complete in general and PSpace-complete over words. Separators can be computed in doubly exponential time in general and in exponential time over words, and this is optimal as well. The results for general structures transfer to arbitrary, finitely branching, and finite trees. The word case results hold for finite, infinite, and arbitrary words.

1 Introduction

For given logics ,+superscript\mathcal{L},\mathcal{L}^{+}caligraphic_L , caligraphic_L start_POSTSUPERSCRIPT + end_POSTSUPERSCRIPT, the \mathcal{L}caligraphic_L-separability problem for +superscript\mathcal{L}^{+}caligraphic_L start_POSTSUPERSCRIPT + end_POSTSUPERSCRIPT is to decide given two +superscript\mathcal{L}^{+}caligraphic_L start_POSTSUPERSCRIPT + end_POSTSUPERSCRIPT-formulae φ,φ𝜑superscript𝜑\varphi,\varphi^{\prime}italic_φ , italic_φ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT whether there is an \mathcal{L}caligraphic_L-formula ψ𝜓\psiitalic_ψ that separates φ𝜑\varphiitalic_φ and φsuperscript𝜑\varphi^{\prime}italic_φ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT in the sense that φψmodels𝜑𝜓\varphi\models\psiitalic_φ ⊧ italic_ψ and ψ¬φmodels𝜓superscript𝜑\psi\models\neg\varphi^{\prime}italic_ψ ⊧ ¬ italic_φ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT. Obviously, a separator can only exist when φ𝜑\varphiitalic_φ and φsuperscript𝜑\varphi^{\prime}italic_φ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT are mutually exclusive, and the problem is only meaningful when \mathcal{L}caligraphic_L is less expressive than +superscript\mathcal{L}^{+}caligraphic_L start_POSTSUPERSCRIPT + end_POSTSUPERSCRIPT. Intuitively, a separator formulated in a “simpler” logic \mathcal{L}caligraphic_L explains a given inconsistency in a “complicated” logic +superscript\mathcal{L}^{+}caligraphic_L start_POSTSUPERSCRIPT + end_POSTSUPERSCRIPT. Note that, for logics +superscript\mathcal{L}^{+}caligraphic_L start_POSTSUPERSCRIPT + end_POSTSUPERSCRIPT closed under negation, \mathcal{L}caligraphic_L-separability generalizes the \mathcal{L}caligraphic_L-definability problem for +superscript\mathcal{L}^{+}caligraphic_L start_POSTSUPERSCRIPT + end_POSTSUPERSCRIPT: decide whether a given +superscript\mathcal{L}^{+}caligraphic_L start_POSTSUPERSCRIPT + end_POSTSUPERSCRIPT-formula is equivalent to an \mathcal{L}caligraphic_L-formula. Indeed, φ+𝜑superscript\varphi\in\mathcal{L}^{+}italic_φ ∈ caligraphic_L start_POSTSUPERSCRIPT + end_POSTSUPERSCRIPT is equivalent to an \mathcal{L}caligraphic_L-formula iff φ𝜑\varphiitalic_φ and ¬φ𝜑\neg\varphi¬ italic_φ are \mathcal{L}caligraphic_L-separable. Since separability is more general than definability, solving it requires an even better understanding of the expressive power of the logics under consideration.

Example 1.

Consider \mathcal{L}caligraphic_L being the modal logic 𝖬𝖫𝖬𝖫\mathsf{ML}sansserif_ML, also known under the name 𝒜𝒞𝒜𝒞{\cal ALC}caligraphic_A caligraphic_L caligraphic_C in the context of description logics. Expressions of the logic (called formulae in 𝖬𝖫𝖬𝖫\mathsf{ML}sansserif_ML terminology and concepts in description logic parlance) describe properties of colored, directed graphs with a distinguished point called the root. As the more expressive +superscript\mathcal{L}^{+}caligraphic_L start_POSTSUPERSCRIPT + end_POSTSUPERSCRIPT take 𝖯𝖣𝖫𝖯𝖣𝖫\mathsf{PDL}sansserif_PDL: the extension of 𝖬𝖫𝖬𝖫\mathsf{ML}sansserif_ML with regular modalities (in DL terms: the extension 𝒜𝒞reg𝒜subscript𝒞reg{\cal ALC}_{\textit{reg}}caligraphic_A caligraphic_L caligraphic_C start_POSTSUBSCRIPT reg end_POSTSUBSCRIPT of 𝒜𝒞𝒜𝒞{\cal ALC}caligraphic_A caligraphic_L caligraphic_C with regular role expressions). Assume the graphs under consideration have edges labelled with colors A𝐴Aitalic_A, B𝐵Bitalic_B and C𝐶Citalic_C and consider properties:

  • P𝑃Pitalic_P:

    “There is a path from the root whose labeling belongs to A+Bsuperscript𝐴𝐵A^{+}Bitalic_A start_POSTSUPERSCRIPT + end_POSTSUPERSCRIPT italic_B.”

  • Psuperscript𝑃P^{\prime}italic_P start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT:

    “The labeling of every (finite) path from the root belongs to Csuperscript𝐶C^{*}italic_C start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT.”

These (contradictory) properties are expressed by 𝖯𝖣𝖫𝖯𝖣𝖫\mathsf{PDL}sansserif_PDL-formulae φ𝜑\varphiitalic_φ and φsuperscript𝜑\varphi^{\prime}italic_φ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT and it is easy to see that none of them can be expressed in the weaker 𝖬𝖫𝖬𝖫\mathsf{ML}sansserif_ML. Nonetheless, φ𝜑\varphiitalic_φ and φsuperscript𝜑\varphi^{\prime}italic_φ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT are separated by a simple 𝖬𝖫𝖬𝖫\mathsf{ML}sansserif_ML-formula ψ𝜓\psiitalic_ψ that says: “there is an A𝐴Aitalic_A-labelled edge from the root”. Thus, ψ𝜓\psiitalic_ψ serves as an easy explanation of the inconsistency of φ𝜑\varphiitalic_φ and φsuperscript𝜑\varphi^{\prime}italic_φ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT.

Generalizing the example, in this paper we investigate 𝖬𝖫𝖬𝖫\mathsf{ML}sansserif_ML-separability of formulae in the modal μ𝜇\muitalic_μ-calculus μ𝖬𝖫𝜇𝖬𝖫\mu\mathsf{ML}italic_μ sansserif_ML [19, 11], which extends 𝖯𝖣𝖫𝖯𝖣𝖫\mathsf{PDL}sansserif_PDL [6]. μ𝖬𝖫𝜇𝖬𝖫\mu\mathsf{ML}italic_μ sansserif_ML is a general framework capturing logics supporting fixpoints that is relevant both for knowledge representation and reasoning and for verification. It describes all bisimulation-invariant properties definable in 𝖬𝖲𝖮𝖬𝖲𝖮{\sf MSO}sansserif_MSO [8, Theorem 11] and thus encompasses virtually all specification languages such as 𝖫𝖳𝖫𝖫𝖳𝖫\mathsf{LTL}sansserif_LTL and 𝖢𝖳𝖫𝖢𝖳𝖫\mathsf{CTL}sansserif_CTL [2].

Our results generalize the 𝖬𝖫𝖬𝖫\mathsf{ML}sansserif_ML-definability problem for μ𝖬𝖫𝜇𝖬𝖫\mu\mathsf{ML}italic_μ sansserif_ML which was shown decidable by Otto [16, Main Theorem]. The adaptation of the argument to the more general separability is relatively easy. However, Otto’s paper is focussed on deciding the existence of modal definitions. The problem of computing a definition when it exists is not discussed, and it seems that the formula which can be read off from the proof is at least tower-exponential big. This issue was addressed in [13]. Unfortunately, in this case the approach, although constructive, does not easily generalize from definability to separability. Matching lower bounds are also missing. We fill the gap by providing a procedure which is both fully constructive and works for the more general separation case. Both the constructed formulae and the running time are optimal, as illustrated by suitable examples and reductions.

We consider both general models and “word models” which are Kripke structures in which each point has at most one successor. The latter are relevant from a verification perspective and for temporal reasoning. In order to obtain our results we first prove model-theoretic characterizations in terms of bisimulations. We then exploit the close connection of μ𝖬𝖫𝜇𝖬𝖫\mu\mathsf{ML}italic_μ sansserif_ML to nondeterministic parity tree automata to give (1) optimal procedures for the separability problem and (2) upper bounds on the modal depth of a separator, if it exists. In (1) we show ExpTime-completeness of modal separability in general and PSpace-completeness over words. The lower bounds are essentially inherited from satisfiability. The upper bounds derived in (2) are then used together with the automata to compute so-called n𝑛nitalic_n-uniform consequences, that is, modal formulae that have exactly the same modal consequences as a given μ𝖬𝖫𝜇𝖬𝖫\mu\mathsf{ML}italic_μ sansserif_ML-formula, up to modal depth n𝑛nitalic_n. These n𝑛nitalic_n-uniform consequences are then used as separators. Also here, our procedures are optimal: they compute separators of at most double exponential size, and we show that there are μ𝖬𝖫𝜇𝖬𝖫\mu\mathsf{ML}italic_μ sansserif_ML-formulae that are expressible in 𝖬𝖫𝖬𝖫\mathsf{ML}sansserif_ML but any equivalent 𝖬𝖫𝖬𝖫\mathsf{ML}sansserif_ML-formulae must have doubly exponential size. This means that there is a double exponential succinctness gap between μ𝖬𝖫𝜇𝖬𝖫\mu\mathsf{ML}italic_μ sansserif_ML and 𝖬𝖫𝖬𝖫\mathsf{ML}sansserif_ML. In the word case, our procedures compute exponentially sized separators and there is only an exponential succinctness gap. All lower bounds (both computational and succinctness) already hold for 𝖯𝖣𝖫𝖯𝖣𝖫\mathsf{PDL}sansserif_PDL (𝒜𝒞reg𝒜subscript𝒞reg{\cal ALC}_{\textit{reg}}caligraphic_A caligraphic_L caligraphic_C start_POSTSUBSCRIPT reg end_POSTSUBSCRIPT) in place of μ𝖬𝖫𝜇𝖬𝖫\mu\mathsf{ML}italic_μ sansserif_ML, and for definability in place of separability.

It is interesting to note that our results hold over classes of models definable by μ𝖬𝖫𝜇𝖬𝖫\mu\mathsf{ML}italic_μ sansserif_ML-formulae. This observation allows us to cover the more general notion of separation in presence of an ontology (i.e. a background theory imposing some conditions on models). As long as the ontology is expressible in μ𝖬𝖫𝜇𝖬𝖫\mu\mathsf{ML}italic_μ sansserif_ML, separability and computation of separators reduce to the ontology-free setting. Without much effort the same observation lets us transfer our results to finite words, infinite words, and finite trees.

Missing proofs can be found in the appendix.

Related Work.

Separability has been intensively studied in formal language theory. A seminal result in this area is that separability of regular word languages by a first-order language is decidable in ExpTime [17]. Since μ𝖬𝖫𝜇𝖬𝖫\mu\mathsf{ML}italic_μ sansserif_ML over words defines precisely the regular languages and first-order logic captures 𝖬𝖫𝖬𝖫\mathsf{ML}sansserif_ML, this is particularly related to our results over words.

In logic, a recent work investigates the complexity of separating between formulae supporting counting quantifiers by formulae that do not support them [12]. The used techniques exploit compactness, which makes them inapplicable to our case and inherently non-constructive.

Another related problem is the question of interpolant existence. An interpolant of two \mathcal{L}caligraphic_L-formulae φ𝜑\varphiitalic_φ and φsuperscript𝜑\varphi^{\prime}italic_φ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT is an \mathcal{L}caligraphic_L-formula ψ𝜓\psiitalic_ψ with φψφmodels𝜑𝜓modelssuperscript𝜑\varphi\models\psi\models\varphi^{\prime}italic_φ ⊧ italic_ψ ⊧ italic_φ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT and such that the signature of ψ𝜓\psiitalic_ψ is contained in the signatures of both φ𝜑\varphiitalic_φ and φsuperscript𝜑\varphi^{\prime}italic_φ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT. Thus, the problem resembles separability but the restriction on ψ𝜓\psiitalic_ψ is in terms of the signature instead of in terms of the logic. Sometimes this question reduces to entailment, as many logics enjoy the Craig interpolation property: an interpolant of φ𝜑\varphiitalic_φ and φsuperscript𝜑\varphi^{\prime}italic_φ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT exists whenever φφmodels𝜑superscript𝜑\varphi\models\varphi^{\prime}italic_φ ⊧ italic_φ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT. Interpolant existence for logics that lack Craig interpolation has recently been studied in [10, 1]. The used tools, however, are similar in nature to the ones from [12] and therefore inapplicable to our problem.

Finally, a related problem is separability of data examples. There, the task is to separate sets of pointed structures instead of formulae (see [14, 9] and the references therein). Separability of data examples can be cast as an instance of (our logical notion of) separability if +superscript\mathcal{L}^{+}caligraphic_L start_POSTSUPERSCRIPT + end_POSTSUPERSCRIPT is expressive enough to describe the data examples. Conversely, \mathcal{L}caligraphic_L-separability of formulae φ𝜑\varphiitalic_φ and φsuperscript𝜑\varphi^{\prime}italic_φ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT is the same as data separability of the (possibly infinite) sets of their models by an \mathcal{L}caligraphic_L-formula.

2 Preliminaries

Assuming familiarity of the reader with modal logic and the modal μ𝜇\muitalic_μ-calculus, we recall here only the main notions and refer to [3] for more details.

Syntax.

We consider modal logic 𝖬𝖫𝖬𝖫\mathsf{ML}sansserif_ML and its fixpoint extension μ𝖬𝖫𝜇𝖬𝖫\mu\mathsf{ML}italic_μ sansserif_ML over a modal signature consisting of two finite sets: actions 𝖠𝖼𝗍𝖠𝖼𝗍\mathsf{Act}sansserif_Act and propositions 𝖯𝗋𝗈𝗉𝖯𝗋𝗈𝗉\mathsf{Prop}sansserif_Prop. The syntax of 𝖬𝖫𝖬𝖫\mathsf{ML}sansserif_ML is given as:

φ::=||τ|¬τ|φφ|φφ|𝖺φ|[𝖺]φ\varphi::=\top\ |\ \bot\ |\ \tau\ |\ \neg\tau\ |\ \varphi\vee\varphi\ |\ % \varphi\wedge\varphi\ |\ \langle\mathsf{a}\rangle\varphi\ |\ [\mathsf{a}]\varphi\ italic_φ : := ⊤ | ⊥ | italic_τ | ¬ italic_τ | italic_φ ∨ italic_φ | italic_φ ∧ italic_φ | ⟨ sansserif_a ⟩ italic_φ | [ sansserif_a ] italic_φ

with τ𝖯𝗋𝗈𝗉𝜏𝖯𝗋𝗈𝗉\tau\in\mathsf{Prop}italic_τ ∈ sansserif_Prop and 𝖺𝖠𝖼𝗍𝖺𝖠𝖼𝗍\mathsf{a}\in\mathsf{Act}sansserif_a ∈ sansserif_Act. If 𝖠𝖼𝗍={𝖺}𝖠𝖼𝗍𝖺\mathsf{Act}=\{\mathsf{a}\}sansserif_Act = { sansserif_a } is a singleton, we use φ𝜑\Diamond\varphi◇ italic_φ and φ𝜑\Box\varphi□ italic_φ in place of 𝖺φdelimited-⟨⟩𝖺𝜑\langle\mathsf{a}\rangle\varphi⟨ sansserif_a ⟩ italic_φ and [𝖺]φdelimited-[]𝖺𝜑[\mathsf{a}]\varphi[ sansserif_a ] italic_φ. The syntax of μ𝖬𝖫𝜇𝖬𝖫\mu\mathsf{ML}italic_μ sansserif_ML is obtained by extending the above with additional clauses:

φ::=x|μx.φ|νx.φ\varphi::=x\ |\ \mu x.\varphi\ |\ \nu x.\varphiitalic_φ : := italic_x | italic_μ italic_x . italic_φ | italic_ν italic_x . italic_φ

where x𝑥xitalic_x belongs to a fixed set 𝖵𝖺𝗋𝖵𝖺𝗋\mathsf{Var}sansserif_Var of variables. The restriction to a fixed finite signature is only for the sake of readability. All results in the paper remain true with arbitrary signature.

Semantics.

The models we consider are pointed Kripke structures. That is, a model \mathcal{M}caligraphic_M consists of a set M𝑀Mitalic_M (called its universe) with a distinguished point vIMsubscript𝑣𝐼𝑀v_{I}\in Mitalic_v start_POSTSUBSCRIPT italic_I end_POSTSUBSCRIPT ∈ italic_M called the root, an interpretation 𝖺M×M{\stackrel{{\scriptstyle\mathsf{a}}}{{\to}}}\subseteq M\times Mstart_RELOP SUPERSCRIPTOP start_ARG → end_ARG start_ARG sansserif_a end_ARG end_RELOP ⊆ italic_M × italic_M for every 𝖺𝖠𝖼𝗍𝖺𝖠𝖼𝗍\mathsf{a}\in\mathsf{Act}sansserif_a ∈ sansserif_Act and a valuation 𝗏𝖺𝗅:M𝒫(𝖯𝗋𝗈𝗉):𝗏𝖺𝗅𝑀𝒫𝖯𝗋𝗈𝗉\mathsf{val}:M\to\mathcal{P}(\mathsf{Prop})sansserif_val : italic_M → caligraphic_P ( sansserif_Prop ). We call the set 𝒫(𝖯𝗋𝗈𝗉)𝒫𝖯𝗋𝗈𝗉\mathcal{P}(\mathsf{Prop})caligraphic_P ( sansserif_Prop ) colors and denote it by ΣΣ\Sigmaroman_Σ. Both 𝖬𝖫𝖬𝖫\mathsf{ML}sansserif_ML and μ𝖬𝖫𝜇𝖬𝖫\mu\mathsf{ML}italic_μ sansserif_ML are interpreted in points of models in a standard way. Since models are by definition pointed we write φmodels𝜑\mathcal{M}\models\varphicaligraphic_M ⊧ italic_φ meaning that the root of \mathcal{M}caligraphic_M satisfies φ𝜑\varphiitalic_φ. The same symbol denotes entailment: φψmodels𝜑𝜓\varphi\models\psiitalic_φ ⊧ italic_ψ means that every model of φ𝜑\varphiitalic_φ is a model of ψ𝜓\psiitalic_ψ. In the case only models from some fixed class 𝒞𝒞\mathcal{C}caligraphic_C are considered we talk about satisfiability and entailment over 𝒞𝒞\mathcal{C}caligraphic_C and in the latter case write φ𝒞ψsubscriptmodels𝒞𝜑𝜓\varphi\models_{\mathcal{C}}\psiitalic_φ ⊧ start_POSTSUBSCRIPT caligraphic_C end_POSTSUBSCRIPT italic_ψ.

A particularly relevant class of models are trees. A model \mathcal{M}caligraphic_M is a tree if the underlying directed graph (M,{𝖺|𝖺𝖠𝖼𝗍})(M,\bigcup\{\stackrel{{\scriptstyle\mathsf{a}}}{{\to}}\ |\ \mathsf{a}\in% \mathsf{Act}\})( italic_M , ⋃ { start_RELOP SUPERSCRIPTOP start_ARG → end_ARG start_ARG sansserif_a end_ARG end_RELOP | sansserif_a ∈ sansserif_Act } ) is a tree with vIsubscript𝑣𝐼v_{I}italic_v start_POSTSUBSCRIPT italic_I end_POSTSUBSCRIPT as its root. The branching or outdegree of a point is the number of its children in this underlying graph. The class of all trees is denoted by 𝖳𝗋𝖾𝖾𝗌𝖳𝗋𝖾𝖾𝗌\mathsf{Trees}sansserif_Trees. We identify words (both finite and infinite) over alphabet ΣΣ\Sigmaroman_Σ with trees over a single action of outdegree at most one. Points of such models are interpreted as positions in the word, the unique accessibility relation represents the successor relation, and the valuation determines the letter at each position. A prefix of a tree is a subset of its universe closed under taking ancestors. When no confusion arises we identify a prefix NM𝑁𝑀N\subseteq Mitalic_N ⊆ italic_M with the induced subtree 𝒩𝒩\mathcal{N}caligraphic_N of \mathcal{M}caligraphic_M that has N𝑁Nitalic_N as its universe. The depth of a point is the distance from the root. The prefix of depth n𝑛nitalic_n (or just n𝑛nitalic_n-prefix) is the set of all points at depth at most n𝑛nitalic_n and is denoted by M|nM_{|_{n}}italic_M start_POSTSUBSCRIPT | start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT end_POSTSUBSCRIPT (and the corresponding subtree by |n\mathcal{M}_{|_{n}}caligraphic_M start_POSTSUBSCRIPT | start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT end_POSTSUBSCRIPT).

We define bisimulations and bisimilarity in a standard way except that in the case of trees for convenience we assume that bisimulations only link points at the same depth. An n𝑛nitalic_n-step bisimulation (or just n𝑛nitalic_n-bisimulation) between trees \mathcal{M}caligraphic_M and 𝒩𝒩\mathcal{N}caligraphic_N is a bisimulation between their n𝑛nitalic_n-prefixes. We denote n𝑛nitalic_n-bisimilarity by nsuperscriptleftrightarroweq𝑛\leftrightarroweq^{n}start_ARROW start_RELOP SUPERSCRIPTOP start_ARG - end_ARG start_ARG ↔ end_ARG end_RELOP end_ARROW start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT.

Size of formulae.

The size of a formula φ𝜑\varphiitalic_φ, denoted |φ|𝜑|\varphi|| italic_φ |, is the number of nodes in its syntax tree. Similarly, its depth is the maximal length of paths in the syntax tree. The depth of a formula should not be confused with its modal depth which is the maximal nesting of modal operators; all formulae of modal depth at most n𝑛nitalic_n are denoted 𝖬𝖫nsuperscript𝖬𝖫𝑛\mathsf{ML}^{n}sansserif_ML start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT.

When we specify formulae in the paper, we use syntactic sugar ΦΦ\bigvee\Phi⋁ roman_Φ, ΦΦ\bigwedge\Phi⋀ roman_Φ, and nabla ΦΦ\nabla\Phi∇ roman_Φ, for finite sets of formulae ΦΦ\Phiroman_Φ. The first two are self-explanatory and allow for higher branching in the syntax tree. The last one, ΦΦ\nabla\Phi∇ roman_Φ, intuitively means that “every formula in ΦΦ\Phiroman_Φ is true in some child and every child satisfies some formula from ΦΦ\Phiroman_Φ” and is an abbreviation for

Φ=φΦφφΦφ.Φsubscript𝜑Φ𝜑subscript𝜑Φ𝜑\displaystyle\textstyle\nabla\Phi=\bigwedge_{\varphi\in\Phi}\Diamond\varphi% \wedge\Box\bigvee_{\varphi\in\Phi}\varphi.∇ roman_Φ = ⋀ start_POSTSUBSCRIPT italic_φ ∈ roman_Φ end_POSTSUBSCRIPT ◇ italic_φ ∧ □ ⋁ start_POSTSUBSCRIPT italic_φ ∈ roman_Φ end_POSTSUBSCRIPT italic_φ . (1)

It is well-known that ΦΦ\bigvee\Phi⋁ roman_Φ and ΦΦ\bigwedge\Phi⋀ roman_Φ can be rewritten into basic syntax under polynomial cost. We also include the colors ΣΣ\Sigmaroman_Σ directly in the syntax: cΣ𝑐Σc\in\Sigmaitalic_c ∈ roman_Σ is a shorthand for the formula {τ,¬τ|τc,τc}conditional-set𝜏superscript𝜏formulae-sequence𝜏𝑐superscript𝜏𝑐\bigwedge\{\tau,\neg\tau^{\prime}\ |\ \tau\in c,\tau^{\prime}\notin c\}⋀ { italic_τ , ¬ italic_τ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT | italic_τ ∈ italic_c , italic_τ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ∉ italic_c }. Rewriting colors increases the size only by a factor linear in |𝖯𝗋𝗈𝗉|𝖯𝗋𝗈𝗉|\mathsf{Prop}|| sansserif_Prop |.

Automata.

Throughout the paper we use automata over tree models of both bounded and arbitrary outdegree. A nondeterministic parity tree automaton (NPTA) is a tuple 𝒜=(Q,Σ,qI,δ,𝗋𝖺𝗇𝗄)𝒜𝑄Σsubscript𝑞𝐼𝛿𝗋𝖺𝗇𝗄\mathcal{A}=(Q,\Sigma,q_{I},\delta,\mathsf{rank})caligraphic_A = ( italic_Q , roman_Σ , italic_q start_POSTSUBSCRIPT italic_I end_POSTSUBSCRIPT , italic_δ , sansserif_rank ) where Q𝑄Qitalic_Q is a finite set of states, qIQsubscript𝑞𝐼𝑄q_{I}\in Qitalic_q start_POSTSUBSCRIPT italic_I end_POSTSUBSCRIPT ∈ italic_Q is the initial state, ΣΣ\Sigmaroman_Σ is the alphabet fixed above, and 𝗋𝖺𝗇𝗄𝗋𝖺𝗇𝗄\mathsf{rank}sansserif_rank assigns each state a priority. The transition function δ𝛿\deltaitalic_δ is of type:

δ:Q×Σ𝒫(𝒫(Q)).:𝛿𝑄Σ𝒫𝒫𝑄\delta:Q\times\Sigma\to\mathcal{P}(\mathcal{P}(Q)).italic_δ : italic_Q × roman_Σ → caligraphic_P ( caligraphic_P ( italic_Q ) ) .

Intuitively, δ(q,c)={S1,,Sl}𝛿𝑞𝑐subscript𝑆1subscript𝑆𝑙\delta(q,c)=\{S_{1},...,S_{l}\}italic_δ ( italic_q , italic_c ) = { italic_S start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , … , italic_S start_POSTSUBSCRIPT italic_l end_POSTSUBSCRIPT } means that in the state q𝑞qitalic_q upon reading color c𝑐citalic_c the automaton (i) chooses a transition Sisubscript𝑆𝑖S_{i}italic_S start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT and (ii) labels all the children of the current point with states from Sisubscript𝑆𝑖S_{i}italic_S start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT so that every pSi𝑝subscript𝑆𝑖p\in S_{i}italic_p ∈ italic_S start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT is assigned to some child. A run of 𝒜𝒜\mathcal{A}caligraphic_A on a tree \mathcal{M}caligraphic_M is an assignment ρ:MQ:𝜌𝑀𝑄\rho:M\to Qitalic_ρ : italic_M → italic_Q consistent with δ𝛿\deltaitalic_δ in such sense and sending the root of the tree to qIsubscript𝑞𝐼q_{I}italic_q start_POSTSUBSCRIPT italic_I end_POSTSUBSCRIPT. The run is accepting if for every infinite path v0,v1subscript𝑣0subscript𝑣1v_{0},v_{1}\ldotsitalic_v start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT , italic_v start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT … in \mathcal{M}caligraphic_M the sequence 𝗋𝖺𝗇𝗄(ρ(v0)),𝗋𝖺𝗇𝗄(ρ(v1)),𝗋𝖺𝗇𝗄𝜌subscript𝑣0𝗋𝖺𝗇𝗄𝜌subscript𝑣1\mathsf{rank}(\rho(v_{0})),\mathsf{rank}(\rho(v_{1})),\ldotssansserif_rank ( italic_ρ ( italic_v start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ) ) , sansserif_rank ( italic_ρ ( italic_v start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ) ) , … satisfies the parity condition. We write 𝒜models𝒜\mathcal{M}\models\mathcal{A}caligraphic_M ⊧ caligraphic_A in case 𝒜𝒜\mathcal{A}caligraphic_A has an accepting run on \mathcal{M}caligraphic_M. An automaton that is identical to 𝒜𝒜\mathcal{A}caligraphic_A except that the original initial state is replaced with q𝑞qitalic_q is denoted 𝒜[qIq]𝒜delimited-[]subscript𝑞𝐼𝑞\mathcal{A}[q_{I}\mapsfrom q]caligraphic_A [ italic_q start_POSTSUBSCRIPT italic_I end_POSTSUBSCRIPT ↤ italic_q ]. We refer with NPWA to an NPTA working over words.

In NPTAs over trees of bounded outdegree k𝑘kitalic_k it might be more common to use a transition function of type δ:Q×Σ𝒫(Qk):𝛿𝑄Σ𝒫superscript𝑄𝑘\delta:Q\times\Sigma\to\mathcal{P}(Q^{k})italic_δ : italic_Q × roman_Σ → caligraphic_P ( italic_Q start_POSTSUPERSCRIPT italic_k end_POSTSUPERSCRIPT ), but the difference is not essential: our NPTAs can be represented in this way and conversely, all relevant constructions for such NPTAs can be adapted to our setting. Most importantly, we rely on the following classical result (see for example the discussion in [21] and the well-presented Dealternation Theorem 5.7 in [4]):

Theorem 1.

For every μ𝖬𝖫𝜇𝖬𝖫\mu\mathsf{ML}italic_μ sansserif_ML-formula φ𝜑\varphiitalic_φ, we can construct an equivalent NPTA 𝒜𝒜\mathcal{A}caligraphic_A, that is, φmodels𝜑\mathcal{M}\models\varphicaligraphic_M ⊧ italic_φ iff 𝒜models𝒜\mathcal{M}\models\mathcal{A}caligraphic_M ⊧ caligraphic_A, for every tree \mathcal{M}caligraphic_M, with number of states at most exponential in |φ|𝜑|\varphi|| italic_φ |. If we consider models of bounded outdegree k𝑘kitalic_k then 𝒜𝒜\mathcal{A}caligraphic_A is computed in exponential time, otherwise in doubly exponential time.

3 Foundations of Separability

We start with recalling the notion of separability and discuss some of its basic properties.

Definition 1.

Given φ,φμ𝖬𝖫𝜑superscript𝜑𝜇𝖬𝖫\varphi,\varphi^{\prime}\in\mu\mathsf{ML}italic_φ , italic_φ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ∈ italic_μ sansserif_ML, a modal separator of φ,φ𝜑superscript𝜑\varphi,\varphi^{\prime}italic_φ , italic_φ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT is ψ𝖬𝖫𝜓𝖬𝖫\psi\in\mathsf{ML}italic_ψ ∈ sansserif_ML with φψmodels𝜑𝜓\varphi\models\psiitalic_φ ⊧ italic_ψ and ψ¬φmodels𝜓superscript𝜑\psi\models\neg\varphi^{\prime}italic_ψ ⊧ ¬ italic_φ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT. It is a modal separator over a class 𝒞𝒞\mathcal{C}caligraphic_C if φ𝒞ψsubscriptmodels𝒞𝜑𝜓\varphi\models_{\mathcal{C}}\psiitalic_φ ⊧ start_POSTSUBSCRIPT caligraphic_C end_POSTSUBSCRIPT italic_ψ and ψ𝒞¬φsubscriptmodels𝒞𝜓superscript𝜑\psi\models_{\mathcal{C}}\neg\varphi^{\prime}italic_ψ ⊧ start_POSTSUBSCRIPT caligraphic_C end_POSTSUBSCRIPT ¬ italic_φ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT.

The notion induces the problem of modal separability: given two μ𝖬𝖫𝜇𝖬𝖫\mu\mathsf{ML}italic_μ sansserif_ML-formulae φ,φ𝜑superscript𝜑\varphi,\varphi^{\prime}italic_φ , italic_φ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT, decide whether a modal separator exists. Clearly, 𝖬𝖫𝖬𝖫\mathsf{ML}sansserif_ML-definability of φ𝜑\varphiitalic_φ or φsuperscript𝜑\varphi^{\prime}italic_φ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT is a sufficient condition for the existence of a modal separator between φ,φ𝜑superscript𝜑\varphi,\varphi^{\prime}italic_φ , italic_φ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT. However, Example 1 shows that it is not a necessary one: neither φ𝜑\varphiitalic_φ nor φsuperscript𝜑\varphi^{\prime}italic_φ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT are 𝖬𝖫𝖬𝖫\mathsf{ML}sansserif_ML-definable, yet a separator exists. We make some foundational observations.

Inspired by the notion of Craig interpolation, one could also consider the notion of a Craig modal separator, which is a modal separator ψ𝜓\psiitalic_ψ of φ,φ𝜑superscript𝜑\varphi,\varphi^{\prime}italic_φ , italic_φ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT which only uses symbols occurring in both φ𝜑\varphiitalic_φ and φsuperscript𝜑\varphi^{\prime}italic_φ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT. However, based on the fact that 𝖬𝖫𝖬𝖫\mathsf{ML}sansserif_ML enjoys Craig interpolation, we show in Theorem 2 (proof in Appendix A) that Craig modal separability and modal separability coincide. Since 𝖬𝖫𝖬𝖫\mathsf{ML}sansserif_ML enjoys Craig interpolation over many classes of models [15, Theorem 1], Theorem 2 remains true over all classes of models considered below. We thus focus on modal separability.

Theorem 2.

φ,φμ𝖬𝖫𝜑superscript𝜑𝜇𝖬𝖫\varphi,\varphi^{\prime}\in\mu\mathsf{ML}italic_φ , italic_φ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ∈ italic_μ sansserif_ML admit a modal separator iff they admit a Craig modal separator.

Inspired by the notion of uniform interpolation [23, 5], it is natural to ask whether every φμ𝖬𝖫𝜑𝜇𝖬𝖫\varphi\in\mu\mathsf{ML}italic_φ ∈ italic_μ sansserif_ML admits a uniform modal separator, that is, a formula ψ𝖬𝖫𝜓𝖬𝖫\psi\in\mathsf{ML}italic_ψ ∈ sansserif_ML that is a modal separator of φ,φ𝜑superscript𝜑\varphi,\varphi^{\prime}italic_φ , italic_φ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT for every φμ𝖬𝖫superscript𝜑𝜇𝖬𝖫\varphi^{\prime}\in\mu\mathsf{ML}italic_φ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ∈ italic_μ sansserif_ML with φ¬φmodels𝜑superscript𝜑\varphi\models\neg\varphi^{\prime}italic_φ ⊧ ¬ italic_φ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT. However, substituting ¬φ𝜑\neg\varphi¬ italic_φ for φsuperscript𝜑\varphi^{\prime}italic_φ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT we get that the uniform modal separator ψ𝜓\psiitalic_ψ for φ𝜑\varphiitalic_φ is actually equivalent to φ𝜑\varphiitalic_φ. Consequently, a μ𝖬𝖫𝜇𝖬𝖫\mu\mathsf{ML}italic_μ sansserif_ML-formula has a uniform modal separator iff it is modally definable. This is contrast with the fact that both 𝖬𝖫𝖬𝖫\mathsf{ML}sansserif_ML [23] and μ𝖬𝖫𝜇𝖬𝖫\mu\mathsf{ML}italic_μ sansserif_ML [5] enjoy uniform interpolation.

Since μ𝖬𝖫𝜇𝖬𝖫\mu\mathsf{ML}italic_μ sansserif_ML has both the finite model property and the (finitely branching) tree model property, the notions of a modal separator over finite models, arbitrary tree models, and finitely branching tree models all coincide with modal separator (over arbitrary models). Unsurprisingly, this does not apply to the class of all finite trees.

Example 2.

Consider a μ𝖬𝖫𝜇𝖬𝖫\mu\mathsf{ML}italic_μ sansserif_ML-formula φ=νx.xformulae-sequencesubscript𝜑𝜈𝑥𝑥\varphi_{\infty}=\nu x.\Diamond xitalic_φ start_POSTSUBSCRIPT ∞ end_POSTSUBSCRIPT = italic_ν italic_x . ◇ italic_x expressing that there exists an infinite path originating in the root. It is satisfiable, but unsatisfiable over finite trees. Thus bottom\bot is an 𝖬𝖫𝖬𝖫\mathsf{ML}sansserif_ML-definition of φsubscript𝜑\varphi_{\infty}italic_φ start_POSTSUBSCRIPT ∞ end_POSTSUBSCRIPT over finite trees, but φsubscript𝜑\varphi_{\infty}italic_φ start_POSTSUBSCRIPT ∞ end_POSTSUBSCRIPT is not 𝖬𝖫𝖬𝖫\mathsf{ML}sansserif_ML-definable (over arbitrary models).

We deal with separability over finite trees as follows. Call a class 𝒞𝒞\mathcal{C}caligraphic_C of models μ𝖬𝖫𝜇𝖬𝖫\mu\mathsf{ML}italic_μ sansserif_ML-definable in 𝒟𝒟\mathcal{D}caligraphic_D if there is a μ𝖬𝖫𝜇𝖬𝖫\mu\mathsf{ML}italic_μ sansserif_ML-formula θ𝜃\thetaitalic_θ such that 𝒞𝒞\mathcal{M}\in\mathcal{C}caligraphic_M ∈ caligraphic_C iff θmodels𝜃\mathcal{M}\models\thetacaligraphic_M ⊧ italic_θ, for all models 𝒟𝒟\mathcal{M}\in\mathcal{D}caligraphic_M ∈ caligraphic_D.

Lemma 1.

Let 𝒞𝒞\mathcal{C}caligraphic_C be μ𝖬𝖫𝜇𝖬𝖫\mu\mathsf{ML}italic_μ sansserif_ML-definable in 𝒟𝒟\mathcal{D}caligraphic_D by θ𝜃\thetaitalic_θ and let ψ𝖬𝖫𝜓𝖬𝖫\psi\in\mathsf{ML}italic_ψ ∈ sansserif_ML. Then ψ𝜓\psiitalic_ψ is a modal separator of φ,φμ𝖬𝖫𝜑superscript𝜑𝜇𝖬𝖫\varphi,\varphi^{\prime}\in\mu\mathsf{ML}italic_φ , italic_φ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ∈ italic_μ sansserif_ML over 𝒞𝒞\mathcal{C}caligraphic_C iff ψ𝜓\psiitalic_ψ is a modal separator of θφ𝜃𝜑\theta\wedge\varphiitalic_θ ∧ italic_φ and θφ𝜃superscript𝜑\theta\wedge\varphi^{\prime}italic_θ ∧ italic_φ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT over 𝒟𝒟\mathcal{D}caligraphic_D.

Intuitively, Lemma 1 provides us with a reduction of modal separability over 𝒞𝒞\mathcal{C}caligraphic_C to modal separability over (the larger) 𝒟𝒟\mathcal{D}caligraphic_D. It has a number of interesting consequences. First, observe that the formula ¬φsubscript𝜑\neg\varphi_{\infty}¬ italic_φ start_POSTSUBSCRIPT ∞ end_POSTSUBSCRIPT from Example 2 defines the class of finite trees in the class of all finitely branching trees. Hence ¬φsubscript𝜑\neg\varphi_{\infty}¬ italic_φ start_POSTSUBSCRIPT ∞ end_POSTSUBSCRIPT provides a reduction of modal separability over finite trees to modal separability over finitely branching trees, and thus to modal separability. Similarly, and again using φsubscript𝜑\varphi_{\infty}italic_φ start_POSTSUBSCRIPT ∞ end_POSTSUBSCRIPT, Lemma 1 reduces modal separability over finite words and over infinite words to modal separability over (arbitrary) words. Finally, the lemma can be used to reduce modal separability relative to background knowledge to modal separability. Call ψ𝜓\psiitalic_ψ a modal separator of φ,φμ𝖬𝖫𝜑superscript𝜑𝜇𝖬𝖫\varphi,\varphi^{\prime}\in\mu\mathsf{ML}italic_φ , italic_φ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ∈ italic_μ sansserif_ML relative to θ0μ𝖬𝖫subscript𝜃0𝜇𝖬𝖫\theta_{0}\in\mu\mathsf{ML}italic_θ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ∈ italic_μ sansserif_ML if it is a modal separator of φ,φ𝜑superscript𝜑\varphi,\varphi^{\prime}italic_φ , italic_φ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT over the class of models satisfying θ0subscript𝜃0\theta_{0}italic_θ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT in every point. This setting is most relevant for the DL community since θ0subscript𝜃0\theta_{0}italic_θ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT plays the role of an ontology. In particular, the question whether two 𝒜𝒞reg𝒜subscript𝒞reg{\cal ALC}_{\textit{reg}}caligraphic_A caligraphic_L caligraphic_C start_POSTSUBSCRIPT reg end_POSTSUBSCRIPT-concepts φ,φ𝜑superscript𝜑\varphi,\varphi^{\prime}italic_φ , italic_φ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT are separable by an 𝒜𝒞𝒜𝒞{\cal ALC}caligraphic_A caligraphic_L caligraphic_C-concept relative to an 𝒜𝒞reg𝒜subscript𝒞reg{\cal ALC}_{\textit{reg}}caligraphic_A caligraphic_L caligraphic_C start_POSTSUBSCRIPT reg end_POSTSUBSCRIPT-ontology is an instance of that problem (recall that every 𝒜𝒞reg𝒜subscript𝒞reg{\cal ALC}_{\textit{reg}}caligraphic_A caligraphic_L caligraphic_C start_POSTSUBSCRIPT reg end_POSTSUBSCRIPT-concept can be expressed as a μ𝖬𝖫𝜇𝖬𝖫\mu\mathsf{ML}italic_μ sansserif_ML-formula). Let θ𝜃\thetaitalic_θ be the μ𝖬𝖫𝜇𝖬𝖫\mu\mathsf{ML}italic_μ sansserif_ML-formula expressing that θ0subscript𝜃0\theta_{0}italic_θ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT is satisfied in every point reachable via the accessibility relations. Using Lemma 1 and bisimulation invariance of μ𝖬𝖫𝜇𝖬𝖫\mu\mathsf{ML}italic_μ sansserif_ML, it is routine to verify that ψ𝜓\psiitalic_ψ is a modal separator of φ,φ𝜑superscript𝜑\varphi,\varphi^{\prime}italic_φ , italic_φ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT relative to θ0subscript𝜃0\theta_{0}italic_θ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT iff ψ𝜓\psiitalic_ψ is a modal separator of θφ𝜃𝜑\theta\wedge\varphiitalic_θ ∧ italic_φ and θφ𝜃superscript𝜑\theta\wedge\varphi^{\prime}italic_θ ∧ italic_φ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT.

In view of what was said so far, we will from now on concentrate on deciding modal separability over general and word models and computing separators if they exist. A main ingredient for both tasks is to show that if there is a modal separator for μ𝖬𝖫𝜇𝖬𝖫\mu\mathsf{ML}italic_μ sansserif_ML-formula φ,φ𝜑superscript𝜑\varphi,\varphi^{\prime}italic_φ , italic_φ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT, then there is one of modal depth n𝑛nitalic_n at most exponential in the size of φ𝜑\varphiitalic_φ and φsuperscript𝜑\varphi^{\prime}italic_φ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT. As a necessary tool for showing this exponential bound on n𝑛nitalic_n, and for efficiently deciding if a given n𝑛nitalic_n suffices, in Appendix B we establish the following model-theoretic characterization. Fix φ,φμ𝖬𝖫𝜑superscript𝜑𝜇𝖬𝖫\varphi,\varphi^{\prime}\in\mu\mathsf{ML}italic_φ , italic_φ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ∈ italic_μ sansserif_ML for the rest of the paper and denote their size by k=|φ|+|φ|𝑘𝜑superscript𝜑k=|\varphi|+|\varphi^{\prime}|italic_k = | italic_φ | + | italic_φ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT |.

Proposition 1.

Let n𝑛n\in\mathbb{N}italic_n ∈ blackboard_N. The following are equivalent:

  1. (i)

    There is ψ𝖬𝖫𝜓𝖬𝖫\psi\in\mathsf{ML}italic_ψ ∈ sansserif_ML of modal depth n𝑛nitalic_n separating φ𝜑\varphiitalic_φ and φsuperscript𝜑\varphi^{\prime}italic_φ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT;

  2. (ii)

    For all models \mathcal{M}caligraphic_M and superscript\mathcal{M}^{\prime}caligraphic_M start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT bisimilar up to depth n𝑛nitalic_n: φmodels𝜑\mathcal{M}\models\varphicaligraphic_M ⊧ italic_φ implies ⊧̸φnot-modelssuperscriptsuperscript𝜑\mathcal{M}^{\prime}\not\models\varphi^{\prime}caligraphic_M start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ⊧̸ italic_φ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT;

  3. (iii)

    For all trees \mathcal{M}caligraphic_M and superscript\mathcal{M}^{\prime}caligraphic_M start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT identical up to depth n𝑛nitalic_n: φmodels𝜑\mathcal{M}\models\varphicaligraphic_M ⊧ italic_φ implies ⊧̸φnot-modelssuperscriptsuperscript𝜑\mathcal{M}^{\prime}\not\models\varphi^{\prime}caligraphic_M start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ⊧̸ italic_φ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT;

  4. (iv)

    For all trees \mathcal{M}caligraphic_M and superscript\mathcal{M}^{\prime}caligraphic_M start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT identical up to depth n𝑛nitalic_n and whose branching is bounded by k𝑘kitalic_k:
    φmodels𝜑\mathcal{M}\models\varphicaligraphic_M ⊧ italic_φ implies ⊧̸φnot-modelssuperscriptsuperscript𝜑\mathcal{M}^{\prime}\not\models\varphi^{\prime}caligraphic_M start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ⊧̸ italic_φ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT.

Based on Proposition 1, we show that 𝖬𝖫𝖬𝖫\mathsf{ML}sansserif_ML-separability of μ𝖬𝖫𝜇𝖬𝖫\mu\mathsf{ML}italic_μ sansserif_ML-formulae is ExpTime-complete and thus not harder than 𝖬𝖫𝖬𝖫\mathsf{ML}sansserif_ML-definability.

Theorem 3.

Modal separability of μ𝖬𝖫𝜇𝖬𝖫\mu\mathsf{ML}italic_μ sansserif_ML-formulae is ExpTime-complete over arbitrary models.

ExpTime-hardness already holds for 𝖬𝖫𝖬𝖫\mathsf{ML}sansserif_ML-definability and is proved by an immediate reduction from μ𝖬𝖫𝜇𝖬𝖫\mu\mathsf{ML}italic_μ sansserif_ML-satisfiability, which is ExpTime-complete already for its fragment 𝖯𝖣𝖫𝖯𝖣𝖫\mathsf{PDL}sansserif_PDL [6, Section 4]. It is not hard to modify the original hardness proof for 𝖯𝖣𝖫𝖯𝖣𝖫\mathsf{PDL}sansserif_PDL-satisfiability to work over finite trees, so Theorem 3 remains valid over finite trees as well. For the upper bound, we mostly follow the technical development in [16]. Thanks to Proposition 1 separability is equivalent to the existence of n𝑛n\in\mathbb{N}italic_n ∈ blackboard_N for which condition (iv) holds. This can be expressed as an 𝖬𝖲𝖮𝖬𝖲𝖮{\sf MSO}sansserif_MSO statement about the full k𝑘kitalic_k-ary tree, and thus decided. However, for optimal complexity and to extract bounds that we use later, in Appendix C we apply a lower-level automata-theoretic analysis.

Over words, we essentially follow the same approach. Since the tree automata used in the proof of Theorem 3 can be replaced by word automata, the complexity drops to PSpace (see Appendix D). A matching lower bound can be derived as above by a reduction from satisfiability in 𝖫𝖳𝖫𝖫𝖳𝖫\mathsf{LTL}sansserif_LTL over words [20, Theorem 4.1] (which, in fact, can be rephrased in terms of 𝖯𝖣𝖫𝖯𝖣𝖫\mathsf{PDL}sansserif_PDL).

Theorem 4.

Modal separability of μ𝖬𝖫𝜇𝖬𝖫\mu\mathsf{ML}italic_μ sansserif_ML-formulae is PSpace-complete over words.

As announced, an important step in the proofs of the upper bounds, both in the case with arbitrary models and with words, is the following proposition which we will also use later.

Proposition 2.

If φ,φμ𝖬𝖫𝜑superscript𝜑𝜇𝖬𝖫\varphi,\varphi^{\prime}\in\mu\mathsf{ML}italic_φ , italic_φ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ∈ italic_μ sansserif_ML are separable then they are separable by a formula of modal depth l𝑙litalic_l exponential in their size k𝑘kitalic_k. The same is true over words.

In the remainder of the paper we will deal with computing separators based on Proposition 2. Before we proceed let note that our approach differs from the treatment of modal definability from [13]. There, the authors rewrite given φ𝜑\varphiitalic_φ into modal ψ𝜓\psiitalic_ψ in such a way that if the initial φ𝜑\varphiitalic_φ is modally definable then φ𝜑\varphiitalic_φ and ψ𝜓\psiitalic_ψ are equivalent. In the case when φ𝜑\varphiitalic_φ is not modally definable, however, the output ψ𝜓\psiitalic_ψ is rather random. For example, ψ𝜓\psiitalic_ψ obtained from the formula φsubscript𝜑\varphi_{\infty}italic_φ start_POSTSUBSCRIPT ∞ end_POSTSUBSCRIPT from Example 2 is equivalent to bottom\bot which is not even a consequence of φsubscript𝜑\varphi_{\infty}italic_φ start_POSTSUBSCRIPT ∞ end_POSTSUBSCRIPT. Thus, a different construction is needed to obtain separators. We will actually compute something slightly stronger that might be of independent interest.

Definition 2.

Given φμ𝖬𝖫𝜑𝜇𝖬𝖫\varphi\in\mu\mathsf{ML}italic_φ ∈ italic_μ sansserif_ML and n𝑛n\in\mathbb{N}italic_n ∈ blackboard_N, a formula ψ𝖬𝖫n𝜓superscript𝖬𝖫𝑛\psi\in\mathsf{ML}^{n}italic_ψ ∈ sansserif_ML start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT is an n𝑛nitalic_n-uniform consequence of φ𝜑\varphiitalic_φ if, for all θ𝖬𝖫n𝜃superscript𝖬𝖫𝑛\theta\in\mathsf{ML}^{n}italic_θ ∈ sansserif_ML start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT:

φθψθformulae-sequencemodels𝜑𝜃iffmodels𝜓𝜃\varphi\models\theta\hskip 14.22636pt\iff\hskip 14.22636pt\psi\models\thetaitalic_φ ⊧ italic_θ ⇔ italic_ψ ⊧ italic_θ

An analogous notion relative to a fixed class 𝒞𝒞\mathcal{C}caligraphic_C of models is obtained by replacing models\models with 𝒞subscriptmodels𝒞\models_{\mathcal{C}}⊧ start_POSTSUBSCRIPT caligraphic_C end_POSTSUBSCRIPT.

In words: ψ𝜓\psiitalic_ψ is an n𝑛nitalic_n-uniform consequence of φ𝜑\varphiitalic_φ if it has modal depth n𝑛nitalic_n, is a consequence of φ𝜑\varphiitalic_φ, and entails every other consequence of φ𝜑\varphiitalic_φ of modal depth n𝑛nitalic_n. In particular, if φ𝜑\varphiitalic_φ and φsuperscript𝜑\varphi^{\prime}italic_φ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT are separable by some modal formula of modal depth n𝑛nitalic_n and ψ𝜓\psiitalic_ψ is an n𝑛nitalic_n-uniform consequence of φ𝜑\varphiitalic_φ, then this ψ𝜓\psiitalic_ψ separates φ𝜑\varphiitalic_φ from φsuperscript𝜑\varphi^{\prime}italic_φ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT as well. Observe that n𝑛nitalic_n-uniform consequences exist for every φμ𝖬𝖫𝜑𝜇𝖬𝖫\varphi\in\mu\mathsf{ML}italic_φ ∈ italic_μ sansserif_ML and n𝑛n\in\mathbb{N}italic_n ∈ blackboard_N. Indeed, given φ𝜑\varphiitalic_φ and n𝑛nitalic_n we can obtain an n𝑛nitalic_n-uniform consequence ψ𝜓\psiitalic_ψ of φ𝜑\varphiitalic_φ by taking the disjunction of all 𝖬𝖫nsuperscript𝖬𝖫𝑛\mathsf{ML}^{n}sansserif_ML start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT-types consistent with φ𝜑\varphiitalic_φ. Here, by an 𝖬𝖫nsuperscript𝖬𝖫𝑛\mathsf{ML}^{n}sansserif_ML start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT-type we mean a maximal consistent subset of 𝖬𝖫nsuperscript𝖬𝖫𝑛\mathsf{ML}^{n}sansserif_ML start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT. Since up to equivalence there are only finitely many formulae in 𝖬𝖫nsuperscript𝖬𝖫𝑛\mathsf{ML}^{n}sansserif_ML start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT, each 𝖬𝖫nsuperscript𝖬𝖫𝑛\mathsf{ML}^{n}sansserif_ML start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT-type can be represented as a single 𝖬𝖫nsuperscript𝖬𝖫𝑛\mathsf{ML}^{n}sansserif_ML start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT-formula and the mentioned disjunction ψ𝜓\psiitalic_ψ is well-defined.

In view of Proposition 2, it thus suffices to compute n𝑛nitalic_n-uniform consequences of φ𝜑\varphiitalic_φ. Unfortunately, the naive construction given above is nonelementary in the size of the separated formulae φ𝜑\varphiitalic_φ and φsuperscript𝜑\varphi^{\prime}italic_φ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT. In the next sections we give better constructions.

4 Optimal Separators: Arbitrary Models

We construct doubly exponentially sized separators and provide matching lower bounds.

4.1 Construction

Theorem 5.

If φ𝜑\varphiitalic_φ and φsuperscript𝜑\varphi^{\prime}italic_φ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT are modally separable then a separator φ𝜑\varphiitalic_φ of size doubly exponential in k=|φ|+|φ|𝑘𝜑superscript𝜑k=|\varphi|+|\varphi^{\prime}|italic_k = | italic_φ | + | italic_φ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT | exists and can be computed in doubly exponential time.

The above is a consequence of the following lemma.

Lemma 2.

For every φμ𝖬𝖫𝜑𝜇𝖬𝖫\varphi\in\mu\mathsf{ML}italic_φ ∈ italic_μ sansserif_ML and n𝑛n\in\mathbb{N}italic_n ∈ blackboard_N, one can construct an n𝑛nitalic_n-uniform consequence ψn𝖬𝖫nsubscript𝜓𝑛superscript𝖬𝖫𝑛\psi_{n}\in\mathsf{ML}^{n}italic_ψ start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT ∈ sansserif_ML start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT of φ𝜑\varphiitalic_φ with branching doubly exponential in |φ|𝜑|\varphi|| italic_φ | and depth linear in n𝑛nitalic_n.

We show how Theorem 5 follows from Lemma 2. Proposition 2 guarantees that if a modal separator for φ𝜑\varphiitalic_φ and φsuperscript𝜑\varphi^{\prime}italic_φ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT exists then there is one with modal depth l𝑙litalic_l exponential in k𝑘kitalic_k. Since ψlsubscript𝜓𝑙\psi_{l}italic_ψ start_POSTSUBSCRIPT italic_l end_POSTSUBSCRIPT entails this separator it follows that ψlsubscript𝜓𝑙\psi_{l}italic_ψ start_POSTSUBSCRIPT italic_l end_POSTSUBSCRIPT is a separator itself.

The branching m𝑚mitalic_m of ψlsubscript𝜓𝑙\psi_{l}italic_ψ start_POSTSUBSCRIPT italic_l end_POSTSUBSCRIPT is at most doubly exponential in |φ|𝜑|\varphi|| italic_φ | and thus also in k𝑘kitalic_k: m22kx𝑚superscript2superscript2superscript𝑘𝑥m\leq 2^{2^{k^{x}}}italic_m ≤ 2 start_POSTSUPERSCRIPT 2 start_POSTSUPERSCRIPT italic_k start_POSTSUPERSCRIPT italic_x end_POSTSUPERSCRIPT end_POSTSUPERSCRIPT end_POSTSUPERSCRIPT for some constant x𝑥xitalic_x. The depth d𝑑ditalic_d of ψlsubscript𝜓𝑙\psi_{l}italic_ψ start_POSTSUBSCRIPT italic_l end_POSTSUBSCRIPT is linear in l𝑙litalic_l and therefore d2ky𝑑superscript2superscript𝑘𝑦d\leq 2^{k^{y}}italic_d ≤ 2 start_POSTSUPERSCRIPT italic_k start_POSTSUPERSCRIPT italic_y end_POSTSUPERSCRIPT end_POSTSUPERSCRIPT for some y𝑦yitalic_y. Altogether this means that the size of ψlsubscript𝜓𝑙\psi_{l}italic_ψ start_POSTSUBSCRIPT italic_l end_POSTSUBSCRIPT:

|ψl|md(22kx)2kysubscript𝜓𝑙superscript𝑚𝑑superscriptsuperscript2superscript2superscript𝑘𝑥superscript2superscript𝑘𝑦|\psi_{l}|\leq m^{d}\leq(2^{2^{k^{x}}})^{2^{k^{y}}}| italic_ψ start_POSTSUBSCRIPT italic_l end_POSTSUBSCRIPT | ≤ italic_m start_POSTSUPERSCRIPT italic_d end_POSTSUPERSCRIPT ≤ ( 2 start_POSTSUPERSCRIPT 2 start_POSTSUPERSCRIPT italic_k start_POSTSUPERSCRIPT italic_x end_POSTSUPERSCRIPT end_POSTSUPERSCRIPT end_POSTSUPERSCRIPT ) start_POSTSUPERSCRIPT 2 start_POSTSUPERSCRIPT italic_k start_POSTSUPERSCRIPT italic_y end_POSTSUPERSCRIPT end_POSTSUPERSCRIPT end_POSTSUPERSCRIPT

is at most doubly exponential in k𝑘kitalic_k. It remains to prove Lemma 2.

Proof.

Let 𝒜=(Q,Σ,qI,δ,𝗋𝖺𝗇𝗄)𝒜𝑄Σsubscript𝑞𝐼𝛿𝗋𝖺𝗇𝗄\mathcal{A}=(Q,\Sigma,q_{I},\delta,\mathsf{rank})caligraphic_A = ( italic_Q , roman_Σ , italic_q start_POSTSUBSCRIPT italic_I end_POSTSUBSCRIPT , italic_δ , sansserif_rank ) be the NPTA equivalent to φ𝜑\varphiitalic_φ with exponentially many states, which exists due to Theorem 1. For each n𝑛n\in\mathbb{N}italic_n ∈ blackboard_N and qQ𝑞𝑄q\in Qitalic_q ∈ italic_Q we construct ψn,q𝖬𝖫nsubscript𝜓𝑛𝑞superscript𝖬𝖫𝑛\psi_{n,q}\in\mathsf{ML}^{n}italic_ψ start_POSTSUBSCRIPT italic_n , italic_q end_POSTSUBSCRIPT ∈ sansserif_ML start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT of branching 22|Q|superscript2superscript2𝑄2^{2^{|Q|}}2 start_POSTSUPERSCRIPT 2 start_POSTSUPERSCRIPT | italic_Q | end_POSTSUPERSCRIPT end_POSTSUPERSCRIPT such that:

ψn,q  there exists 𝒩𝒜[qIq] with n𝒩modelssubscript𝜓𝑛𝑞  there exists 𝒩𝒜[qIq] with n𝒩\displaystyle\mathcal{M}\models\psi_{n,q}\text{\hskip 14.22636pt $\iff$ \hskip 1% 4.22636pt there exists $\mathcal{N}\models\mathcal{A}[q_{I}\mapsfrom q]$ with % $\mathcal{M}\leftrightarroweq^{n}\mathcal{N}$}caligraphic_M ⊧ italic_ψ start_POSTSUBSCRIPT italic_n , italic_q end_POSTSUBSCRIPT ⇔ there exists caligraphic_N ⊧ caligraphic_A [ italic_q start_POSTSUBSCRIPT italic_I end_POSTSUBSCRIPT ↤ italic_q ] with caligraphic_M start_ARROW start_RELOP SUPERSCRIPTOP start_ARG - end_ARG start_ARG ↔ end_ARG end_RELOP end_ARROW start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT caligraphic_N (2)

for every structure \mathcal{M}caligraphic_M. Then, ψn,qIsubscript𝜓𝑛subscript𝑞𝐼\psi_{n,q_{I}}italic_ψ start_POSTSUBSCRIPT italic_n , italic_q start_POSTSUBSCRIPT italic_I end_POSTSUBSCRIPT end_POSTSUBSCRIPT is our desired n𝑛nitalic_n-uniform consequence ψnsubscript𝜓𝑛\psi_{n}italic_ψ start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT of φ𝜑\varphiitalic_φ.

We proceed by induction on n𝑛n\in\mathbb{N}italic_n ∈ blackboard_N. For the base case we put:

ψ0,q={cΣ|there is 𝒩𝒜[qIq] with 𝒩c}subscript𝜓0𝑞conditional-set𝑐Σthere is 𝒩𝒜[qIq] with 𝒩c\psi_{0,q}=\bigvee\{c\in\Sigma\ |\ \text{there is $\mathcal{N}\models\mathcal{% A}[q_{I}\mapsfrom q]$ with $\mathcal{N}\models c$}\}italic_ψ start_POSTSUBSCRIPT 0 , italic_q end_POSTSUBSCRIPT = ⋁ { italic_c ∈ roman_Σ | there is caligraphic_N ⊧ caligraphic_A [ italic_q start_POSTSUBSCRIPT italic_I end_POSTSUBSCRIPT ↤ italic_q ] with caligraphic_N ⊧ italic_c }

which clearly satisfies the induction goal (2). For the induction step define:

ψn+1,q=cΣSδ(q,c)c{ψn,p|pS}.subscript𝜓𝑛1𝑞subscript𝑐Σsubscript𝑆𝛿𝑞𝑐𝑐conditionalsubscript𝜓𝑛𝑝𝑝𝑆\psi_{n+1,q}=\bigvee_{c\in\Sigma}\bigvee_{S\in\delta(q,c)}c\wedge\nabla\{\psi_% {n,p}\ |\ p\in S\}.italic_ψ start_POSTSUBSCRIPT italic_n + 1 , italic_q end_POSTSUBSCRIPT = ⋁ start_POSTSUBSCRIPT italic_c ∈ roman_Σ end_POSTSUBSCRIPT ⋁ start_POSTSUBSCRIPT italic_S ∈ italic_δ ( italic_q , italic_c ) end_POSTSUBSCRIPT italic_c ∧ ∇ { italic_ψ start_POSTSUBSCRIPT italic_n , italic_p end_POSTSUBSCRIPT | italic_p ∈ italic_S } .

Fix \mathcal{M}caligraphic_M, denote the color of the root by c𝑐citalic_c and the set of all children of the root by M0subscript𝑀0M_{0}italic_M start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT. If \mathcal{M}caligraphic_M satisfies ψn+1,qsubscript𝜓𝑛1𝑞\psi_{n+1,q}italic_ψ start_POSTSUBSCRIPT italic_n + 1 , italic_q end_POSTSUBSCRIPT then there is {p1,,pl}=Sδ(q,c)subscript𝑝1subscript𝑝𝑙𝑆𝛿𝑞𝑐\{p_{1},...,p_{l}\}=S\in\delta(q,c){ italic_p start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , … , italic_p start_POSTSUBSCRIPT italic_l end_POSTSUBSCRIPT } = italic_S ∈ italic_δ ( italic_q , italic_c ) such that nabla of Φ={ψn,p|pS}Φconditional-setsubscript𝜓𝑛𝑝𝑝𝑆\Phi=\{\psi_{n,p}\ |\ p\in S\}roman_Φ = { italic_ψ start_POSTSUBSCRIPT italic_n , italic_p end_POSTSUBSCRIPT | italic_p ∈ italic_S } is satisfied in the root. By invariance under bisimulation we may assume that the root of \mathcal{M}caligraphic_M has sufficiently many children to find a separate witness for each ψn,pΦsubscript𝜓𝑛𝑝Φ\psi_{n,p}\in\Phiitalic_ψ start_POSTSUBSCRIPT italic_n , italic_p end_POSTSUBSCRIPT ∈ roman_Φ. That is, we assume a surjective assignment h:M0Φ:subscript𝑀0Φh:M_{0}\to\Phiitalic_h : italic_M start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT → roman_Φ that maps every vM0𝑣subscript𝑀0v\in M_{0}italic_v ∈ italic_M start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT to some formula ψn,psubscript𝜓𝑛𝑝\psi_{n,p}italic_ψ start_POSTSUBSCRIPT italic_n , italic_p end_POSTSUBSCRIPT true in v𝑣vitalic_v. By induction hypothesis, for each vM0𝑣subscript𝑀0v\in M_{0}italic_v ∈ italic_M start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT with h(v)=ψn,p𝑣subscript𝜓𝑛𝑝h(v)=\psi_{n,p}italic_h ( italic_v ) = italic_ψ start_POSTSUBSCRIPT italic_n , italic_p end_POSTSUBSCRIPT there is a model 𝒩p𝒜[qIp]modelssubscript𝒩𝑝𝒜delimited-[]subscript𝑞𝐼𝑝\mathcal{N}_{p}\models\mathcal{A}[q_{I}\mapsfrom p]caligraphic_N start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT ⊧ caligraphic_A [ italic_q start_POSTSUBSCRIPT italic_I end_POSTSUBSCRIPT ↤ italic_p ] n𝑛nitalic_n-bisimilar to the subtree of \mathcal{M}caligraphic_M rooted in v𝑣vitalic_v. Define 𝒩𝒩\mathcal{N}caligraphic_N as follows: first take the disjoint union {v}{𝒩p|pS}square-union𝑣conditional-setsubscript𝒩𝑝𝑝𝑆\{v\}\sqcup\mathop{\mathop{\mbox{\bigmathxx\char 116\relax}}}\limits\{\mathcal% {N}_{p}\ |\ p\in S\}{ italic_v } ⊔ ⊔ { caligraphic_N start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT | italic_p ∈ italic_S } of all the 𝒩psubscript𝒩𝑝\mathcal{N}_{p}caligraphic_N start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT’s and a fresh point v𝑣vitalic_v of color c𝑐citalic_c; then for every 𝒩psubscript𝒩𝑝\mathcal{N}_{p}caligraphic_N start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT add an edge from v𝑣vitalic_v to the root of 𝒩psubscript𝒩𝑝\mathcal{N}_{p}caligraphic_N start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT and set v𝑣vitalic_v as the new root. It is easy to see that 𝒩𝒜[qIq]models𝒩𝒜delimited-[]subscript𝑞𝐼𝑞\mathcal{N}\models\mathcal{A}[q_{I}\mapsfrom q]caligraphic_N ⊧ caligraphic_A [ italic_q start_POSTSUBSCRIPT italic_I end_POSTSUBSCRIPT ↤ italic_q ] and n+1𝒩superscriptleftrightarroweq𝑛1𝒩\mathcal{M}\leftrightarroweq^{n+1}\mathcal{N}caligraphic_M start_ARROW start_RELOP SUPERSCRIPTOP start_ARG - end_ARG start_ARG ↔ end_ARG end_RELOP end_ARROW start_POSTSUPERSCRIPT italic_n + 1 end_POSTSUPERSCRIPT caligraphic_N, as desired.

Conversely, assume n+1𝒩superscriptleftrightarroweq𝑛1𝒩\mathcal{M}\leftrightarroweq^{n+1}\mathcal{N}caligraphic_M start_ARROW start_RELOP SUPERSCRIPTOP start_ARG - end_ARG start_ARG ↔ end_ARG end_RELOP end_ARROW start_POSTSUPERSCRIPT italic_n + 1 end_POSTSUPERSCRIPT caligraphic_N and 𝒩𝒜[qIq]models𝒩𝒜delimited-[]subscript𝑞𝐼𝑞\mathcal{N}\models\mathcal{A}[q_{I}\mapsfrom q]caligraphic_N ⊧ caligraphic_A [ italic_q start_POSTSUBSCRIPT italic_I end_POSTSUBSCRIPT ↤ italic_q ] witnessed by an (n+1)𝑛1(n+1)( italic_n + 1 )-bisimulation Z𝑍Zitalic_Z and a run ρ:NQ:𝜌𝑁𝑄\rho:N\to Qitalic_ρ : italic_N → italic_Q. Denote the children of the root of 𝒩𝒩\mathcal{N}caligraphic_N by N0subscript𝑁0N_{0}italic_N start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT. Since ρ𝜌\rhoitalic_ρ is a run, the set S=ρ[N0]𝑆𝜌delimited-[]subscript𝑁0S=\rho[N_{0}]italic_S = italic_ρ [ italic_N start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ] of states assigned to N0subscript𝑁0N_{0}italic_N start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT belongs to δ(q,c)𝛿𝑞𝑐\delta(q,c)italic_δ ( italic_q , italic_c ). Every vM0𝑣subscript𝑀0v\in M_{0}italic_v ∈ italic_M start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT is n𝑛nitalic_n-bisimilar to some wN0𝑤subscript𝑁0w\in N_{0}italic_w ∈ italic_N start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT and hence by the induction hypothesis satisfies ψn,psubscript𝜓𝑛𝑝\psi_{n,p}italic_ψ start_POSTSUBSCRIPT italic_n , italic_p end_POSTSUBSCRIPT for p=ρ(w)S𝑝𝜌𝑤𝑆p=\rho(w)\in Sitalic_p = italic_ρ ( italic_w ) ∈ italic_S. Symmetrically, for every pS𝑝𝑆p\in Sitalic_p ∈ italic_S there is wN0𝑤subscript𝑁0w\in N_{0}italic_w ∈ italic_N start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT accepted by 𝒜[qIp]𝒜delimited-[]subscript𝑞𝐼𝑝\mathcal{A}[q_{I}\mapsfrom p]caligraphic_A [ italic_q start_POSTSUBSCRIPT italic_I end_POSTSUBSCRIPT ↤ italic_p ]. Since that w𝑤witalic_w is n𝑛nitalic_n-bisimilar to some vM0𝑣subscript𝑀0v\in M_{0}italic_v ∈ italic_M start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT, by induction hypothesis v𝑣vitalic_v satisfies ψn,psubscript𝜓𝑛𝑝\psi_{n,p}italic_ψ start_POSTSUBSCRIPT italic_n , italic_p end_POSTSUBSCRIPT. It follows that the root of \mathcal{M}caligraphic_M satisfies {ψn,p|pS}conditionalsubscript𝜓𝑛𝑝𝑝𝑆\nabla\{\psi_{n,p}\ |\ p\in S\}∇ { italic_ψ start_POSTSUBSCRIPT italic_n , italic_p end_POSTSUBSCRIPT | italic_p ∈ italic_S } and therefore also ψn+1,qsubscript𝜓𝑛1𝑞\psi_{n+1,q}italic_ψ start_POSTSUBSCRIPT italic_n + 1 , italic_q end_POSTSUBSCRIPT. ∎

Let us remark that Lemma 2 can be easily adapted to deal with vocabulary restrictions. That is, given P𝖯𝗋𝗈𝗉𝑃𝖯𝗋𝗈𝗉P\subseteq\mathsf{Prop}italic_P ⊆ sansserif_Prop we could construct ψnPsuperscriptsubscript𝜓𝑛𝑃\psi_{n}^{P}italic_ψ start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_P end_POSTSUPERSCRIPT similar to ψnsubscript𝜓𝑛\psi_{n}italic_ψ start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT but only using atomic propositions from P𝑃Pitalic_P and only entailing 𝖬𝖫nsuperscript𝖬𝖫𝑛\mathsf{ML}^{n}sansserif_ML start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT-consequences of φ𝜑\varphiitalic_φ whose vocabulary is contained in P𝑃Pitalic_P. To that end, it suffices to project-out atomic propositions not in P𝑃Pitalic_P from the automaton 𝒜𝒜\mathcal{A}caligraphic_A and only then proceed with our construction. The automaton 𝒜Psuperscript𝒜𝑃\mathcal{A}^{P}caligraphic_A start_POSTSUPERSCRIPT italic_P end_POSTSUPERSCRIPT obtained by such a projection accepts a model \mathcal{M}caligraphic_M iff \mathcal{M}caligraphic_M is P𝑃Pitalic_P-bisimilar to some 𝒩𝒜models𝒩𝒜\mathcal{N}\models\mathcal{A}caligraphic_N ⊧ caligraphic_A. It follows that for every θ𝖬𝖫n,P𝜃superscript𝖬𝖫𝑛𝑃\theta\in\mathsf{ML}^{n,P}italic_θ ∈ sansserif_ML start_POSTSUPERSCRIPT italic_n , italic_P end_POSTSUPERSCRIPT:

ψnPθ𝒜Pθ𝒜θformulae-sequencemodelssuperscriptsubscript𝜓𝑛𝑃𝜃iffformulae-sequencemodelssuperscript𝒜𝑃𝜃iffmodels𝒜𝜃\displaystyle\psi_{n}^{P}\models\theta\hskip 14.22636pt\iff\hskip 14.22636pt% \mathcal{A}^{P}\models\theta\hskip 14.22636pt\iff\hskip 14.22636pt\mathcal{A}\models\thetaitalic_ψ start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_P end_POSTSUPERSCRIPT ⊧ italic_θ ⇔ caligraphic_A start_POSTSUPERSCRIPT italic_P end_POSTSUPERSCRIPT ⊧ italic_θ ⇔ caligraphic_A ⊧ italic_θ

where the first equivalence uses θ𝖬𝖫n𝜃superscript𝖬𝖫𝑛\theta\in\mathsf{ML}^{n}italic_θ ∈ sansserif_ML start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT and the latter θ𝖬𝖫P𝜃superscript𝖬𝖫𝑃\theta\in\mathsf{ML}^{P}italic_θ ∈ sansserif_ML start_POSTSUPERSCRIPT italic_P end_POSTSUPERSCRIPT. Such (P,n)𝑃𝑛(P,n)( italic_P , italic_n )-uniform consequence ψnPsuperscriptsubscript𝜓𝑛𝑃\psi_{n}^{P}italic_ψ start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_P end_POSTSUPERSCRIPT of φ𝜑\varphiitalic_φ can then be taken as a Craig modal separator, in the same way as ψnsubscript𝜓𝑛\psi_{n}italic_ψ start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT serves as a modal separator.

4.2 Lower bounds

For the lower bounds, we show that over arbitrary structures (in fact, already binary trees) μ𝖬𝖫𝜇𝖬𝖫\mu\mathsf{ML}italic_μ sansserif_ML is doubly exponentially more succinct than 𝖬𝖫𝖬𝖫\mathsf{ML}sansserif_ML. The example is essentially taken from [7, Section 3.1]. There the authors use game-theoretic tools which are later applied to more complicated cases. Since we are only interested in this example, we provide a straightforward self-contained argument.

Proposition 3.

There is a sequence (φn)nsubscriptsubscript𝜑𝑛𝑛(\varphi_{n})_{n\in\mathbb{N}}( italic_φ start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT ) start_POSTSUBSCRIPT italic_n ∈ blackboard_N end_POSTSUBSCRIPT of μ𝖬𝖫𝜇𝖬𝖫\mu\mathsf{ML}italic_μ sansserif_ML-formulae of size polynomial in n𝑛nitalic_n such that each φnsubscript𝜑𝑛\varphi_{n}italic_φ start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT is equivalent to a 𝖬𝖫𝖬𝖫\mathsf{ML}sansserif_ML-formula but every ψ𝖬𝖫𝜓𝖬𝖫\psi\in\mathsf{ML}italic_ψ ∈ sansserif_ML equivalent to φnsubscript𝜑𝑛\varphi_{n}italic_φ start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT has size at least 22nsuperscript2superscript2𝑛2^{2^{n}}2 start_POSTSUPERSCRIPT 2 start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT end_POSTSUPERSCRIPT.

Proof.

We only give a sketch, the details are found in Appendix E. We assume two different actions 𝖺𝖺\mathsf{a}sansserif_a and 𝖻𝖻\mathsf{b}sansserif_b. For each n𝑛n\in\mathbb{N}italic_n ∈ blackboard_N consider the property:

  • Bnsubscript𝐵𝑛B_{n}italic_B start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT:

    “No path (over all actions) longer than 2nsuperscript2𝑛2^{n}2 start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT starts in the root.”

This can be enforced by encoding an n𝑛nitalic_n-bit binary counter into the structure of the model, and requiring that on every path the counter values are strictly increasing. Let Cnsubscript𝐶𝑛C_{n}italic_C start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT be this (technically stronger) property expressing the behavior of the encoded counter. Assuming that the encoding is reasonably efficient, Cnsubscript𝐶𝑛C_{n}italic_C start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT can be easily expressed by a μ𝖬𝖫𝜇𝖬𝖫\mu\mathsf{ML}italic_μ sansserif_ML-formula φnsubscript𝜑𝑛\varphi_{n}italic_φ start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT of size polynomial in n𝑛nitalic_n (in fact, a weak fragment of 𝖯𝖣𝖫𝖯𝖣𝖫\mathsf{PDL}sansserif_PDL is already sufficient). Since the lengths of paths are bounded, Cnsubscript𝐶𝑛C_{n}italic_C start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT can be also expressed in 𝖬𝖫𝖬𝖫\mathsf{ML}sansserif_ML.

However, every ψ𝖬𝖫𝜓𝖬𝖫\psi\in\mathsf{ML}italic_ψ ∈ sansserif_ML equivalent to φnsubscript𝜑𝑛\varphi_{n}italic_φ start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT has size at least 22nsuperscript2superscript2𝑛2^{2^{n}}2 start_POSTSUPERSCRIPT 2 start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT end_POSTSUPERSCRIPT. The reason is that for every sequence of actions 𝖺𝖺\mathsf{a}sansserif_a and 𝖻𝖻\mathsf{b}sansserif_b of length 2nsuperscript2𝑛2^{n}2 start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT, the syntax tree of ψ𝜓\psiitalic_ψ must contain a descending sequence of subformulae of length 2nsuperscript2𝑛2^{n}2 start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT such that the i𝑖iitalic_i-th subformula begins with a modal operator corresponding to the i𝑖iitalic_i-th action. This allows to embed a binary tree of height 2nsuperscript2𝑛2^{n}2 start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT into the syntax tree of ψ𝜓\psiitalic_ψ. ∎

Note that the presence of two different actions 𝖺𝖺\mathsf{a}sansserif_a and 𝖻𝖻\mathsf{b}sansserif_b is essential for the argument. We conjecture that μ𝖬𝖫𝜇𝖬𝖫\mu\mathsf{ML}italic_μ sansserif_ML is doubly exponentially more succinct than 𝖬𝖫𝖬𝖫\mathsf{ML}sansserif_ML already in the monomodal setting. Consider the following Property Pnsubscript𝑃𝑛P_{n}italic_P start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT, parameterized by n𝑛n\in\mathbb{N}italic_n ∈ blackboard_N:

  • Pnsubscript𝑃𝑛P_{n}italic_P start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT:

    Cnsubscript𝐶𝑛C_{n}italic_C start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT and there exists a maximal path on which the number of points satisfying τ𝜏\tauitalic_τ is even.”

where Cnsubscript𝐶𝑛C_{n}italic_C start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT is the same as in Proposition 3. It is not difficult to come up with small, that is, of size polynomial in n𝑛nitalic_n, μ𝖬𝖫𝜇𝖬𝖫\mu\mathsf{ML}italic_μ sansserif_ML-formulae φnsubscriptsuperscript𝜑𝑛\varphi^{\prime}_{n}italic_φ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT expressing Pnsubscript𝑃𝑛P_{n}italic_P start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT. Unfortunately, proving that no small 𝖬𝖫𝖬𝖫\mathsf{ML}sansserif_ML formula can be equivalent to φnsuperscriptsubscript𝜑𝑛\varphi_{n}^{\prime}italic_φ start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT seems difficult. For instance, consider models where every non-leaf point has a child satisfying τ𝜏\tauitalic_τ and a child satisfying ¬τ𝜏\neg\tau¬ italic_τ. Then a trick similar to the famous example of Potthoff (showing, roughly, that the language of all binary trees of even depth is first-order definable) [18, Example 1] can be exploited to get a modal formula equivalent to φnsuperscriptsubscript𝜑𝑛\varphi_{n}^{\prime}italic_φ start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT (over such models), but of size only single exponential in n𝑛nitalic_n. Moreover, the results in the next Section 5 show that looking at words only is not sufficient either.

5 Optimal Separators: Word Case

In this section we show that optimal modal separators (over words) can be computed exponentially faster and are exponentially smaller compared to the case with arbitrary models.

Theorem 6.

If φ𝜑\varphiitalic_φ and φsuperscript𝜑\varphi^{\prime}italic_φ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT are modally separable over words, then a separator of size exponential in |φ|+|φ|𝜑superscript𝜑|\varphi|+|\varphi^{\prime}|| italic_φ | + | italic_φ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT | exists and can be computed in exponential time.

As with arbitrary models, Proposition 2 gives an upper bound on the modal depth of a separator and so it suffices to construct n𝑛nitalic_n-uniform consequences of φ𝜑\varphiitalic_φ of small size.

We illustrate the idea first. Consider the classes EVENn and ODDn, n𝑛n\in\mathbb{N}italic_n ∈ blackboard_N of all word structures of length n𝑛nitalic_n in which proposition a𝑎aitalic_a is satisfied in an even and odd, respectively, number of points. Constructing modal formulae φnsubscript𝜑𝑛\varphi_{n}italic_φ start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT and φnsubscriptsuperscript𝜑𝑛\varphi^{\prime}_{n}italic_φ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT defining EVENn and ODDn in the following, naive way leads to exponential formulae since φi+1subscript𝜑𝑖1\varphi_{i+1}italic_φ start_POSTSUBSCRIPT italic_i + 1 end_POSTSUBSCRIPT contains both and φisubscript𝜑𝑖\varphi_{i}italic_φ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT and φisubscriptsuperscript𝜑𝑖\varphi^{\prime}_{i}italic_φ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT:

φ0subscript𝜑0\displaystyle\varphi_{0}italic_φ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT =¬aabsent𝑎limit-frombottom\displaystyle=\neg a\wedge\Box\bot= ¬ italic_a ∧ □ ⊥ φi+1subscript𝜑𝑖1\displaystyle\varphi_{i+1}italic_φ start_POSTSUBSCRIPT italic_i + 1 end_POSTSUBSCRIPT =((aφi)(¬aφi))\displaystyle=\Diamond\top\wedge\big{(}(a\wedge\varphi_{i}^{\prime})\vee(\neg a% \wedge\varphi_{i})\big{)}= ◇ ⊤ ∧ ( ( italic_a ∧ italic_φ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) ∨ ( ¬ italic_a ∧ italic_φ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) )
φ0superscriptsubscript𝜑0\displaystyle\varphi_{0}^{\prime}italic_φ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT =aabsent𝑎limit-frombottom\displaystyle=a\wedge\Box\bot= italic_a ∧ □ ⊥ φi+1superscriptsubscript𝜑𝑖1\displaystyle\varphi_{i+1}^{\prime}italic_φ start_POSTSUBSCRIPT italic_i + 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT =((aφi)(¬aφi))\displaystyle=\Diamond\top\wedge\big{(}(a\wedge\varphi_{i})\vee(\neg a\wedge% \varphi_{i}^{\prime})\big{)}= ◇ ⊤ ∧ ( ( italic_a ∧ italic_φ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) ∨ ( ¬ italic_a ∧ italic_φ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) )

This exponential blow-up can be avoided, however, using “divide-and-conquer” as follows:

φ2n=(φnnφn)(φnnφn)subscript𝜑2𝑛subscript𝜑𝑛superscript𝑛subscript𝜑𝑛superscriptsubscript𝜑𝑛superscript𝑛superscriptsubscript𝜑𝑛\displaystyle\varphi_{2n}=\big{(}\varphi_{n}\wedge\Diamond^{n}\varphi_{n}\big{% )}\vee\big{(}\varphi_{n}^{\prime}\wedge\Diamond^{n}\varphi_{n}^{\prime}\big{)}italic_φ start_POSTSUBSCRIPT 2 italic_n end_POSTSUBSCRIPT = ( italic_φ start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT ∧ ◇ start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT italic_φ start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT ) ∨ ( italic_φ start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ∧ ◇ start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT italic_φ start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT )
φ2n=(φnnφn)(φnnφn)superscriptsubscript𝜑2𝑛subscript𝜑𝑛superscript𝑛superscriptsubscript𝜑𝑛superscriptsubscript𝜑𝑛superscript𝑛subscript𝜑𝑛\displaystyle\varphi_{2n}^{\prime}=\big{(}\varphi_{n}\wedge\Diamond^{n}\varphi% _{n}^{\prime}\big{)}\vee\big{(}\varphi_{n}^{\prime}\wedge\Diamond^{n}\varphi_{% n}\big{)}italic_φ start_POSTSUBSCRIPT 2 italic_n end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT = ( italic_φ start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT ∧ ◇ start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT italic_φ start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) ∨ ( italic_φ start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ∧ ◇ start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT italic_φ start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT )

Although several copies of formulae of smaller index are used as well, but since the index is halved, we end up with formulae of roughly quadradic size. The proof of the following analogue of Lemma 2 relies on this idea.

Lemma 3.

For every n𝑛n\in\mathbb{N}italic_n ∈ blackboard_N and every NPWA 𝒜𝒜\mathcal{A}caligraphic_A with states Q𝑄Qitalic_Q, one can construct a formula ψn𝖬𝖫nsubscript𝜓𝑛superscript𝖬𝖫𝑛\psi_{n}\in\mathsf{ML}^{n}italic_ψ start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT ∈ sansserif_ML start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT which is 𝒜𝒜\mathcal{A}caligraphic_A’s n𝑛nitalic_n-uniform consequence over words and has size polynomial in n𝑛nitalic_n and |Q|𝑄|Q|| italic_Q |. The construction requires polynomial time.

To see that Lemma 3 implies Theorem 6, let φ𝜑\varphiitalic_φ and φsuperscript𝜑\varphi^{\prime}italic_φ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT admit a modal separator over words. Let 𝒜𝒜\mathcal{A}caligraphic_A be an NPWA that is equivalent to φ𝜑\varphiitalic_φ. By Theorem 1, 𝒜𝒜\mathcal{A}caligraphic_A has exponentially many states and can be computed in exponential time. Proposition 2 implies that there is a modal separator of modal depth l𝑙litalic_l at most exponential in k=|φ|+|φ|𝑘𝜑superscript𝜑k=|\varphi|+|\varphi^{\prime}|italic_k = | italic_φ | + | italic_φ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT |. As with arbitrary models, 𝒜𝒜\mathcal{A}caligraphic_A’s l𝑙litalic_l-uniform consequence ψlsubscript𝜓𝑙\psi_{l}italic_ψ start_POSTSUBSCRIPT italic_l end_POSTSUBSCRIPT from Lemma 3 is the sought separator. We now prove the lemma.

Proof.

Let 𝒜=(Q,Σ,δ,qI,𝗋𝖺𝗇𝗄)𝒜𝑄Σ𝛿subscript𝑞𝐼𝗋𝖺𝗇𝗄\mathcal{A}=(Q,\Sigma,\delta,q_{I},\mathsf{rank})caligraphic_A = ( italic_Q , roman_Σ , italic_δ , italic_q start_POSTSUBSCRIPT italic_I end_POSTSUBSCRIPT , sansserif_rank ) be an NPWA. The main idea is to construct, for every p,qQ𝑝𝑞𝑄p,q\in Qitalic_p , italic_q ∈ italic_Q and m𝑚m\in\mathbb{N}italic_m ∈ blackboard_N, a formula ψp,qmsuperscriptsubscript𝜓𝑝𝑞𝑚\psi_{p,q}^{m}italic_ψ start_POSTSUBSCRIPT italic_p , italic_q end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_m end_POSTSUPERSCRIPT such that for every input word \mathcal{M}caligraphic_M:

ψp,qmthere is a run from p to q over the m-prefix of ,iffmodelssuperscriptsubscript𝜓𝑝𝑞𝑚there is a run from p to q over the m-prefix of \mathcal{M}\models\psi_{p,q}^{m}\iff\text{there is a run from $p$ to $q$ over the $m$-prefix of $\mathcal{M}$},caligraphic_M ⊧ italic_ψ start_POSTSUBSCRIPT italic_p , italic_q end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_m end_POSTSUPERSCRIPT ⇔ there is a run from italic_p to italic_q over the italic_m -prefix of caligraphic_M ,

The key step is the recursive splitting similar to the definitions of EVENn and ODDn above. Intuitively, ψp,q2msuperscriptsubscript𝜓𝑝𝑞2𝑚\psi_{p,q}^{2m}italic_ψ start_POSTSUBSCRIPT italic_p , italic_q end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 italic_m end_POSTSUPERSCRIPT is the disjunction over all sQ𝑠𝑄s\in Qitalic_s ∈ italic_Q of the conditions “there is a run from p𝑝pitalic_p in the initial position to s𝑠sitalic_s in position m𝑚mitalic_m, and a run from s𝑠sitalic_s in position m𝑚mitalic_m to q𝑞qitalic_q in position 2m2𝑚2m2 italic_m.” The latter conditions are recursively expressed using ψp,smsuperscriptsubscript𝜓𝑝𝑠𝑚\psi_{p,s}^{m}italic_ψ start_POSTSUBSCRIPT italic_p , italic_s end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_m end_POSTSUPERSCRIPT and ψs,qmsuperscriptsubscript𝜓𝑠𝑞𝑚\psi_{s,q}^{m}italic_ψ start_POSTSUBSCRIPT italic_s , italic_q end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_m end_POSTSUPERSCRIPT. The constructed formulas ψqI,qmsuperscriptsubscript𝜓subscript𝑞𝐼𝑞𝑚\psi_{q_{I},q}^{m}italic_ψ start_POSTSUBSCRIPT italic_q start_POSTSUBSCRIPT italic_I end_POSTSUBSCRIPT , italic_q end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_m end_POSTSUPERSCRIPT, mn𝑚𝑛m\leq nitalic_m ≤ italic_n are then used to describe all possible n𝑛nitalic_n-prefixes of models of 𝒜𝒜\mathcal{A}caligraphic_A. The details of the construction can be found in Appendix F. ∎

We conclude the section with the comment that Theorem 6 is optimal in the sense that there are modally separable formulae which require a large separator. We actually show the following stronger statement implying that, over words, μ𝖬𝖫𝜇𝖬𝖫\mu\mathsf{ML}italic_μ sansserif_ML is exponentially more succinct than 𝖬𝖫𝖬𝖫\mathsf{ML}sansserif_ML.

Proposition 4.

There is a sequence of μ𝖬𝖫𝜇𝖬𝖫\mu\mathsf{ML}italic_μ sansserif_ML-formulae (φn)nsubscriptsubscript𝜑𝑛𝑛(\varphi_{n})_{n\in\mathbb{N}}( italic_φ start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT ) start_POSTSUBSCRIPT italic_n ∈ blackboard_N end_POSTSUBSCRIPT of size polynomial in n𝑛nitalic_n such that each φnsubscript𝜑𝑛\varphi_{n}italic_φ start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT is equivalent to a 𝖬𝖫𝖬𝖫\mathsf{ML}sansserif_ML-formula but every ψ𝖬𝖫𝜓𝖬𝖫\psi\in\mathsf{ML}italic_ψ ∈ sansserif_ML equivalent to φnsubscript𝜑𝑛\varphi_{n}italic_φ start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT has size at least 2nsuperscript2𝑛2^{n}2 start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT.

The proof is entirely standard. The main idea is that, already in 𝖯𝖣𝖫𝖯𝖣𝖫\mathsf{PDL}sansserif_PDL one can stipulate (with a small formula) a finite word of exponential length. Clearly, any 𝖬𝖫𝖬𝖫\mathsf{ML}sansserif_ML-formula expressing this requires exponential size. The only difficulty is doing it with a fixed signature: instead of encoding i𝑖iitalic_i-bit counters using i𝑖iitalic_i propositions, we use just two propositions and encode numbers in i𝑖iitalic_i consecutive points.

6 Conclusion and Open Problems

We have studied the problem of deciding separability of μ𝖬𝖫𝜇𝖬𝖫\mu\mathsf{ML}italic_μ sansserif_ML-formulae by fixpoint free formulae from 𝖬𝖫𝖬𝖫\mathsf{ML}sansserif_ML, and computing separators if they exist. Our results cover several interesting classes of models such as trees, finite trees, and words. Due to the great expressivity of μ𝖬𝖫𝜇𝖬𝖫\mu\mathsf{ML}italic_μ sansserif_ML the results remain valid in the presence of ontologies.

A notably missing case is the class of trees of fixed outdegree d𝑑ditalic_d independent from formulae. This is surprisingly different from the classes we studied. The key difficulty here lies in the fact that the implication (iii) \Rightarrow (ii) from Proposition 1 is not true over such trees.

An intriguing challenge left for future study is to look at extensions of μ𝖬𝖫𝜇𝖬𝖫\mu\mathsf{ML}italic_μ sansserif_ML and/or 𝖬𝖫𝖬𝖫\mathsf{ML}sansserif_ML. Natural extensions are inverse modalities, the universal modality, graded modalities, and constants (corresponding to inverse roles, the universal role, counting quantifiers, and nominals in DL speech). We expect the adaptation to inverse modalities to be only minor. Also graded modalities look innocent if they are allowed both in the larger logic and in the separator logic. If we only extend μ𝖬𝖫𝜇𝖬𝖫\mu\mathsf{ML}italic_μ sansserif_ML with graded modalities and ask for separators in 𝖬𝖫𝖬𝖫\mathsf{ML}sansserif_ML (without graded modailites), we would have to combine our techniques with the ones from [12], which is potentially challenging. We expect universal modality and/or constants to pose more technical difficulties as well. Intuitively, adding a universal modality or constants leads to the loss of the strong locality underlying Proposition 1.

References

  • [1] Alessandro Artale, Jean Christoph Jung, Andrea Mazzullo, Ana Ozaki, and Frank Wolter. Living without beth and craig: Definitions and interpolants in description and modal logics with nominals and role inclusions. ACM Trans. Comput. Log., 24(4):34:1–34:51, 2023.
  • [2] Christel Baier and Joost-Pieter Katoen. Principles of model checking. MIT Press, 2008.
  • [3] Patrick Blackburn, J. F. A. K. van Benthem, and Frank Wolter, editors. Handbook of Modal Logic, volume 3 of Studies in logic and practical reasoning. North-Holland, 2007.
  • [4] Mikołaj Bojańczyk and Wojciech Czerwiński. Automata Toolbox. Univrsity of Warsaw, 2018.
  • [5] Giovanna D’Agostino and Marco Hollenberg. Uniform interpolation, automata and the modal μ𝜇\muitalic_μ-calculus. In Marcus Kracht, Maarten de Rijke, Heinrich Wansing, and Michael Zakharyaschev, editors, Advances in Modal Logic 1, papers from the first workshop on "Advances in Modal logic," held in Berlin, Germany, 8-10 October 1996, pages 73–84. CSLI Publications, 1996.
  • [6] Michael J. Fischer and Richard E. Ladner. Propositional dynamic logic of regular programs. J. Comput. Syst. Sci., 18(2):194–211, 1979.
  • [7] Tim French, Wiebe van der Hoek, Petar Iliev, and Barteld P. Kooi. On the succinctness of some modal logics. Artif. Intell., 197:56–85, 2013.
  • [8] David Janin and Igor Walukiewicz. On the expressive completeness of the propositional mu-calculus with respect to monadic second order logic. In Ugo Montanari and Vladimiro Sassone, editors, CONCUR ’96: Concurrency Theory, pages 263–277, 1996.
  • [9] Jean Christoph Jung, Carsten Lutz, Hadrien Pulcini, and Frank Wolter. Logical separability of labeled data examples under ontologies. Artif. Intell., 313:103785, 2022.
  • [10] Jean Christoph Jung and Frank Wolter. Living without beth and craig: Definitions and interpolants in the guarded and two-variable fragments. In 36th Annual ACM/IEEE Symposium on Logic in Computer Science, LICS 2021, Rome, Italy, June 29 - July 2, 2021, pages 1–14. IEEE, 2021.
  • [11] Dexter Kozen. Results on the propositional mu-calculus. Theor. Comput. Sci., 27:333–354, 1983.
  • [12] Louwe Kuijer, Tony Tan, Frank Wolter, and Michael Zakharyaschev. Separating counting from non-counting in fragments of two-variable first-order logic (extended abstract). In Proc. of DL 2024, 2024.
  • [13] Karoliina Lehtinen and Sandra Quickert. Deciding the first levels of the modal mu alternation hierarchy by formula construction. In Stephan Kreutzer, editor, 24th EACSL Annual Conference on Computer Science Logic, CSL 2015, September 7-10, 2015, Berlin, Germany, volume 41 of LIPIcs, pages 457–471. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2015.
  • [14] Denis Mayr Lima Martins. Reverse engineering database queries from examples: State-of-the-art, challenges, and research opportunities. Inf. Syst., 83:89–100, 2019.
  • [15] Maarten Marx. Interpolation in modal logic. In International Conference on Algebraic Methodology and Software Technology, 1999.
  • [16] Martin Otto. Eliminating recursion in the μ𝜇\mathrm{\mu}italic_μ-calculus. In Christoph Meinel and Sophie Tison, editors, STACS 99, 16th Annual Symposium on Theoretical Aspects of Computer Science, Trier, Germany, March 4-6, 1999, Proceedings, volume 1563 of Lecture Notes in Computer Science, pages 531–540. Springer, 1999.
  • [17] Thomas Place and Marc Zeitoun. Separating regular languages with first-order logic. Log. Methods Comput. Sci., 12(1), 2016.
  • [18] Andreas Potthoff. First-order logic on finite trees. In Theory and Practice of Software Development, 1995.
  • [19] Vaughan R. Pratt. A decidable mu-calculus: Preliminary report. In 22nd Annual Symposium on Foundations of Computer Science, Nashville, Tennessee, USA, 28-30 October 1981, pages 421–427. IEEE Computer Society, 1981.
  • [20] A. Prasad Sistla and Edmund M. Clarke. The complexity of propositional linear temporal logics. J. ACM, 32(3):733–749, 1985.
  • [21] Moshe Y. Vardi. Reasoning about the past with two-way automata. In Kim Guldstrand Larsen, Sven Skyum, and Glynn Winskel, editors, Automata, Languages and Programming, 25th International Colloquium, ICALP’98, Aalborg, Denmark, July 13-17, 1998, Proceedings, volume 1443 of Lecture Notes in Computer Science, pages 628–641. Springer, 1998.
  • [22] Yde Venema. Lectures on the modal μ𝜇\muitalic_μ-calculus, 2020.
  • [23] Albert Visser. Uniform interpolation and layered bisimulation. In Petr Hájek, editor, Gödel ’96: Logical Foundations of Mathematics, Computer Science and Physics - Kurt Gödel’s Legacy, Lecture Notes in Logic, page 139–164. Springer, 1996.

Throughout the appendix we use the term tallness. The tallness of a tree is the distance from the root to the closest leaf (or \infty if it the tree has no leafs).

Appendix A Separability Coincides with Craig Separability

We prove Theorem 2 which says that φ,φ𝜑superscript𝜑\varphi,\varphi^{\prime}italic_φ , italic_φ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT are modally separable iff they are Craig modally separable.

Proof.

Clearly, any Craig modal separator is also a modal separapator.

Conversely, suppose there is a modal separator ψ𝜓\psiitalic_ψ of φ,φ𝜑superscript𝜑\varphi,\varphi^{\prime}italic_φ , italic_φ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT, and let n𝑛nitalic_n be its modal depth. We use the n𝑛nitalic_n-uniform consequences as defined in Definition 2. Let θ𝜃\thetaitalic_θ be the n𝑛nitalic_n-uniform consequence of φ𝜑\varphiitalic_φ and θsuperscript𝜃\theta^{\prime}italic_θ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT the n𝑛nitalic_n-uniform consequence of φsuperscript𝜑\varphi^{\prime}italic_φ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT (both exist, see the discussion after Definition 2). Note that ¬θ¬φmodelssuperscript𝜃superscript𝜑\neg\theta^{\prime}\models\neg\varphi^{\prime}¬ italic_θ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ⊧ ¬ italic_φ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT. By definition of n𝑛nitalic_n-uniform consequence, we have θψmodels𝜃𝜓\theta\models\psiitalic_θ ⊧ italic_ψ and ψ¬θmodels𝜓superscript𝜃\psi\models\neg\theta^{\prime}italic_ψ ⊧ ¬ italic_θ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT, and thus θθmodels𝜃superscript𝜃\theta\models\theta^{\prime}italic_θ ⊧ italic_θ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT. Since 𝖬𝖫𝖬𝖫\mathsf{ML}sansserif_ML enjoys Craig interpolation, there is an interpolant ψsuperscript𝜓\psi^{\prime}italic_ψ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT for θ,θ𝜃superscript𝜃\theta,\theta^{\prime}italic_θ , italic_θ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT which is also a Craig modal separator of φ,φ𝜑superscript𝜑\varphi,\varphi^{\prime}italic_φ , italic_φ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT. ∎

Appendix B Model-theoretic Characterization

We prove Proposition 1.

The implication (i) \Rightarrow (iv) is straightforward. Separator ψ𝖬𝖫n𝜓superscript𝖬𝖫𝑛\psi\in\mathsf{ML}^{n}italic_ψ ∈ sansserif_ML start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT cannot distinguish models identical up to depth n𝑛nitalic_n. If φmodels𝜑\mathcal{M}\models\varphicaligraphic_M ⊧ italic_φ then ψmodels𝜓\mathcal{M}\models\psicaligraphic_M ⊧ italic_ψ and so ψmodelssuperscript𝜓\mathcal{M}^{\prime}\models\psicaligraphic_M start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ⊧ italic_ψ which in turn implies ¬φmodelssuperscriptsuperscript𝜑\mathcal{M}^{\prime}\models\neg\varphi^{\prime}caligraphic_M start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ⊧ ¬ italic_φ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT.

Our argument for (iv)\implies(iii) uses a classical characterization of the semantics of μ𝖬𝖫𝜇𝖬𝖫\mu\mathsf{ML}italic_μ sansserif_ML formulae in terms of parity games. Since we do not want to introduce games we only sketch the (easy) construction; the necessary definitions can be found in [22].

We prove the implication by contrapositive. Assume φmodels𝜑\mathcal{M}\models\varphicaligraphic_M ⊧ italic_φ and φmodelssuperscriptsuperscript𝜑\mathcal{M}^{\prime}\models\varphi^{\prime}caligraphic_M start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ⊧ italic_φ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT identical up to depth n𝑛nitalic_n. The fact that φmodels𝜑\mathcal{M}\models\varphicaligraphic_M ⊧ italic_φ is equivalent to existence of a winning strategy in an appropriately defined parity game 𝒢φsubscript𝒢𝜑\mathcal{G}_{\varphi}caligraphic_G start_POSTSUBSCRIPT italic_φ end_POSTSUBSCRIPT. Positions of the game are pairs: (v,θ)𝑣𝜃(v,\theta)( italic_v , italic_θ ) with vM𝑣𝑀v\in Mitalic_v ∈ italic_M and θ𝜃\thetaitalic_θ subformula of φ𝜑\varphiitalic_φ and a move from (v,θ)𝑣𝜃(v,\theta)( italic_v , italic_θ ) to (v,θ)superscript𝑣superscript𝜃(v^{\prime},\theta^{\prime})( italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_θ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) is only allowed if vvsuperscriptabsent𝑣superscript𝑣v\stackrel{{\scriptstyle}}{{\to}}v^{\prime}italic_v start_RELOP SUPERSCRIPTOP start_ARG → end_ARG start_ARG end_ARG end_RELOP italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT. A similar game 𝒢φsubscript𝒢superscript𝜑\mathcal{G}_{\varphi^{\prime}}caligraphic_G start_POSTSUBSCRIPT italic_φ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_POSTSUBSCRIPT captures the semantics of φsuperscript𝜑\varphi^{\prime}italic_φ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT.

Take positional winning strategies σ𝜎\sigmaitalic_σ and σsuperscript𝜎\sigma^{\prime}italic_σ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT for veve\exists\text{ve}∃ ve in the games 𝒢φsubscript𝒢𝜑\mathcal{G}_{\varphi}caligraphic_G start_POSTSUBSCRIPT italic_φ end_POSTSUBSCRIPT and 𝒢φsubscript𝒢superscript𝜑\mathcal{G}_{\varphi^{\prime}}caligraphic_G start_POSTSUBSCRIPT italic_φ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_POSTSUBSCRIPT. We trim \mathcal{M}caligraphic_M: only keep the points that belong to a position chosen by veve\exists\text{ve}∃ ve in either a σ𝜎\sigmaitalic_σ- or σsuperscript𝜎\sigma^{\prime}italic_σ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT-play, and remove all the other ones. After that, also remove points that become inaccessible from the root so that the resulting structure is a tree. The tree 𝒩𝒩\mathcal{N}caligraphic_N obtained this way still satisfies φ𝜑\varphiitalic_φ because the strategy σ𝜎\sigmaitalic_σ remains winning. For every vM𝑣𝑀v\in Mitalic_v ∈ italic_M both games together only have k𝑘kitalic_k positions containing v𝑣vitalic_v. Thus, by positionality, among the children of v𝑣vitalic_v only at most k𝑘kitalic_k many belong to a position chosen by σ𝜎\sigmaitalic_σ or σsuperscript𝜎\sigma^{\prime}italic_σ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT. Since only such points belong to N𝑁Nitalic_N, it follows that 𝒩𝒩\mathcal{N}caligraphic_N has branching at most s𝑠sitalic_s. We trim superscript\mathcal{M}^{\prime}caligraphic_M start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT to 𝒩φmodelssuperscript𝒩superscript𝜑\mathcal{N}^{\prime}\models\varphi^{\prime}caligraphic_N start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ⊧ italic_φ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT in the same way. By construction 𝒩𝒩\mathcal{N}caligraphic_N and 𝒩superscript𝒩\mathcal{N}^{\prime}caligraphic_N start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT are identical up to depth n𝑛nitalic_n which violates (iv).

To prove (iii) \Rightarrow (ii) assume towards contradiction that there are models φmodels𝜑\mathcal{M}\models\varphicaligraphic_M ⊧ italic_φ and φmodelssuperscriptsuperscript𝜑\mathcal{M}^{\prime}\models\varphi^{\prime}caligraphic_M start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ⊧ italic_φ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT linked by an n𝑛nitalic_n-bisimulation Z𝑍Zitalic_Z. We construct trees Zφmodelssubscript𝑍𝜑\mathcal{M}_{Z}\models\varphicaligraphic_M start_POSTSUBSCRIPT italic_Z end_POSTSUBSCRIPT ⊧ italic_φ and Zφmodelssuperscriptsubscript𝑍superscript𝜑\mathcal{M}_{Z}^{\prime}\models\varphi^{\prime}caligraphic_M start_POSTSUBSCRIPT italic_Z end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ⊧ italic_φ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT identical up to depth n𝑛nitalic_n, therefore reaching a contradiction with (iii).

Without loss of generality we assume that \mathcal{M}caligraphic_M and superscript\mathcal{M}^{\prime}caligraphic_M start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT are trees, otherwise they can be unravelled. The tree Zsubscript𝑍\mathcal{M}_{Z}caligraphic_M start_POSTSUBSCRIPT italic_Z end_POSTSUBSCRIPT is as follows. It has universe:

MZ=MZsubscript𝑀𝑍𝑀𝑍M_{Z}=M\cup Zitalic_M start_POSTSUBSCRIPT italic_Z end_POSTSUBSCRIPT = italic_M ∪ italic_Z

and the pair consisting of the roots of \mathcal{M}caligraphic_M and superscript\mathcal{M}^{\prime}caligraphic_M start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT (which by assumption belongs to Z𝑍Zitalic_Z) is taken as the new root. Edges between pairs from Z𝑍Zitalic_Z are defined pointwise (that is: (v,v)(w,w)superscriptabsent𝑣superscript𝑣𝑤𝑤(v,v^{\prime})\stackrel{{\scriptstyle}}{{\to}}(w,w)( italic_v , italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) start_RELOP SUPERSCRIPTOP start_ARG → end_ARG start_ARG end_ARG end_RELOP ( italic_w , italic_w ) iff vwsuperscriptabsent𝑣𝑤v\stackrel{{\scriptstyle}}{{\to}}witalic_v start_RELOP SUPERSCRIPTOP start_ARG → end_ARG start_ARG end_ARG end_RELOP italic_w and vwsuperscriptabsentsuperscript𝑣superscript𝑤v^{\prime}\stackrel{{\scriptstyle}}{{\to}}w^{\prime}italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT start_RELOP SUPERSCRIPTOP start_ARG → end_ARG start_ARG end_ARG end_RELOP italic_w start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT). Edges between points in M𝑀Mitalic_M are taken from \mathcal{M}caligraphic_M. On top of that, we add (v,v)wsuperscriptabsent𝑣superscript𝑣𝑤(v,v^{\prime})\stackrel{{\scriptstyle}}{{\to}}w( italic_v , italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) start_RELOP SUPERSCRIPTOP start_ARG → end_ARG start_ARG end_ARG end_RELOP italic_w iff in \mathcal{M}caligraphic_M the point v𝑣vitalic_v is at depth precisely n𝑛nitalic_n and vwsuperscriptabsent𝑣𝑤v\stackrel{{\scriptstyle}}{{\to}}witalic_v start_RELOP SUPERSCRIPTOP start_ARG → end_ARG start_ARG end_ARG end_RELOP italic_w. The colors are inherited from \mathcal{M}caligraphic_M on M𝑀Mitalic_M and from whichever coordinate on Z𝑍Zitalic_Z (points linked by Z𝑍Zitalic_Z always have the same color).

Consider the function f:MZM:𝑓subscript𝑀𝑍𝑀f:M_{Z}\to Mitalic_f : italic_M start_POSTSUBSCRIPT italic_Z end_POSTSUBSCRIPT → italic_M defined as the left projection on Z𝑍Zitalic_Z and identity on M𝑀Mitalic_M:

f(v,v)𝑓𝑣superscript𝑣\displaystyle f(v,v^{\prime})italic_f ( italic_v , italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) =vabsent𝑣\displaystyle=v= italic_v
f(v)𝑓𝑣\displaystyle f(v)italic_f ( italic_v ) =v.absent𝑣\displaystyle=v.= italic_v .

The graph of f𝑓fitalic_f is a bisimulation between Zsubscript𝑍\mathcal{M}_{Z}caligraphic_M start_POSTSUBSCRIPT italic_Z end_POSTSUBSCRIPT and \mathcal{M}caligraphic_M. In particular, Zsubscript𝑍\mathcal{M}_{Z}caligraphic_M start_POSTSUBSCRIPT italic_Z end_POSTSUBSCRIPT satisfies φ𝜑\varphiitalic_φ. We define Zsuperscriptsubscript𝑍\mathcal{M}_{Z}^{\prime}caligraphic_M start_POSTSUBSCRIPT italic_Z end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT satisfying φsuperscript𝜑\varphi^{\prime}italic_φ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT symmetrically. Then Zsubscript𝑍\mathcal{M}_{Z}caligraphic_M start_POSTSUBSCRIPT italic_Z end_POSTSUBSCRIPT and Zsuperscriptsubscript𝑍\mathcal{M}_{Z}^{\prime}caligraphic_M start_POSTSUBSCRIPT italic_Z end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT satisfy φ𝜑\varphiitalic_φ and φsuperscript𝜑\varphi^{\prime}italic_φ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT and are identical up to depth n𝑛nitalic_n (the n𝑛nitalic_n-prefixes of both Zsubscript𝑍\mathcal{M}_{Z}caligraphic_M start_POSTSUBSCRIPT italic_Z end_POSTSUBSCRIPT and Zsuperscriptsubscript𝑍\mathcal{M}_{Z}^{\prime}caligraphic_M start_POSTSUBSCRIPT italic_Z end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT are equal Z𝑍Zitalic_Z). Technically, Zsubscript𝑍\mathcal{M}_{Z}caligraphic_M start_POSTSUBSCRIPT italic_Z end_POSTSUBSCRIPT and Zsuperscriptsubscript𝑍\mathcal{M}_{Z}^{\prime}caligraphic_M start_POSTSUBSCRIPT italic_Z end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT are directed acyclic graphs but not necessarily trees. However, they can be turned into trees by removing inaccessible points and unravelling. Both these operations preserve satisfaction of φ𝜑\varphiitalic_φ and φsuperscript𝜑\varphi^{\prime}italic_φ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT and identity of n𝑛nitalic_n-prefixes.

For the implication (ii) \Rightarrow (i) one can define ψ𝜓\psiitalic_ψ as (any) n𝑛nitalic_n-uniform consequence of φ𝜑\varphiitalic_φ. An explicit instance of such n𝑛nitalic_n-uniform consequence is the disjunction of all 𝖬𝖫nsuperscript𝖬𝖫𝑛\mathsf{ML}^{n}sansserif_ML start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT-types consistent with φ𝜑\varphiitalic_φ (see Definition 2 of n𝑛nitalic_n-uniform consequences and the following discussion).

Appendix C Separability over All Models is in ExpTime

We prove the upper bounds in Theorem 3. By Proposition 1 deciding modal separability boils down to checking if there is n𝑛n\in\mathbb{N}italic_n ∈ blackboard_N for which (iv) holds. Using well-known properties of 𝖬𝖫𝖬𝖫\mathsf{ML}sansserif_ML and μ𝖬𝖫𝜇𝖬𝖫\mu\mathsf{ML}italic_μ sansserif_ML (namely, closure under relativization) we may reduce the problem to the special case when models under consideration are full k𝑘kitalic_k-ary trees, meaning that every point has precisely k𝑘kitalic_k children. Under this assumption the graphs underlying models are all isomorphic to the (unlabelled) full k𝑘kitalic_k-ary tree 𝒦=(K,)𝒦𝐾superscriptabsent\mathcal{K}=(K,\stackrel{{\scriptstyle}}{{\to}})caligraphic_K = ( italic_K , start_RELOP SUPERSCRIPTOP start_ARG → end_ARG start_ARG end_ARG end_RELOP ). Hence, we identify models with valuations 𝗏𝖺𝗅:KΣ:𝗏𝖺𝗅𝐾Σ\mathsf{val}:K\to\Sigmasansserif_val : italic_K → roman_Σ. Let us call a finite prefix XK𝑋𝐾X\subseteq Kitalic_X ⊆ italic_K sufficient for separation if for every 𝗏𝖺𝗅,𝗏𝖺𝗅:KΣ:𝗏𝖺𝗅superscript𝗏𝖺𝗅𝐾Σ\mathsf{val},\mathsf{val}^{\prime}:K\to\Sigmasansserif_val , sansserif_val start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT : italic_K → roman_Σ identical on X𝑋Xitalic_X: 𝗏𝖺𝗅φmodels𝗏𝖺𝗅𝜑\mathsf{val}\models\varphisansserif_val ⊧ italic_φ implies 𝗏𝖺𝗅¬φmodelssuperscript𝗏𝖺𝗅superscript𝜑\mathsf{val}^{\prime}\models\neg\varphi^{\prime}sansserif_val start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ⊧ ¬ italic_φ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT. Denote the set of such prefixes:

𝖲𝖤𝖯={XK|X is a finite prefix of K sufficient for separation}.𝖲𝖤𝖯conditional-set𝑋𝐾X is a finite prefix of K sufficient for separation\mathsf{SEP}=\{X\subseteq K\ |\ \text{$X$ is a finite prefix of $K$ sufficient% for separation}\}.sansserif_SEP = { italic_X ⊆ italic_K | italic_X is a finite prefix of italic_K sufficient for separation } .

It follows directly from the definition that for every n𝑛n\in\mathbb{N}italic_n ∈ blackboard_N:

The n𝑛nitalic_n-prefix K|nK_{|_{n}}italic_K start_POSTSUBSCRIPT | start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT end_POSTSUBSCRIPT of K𝐾Kitalic_K is in 𝖲𝖤𝖯𝖲𝖤𝖯\mathsf{SEP}sansserif_SEP iff\iff Condition (iv) is true for n𝑛nitalic_n. (3)

The set 𝖲𝖤𝖯𝖲𝖤𝖯\mathsf{SEP}sansserif_SEP can be viewed as a language of finite trees. This language is closed under taking finite supermodels, so it contains K|nK_{|_{n}}italic_K start_POSTSUBSCRIPT | start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT end_POSTSUBSCRIPT for some n𝑛nitalic_n iff it is nonempty. It follows that separability of φ𝜑\varphiitalic_φ and φsuperscript𝜑\varphi^{\prime}italic_φ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT is equivalent to nonemptiness of 𝖲𝖤𝖯𝖲𝖤𝖯\mathsf{SEP}sansserif_SEP.

We construct an automaton \mathcal{B}caligraphic_B that accepts finite trees not belonging to 𝖲𝖤𝖯𝖲𝖤𝖯\mathsf{SEP}sansserif_SEP. Take automata 𝒜𝒜\mathcal{A}caligraphic_A and 𝒜superscript𝒜\mathcal{A}^{\prime}caligraphic_A start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT equivalent to φ𝜑\varphiitalic_φ and φsuperscript𝜑\varphi^{\prime}italic_φ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT of size exponential in k𝑘kitalic_k. The idea is that given a finite tree XK𝑋𝐾X\subseteq Kitalic_X ⊆ italic_K the automaton \mathcal{B}caligraphic_B guesses 𝗏𝖺𝗅:XΣ:𝗏𝖺𝗅𝑋Σ\mathsf{val}:X\to\Sigmasansserif_val : italic_X → roman_Σ, ρ:XQ:𝜌𝑋𝑄\rho:X\to Qitalic_ρ : italic_X → italic_Q and ρ:XQ:superscript𝜌𝑋superscript𝑄\rho^{\prime}:X\to Q^{\prime}italic_ρ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT : italic_X → italic_Q start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT that can be extended to a valuation 𝗏𝖺𝗅+:KΣ:subscript𝗏𝖺𝗅𝐾Σ\mathsf{val}_{+}:K\to\Sigmasansserif_val start_POSTSUBSCRIPT + end_POSTSUBSCRIPT : italic_K → roman_Σ and accepting runs ρ+:KQ:subscript𝜌𝐾𝑄\rho_{+}:K\to Qitalic_ρ start_POSTSUBSCRIPT + end_POSTSUBSCRIPT : italic_K → italic_Q of 𝒜𝒜\mathcal{A}caligraphic_A and ρ+:KQ:superscriptsubscript𝜌𝐾superscript𝑄\rho_{+}^{\prime}:K\to Q^{\prime}italic_ρ start_POSTSUBSCRIPT + end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT : italic_K → italic_Q start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT of 𝒜superscript𝒜\mathcal{A}^{\prime}caligraphic_A start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT consistent with 𝗏𝖺𝗅+subscript𝗏𝖺𝗅\mathsf{val}_{+}sansserif_val start_POSTSUBSCRIPT + end_POSTSUBSCRIPT. \mathcal{B}caligraphic_B has the set Q=Q×Qsuperscript𝑄𝑄superscript𝑄Q^{\mathcal{B}}=Q\times Q^{\prime}italic_Q start_POSTSUPERSCRIPT caligraphic_B end_POSTSUPERSCRIPT = italic_Q × italic_Q start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT as states and (qI,qI)subscript𝑞𝐼superscriptsubscript𝑞𝐼(q_{I},q_{I}^{\prime})( italic_q start_POSTSUBSCRIPT italic_I end_POSTSUBSCRIPT , italic_q start_POSTSUBSCRIPT italic_I end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) is the initial one. Transition function is defined in two steps. First take:

Rδ0(q,q)𝑅superscriptsubscript𝛿0𝑞superscript𝑞\displaystyle R\in\delta_{0}^{\mathcal{B}}(q,q^{\prime})italic_R ∈ italic_δ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT caligraphic_B end_POSTSUPERSCRIPT ( italic_q , italic_q start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT )
iff\displaystyle\iff
There is cΣ𝑐Σc\in\Sigmaitalic_c ∈ roman_Σ, such that the left projection of R𝑅Ritalic_R
belongs to δ(q,c) and the right one to δ(q,c).belongs to δ(q,c) and the right one to δ(q,c)\displaystyle\text{belongs to $\delta(q,c)$ and the right one to $\delta^{% \prime}(q^{\prime},c)$}.belongs to italic_δ ( italic_q , italic_c ) and the right one to italic_δ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ( italic_q start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_c ) .

This describes guessing a coloring on the full 𝒦𝒦\mathcal{K}caligraphic_K and runs of both 𝒜𝒜\mathcal{A}caligraphic_A and 𝒜superscript𝒜\mathcal{A}^{\prime}caligraphic_A start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT consistent with that coloring. To handle points with less than k𝑘kitalic_k children we put:

Rδ(q,q)𝑅superscript𝛿𝑞superscript𝑞\displaystyle R\in\delta^{\mathcal{B}}(q,q^{\prime})italic_R ∈ italic_δ start_POSTSUPERSCRIPT caligraphic_B end_POSTSUPERSCRIPT ( italic_q , italic_q start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT )
iff\displaystyle\iff
R𝑅Ritalic_R can be obtained from some Rδ0(q,q)superscript𝑅superscriptsubscript𝛿0𝑞superscript𝑞R^{\prime}\in\delta_{0}^{\mathcal{B}}(q,q^{\prime})italic_R start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ∈ italic_δ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT caligraphic_B end_POSTSUPERSCRIPT ( italic_q , italic_q start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT )
by removing some consistent pairs.

Here consistency of a pair of states (p,p)𝑝superscript𝑝(p,p^{\prime})( italic_p , italic_p start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) means that there exists a model accepted by both 𝒜[qIp]𝒜delimited-[]subscript𝑞𝐼𝑝\mathcal{A}[q_{I}\mapsfrom p]caligraphic_A [ italic_q start_POSTSUBSCRIPT italic_I end_POSTSUBSCRIPT ↤ italic_p ] and 𝒜[qIp]superscript𝒜delimited-[]superscriptsubscript𝑞𝐼superscript𝑝\mathcal{A}^{\prime}[q_{I}^{\prime}\mapsfrom p^{\prime}]caligraphic_A start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT [ italic_q start_POSTSUBSCRIPT italic_I end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ↤ italic_p start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ]. The (trivial) rank function of \mathcal{B}caligraphic_B assigns a bad rank to every state so that only finite trees are accepted.

To finish the proof let us first prove Proposition 2.

Proof.

Put l=|Q|×|Q|+1𝑙𝑄superscript𝑄1l=|Q|\times|Q^{\prime}|+1italic_l = | italic_Q | × | italic_Q start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT | + 1. The proposition follows directly from (3) and:

𝖲𝖤𝖯𝖲𝖤𝖯\mathsf{SEP}sansserif_SEP is nonempty iff\iff it contains the l𝑙litalic_l-prefix K|lK_{|_{l}}italic_K start_POSTSUBSCRIPT | start_POSTSUBSCRIPT italic_l end_POSTSUBSCRIPT end_POSTSUBSCRIPT of K𝐾Kitalic_K. (4)

Only the left-to-right implication is nontrivial and we prove it by contrapositive. If K|lK_{|_{l}}italic_K start_POSTSUBSCRIPT | start_POSTSUBSCRIPT italic_l end_POSTSUBSCRIPT end_POSTSUBSCRIPT does not belong to 𝖲𝖤𝖯𝖲𝖤𝖯\mathsf{SEP}sansserif_SEP then this is witnessed by a run ρ𝜌\rhoitalic_ρ of \mathcal{B}caligraphic_B on K|lK_{|_{l}}italic_K start_POSTSUBSCRIPT | start_POSTSUBSCRIPT italic_l end_POSTSUBSCRIPT end_POSTSUBSCRIPT. Since every leaf v𝑣vitalic_v of K|lK_{|_{l}}italic_K start_POSTSUBSCRIPT | start_POSTSUBSCRIPT italic_l end_POSTSUBSCRIPT end_POSTSUBSCRIPT is at depth greater then |Q|superscript𝑄|Q^{\mathcal{B}}|| italic_Q start_POSTSUPERSCRIPT caligraphic_B end_POSTSUPERSCRIPT |, on the path from the root to v𝑣vitalic_v some state must repeat. Hence, the run can be pumped to finite prefixes of K𝐾Kitalic_K of arbitrarily big tallness. It follows that 𝖲𝖤𝖯𝖲𝖤𝖯\mathsf{SEP}sansserif_SEP is empty. This completes the proof of Proposition 2. ∎

The language of \mathcal{B}caligraphic_B is closed under taking submodels. Hence, the right hand side of (4) is equivalent to \mathcal{B}caligraphic_B not accepting any tree of tallness at least l𝑙litalic_l (l𝑙litalic_l is the exponential bound given by Proposition 2). This can be easily checked in time polynomial in the size of \mathcal{B}caligraphic_B.

Appendix D Separability over Words is in PSpace

We prove the upper bounds in Theorem 4. We proceed with the same argument as in the case with general models, with two major simplifications. First, over words bisimilarity and identity coincide. Thus the equivalence of items (ii)(iii) and (iv) of Proposition 1 becomes trivial in the word case. Second, bounded tallness (which over words is the same as finiteness) of the language of the appropriate word automaton \mathcal{B}caligraphic_B can be checked more easily. Instead of writing down the automaton we construct it on the fly, and only remember a single state at every moment. This yields a PSpace (and not ExpTime) decision procedure.

Appendix E Lower Bounds over All Models

We construct formulae of μ𝖬𝖫𝜇𝖬𝖫\mu\mathsf{ML}italic_μ sansserif_ML for which there exist equivalent formulae in 𝖬𝖫𝖬𝖫\mathsf{ML}sansserif_ML but only doubly exponentially larger.

Consider the nonstandard modal operator rdelimited-⟨⟩superscript𝑟\langle\exists^{r}\rangle⟨ ∃ start_POSTSUPERSCRIPT italic_r end_POSTSUPERSCRIPT ⟩ which means “there exists a point reachable from the root and satisfying”. It is straightforward to encode such modality (and its dual [r]delimited-[]superscript𝑟[\exists^{r}][ ∃ start_POSTSUPERSCRIPT italic_r end_POSTSUPERSCRIPT ] meaning “for every reachable point”) by a μ𝖬𝖫𝜇𝖬𝖫\mu\mathsf{ML}italic_μ sansserif_ML formula of constant size. It therefore suffices to define the sequence (φn)nsubscriptsubscript𝜑𝑛𝑛(\varphi_{n})_{n\in\mathbb{N}}( italic_φ start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT ) start_POSTSUBSCRIPT italic_n ∈ blackboard_N end_POSTSUBSCRIPT in the extension 𝖬𝖫+r𝖬𝖫superscript𝑟\mathsf{ML}+\exists^{r}sansserif_ML + ∃ start_POSTSUPERSCRIPT italic_r end_POSTSUPERSCRIPT of 𝖬𝖫𝖬𝖫\mathsf{ML}sansserif_ML with such operators.

We construct φnsubscript𝜑𝑛\varphi_{n}italic_φ start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT. For convenience assume n+1𝑛1n+1italic_n + 1 atomic propositions τ0,,τnsubscript𝜏0subscript𝜏𝑛\tau_{0},...,\tau_{n}italic_τ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT , … , italic_τ start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT (otherwise we encode these without a significant blowup in the size of φnsubscript𝜑𝑛\varphi_{n}italic_φ start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT). The propositions can be seen as an encoding of an (n+1)𝑛1(n+1)( italic_n + 1 )-bit binary counter. Consider the following property Cnsubscript𝐶𝑛C_{n}italic_C start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT of models: the color of the root encodes counter value 00 and for every (reachable) point v𝑣vitalic_v if the color of v𝑣vitalic_v encodes number i𝑖iitalic_i then either i=2n𝑖superscript2𝑛i=2^{n}italic_i = 2 start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT and v𝑣vitalic_v has no children or all v𝑣vitalic_v’s children have a color that encodes i+1𝑖1i+1italic_i + 1. The property Cnsubscript𝐶𝑛C_{n}italic_C start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT can be easily expressed by a 𝖬𝖫+r𝖬𝖫superscript𝑟\mathsf{ML}+\exists^{r}sansserif_ML + ∃ start_POSTSUPERSCRIPT italic_r end_POSTSUPERSCRIPT formula φnsubscript𝜑𝑛\varphi_{n}italic_φ start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT of size polynomial in n𝑛nitalic_n. Such φnsubscript𝜑𝑛\varphi_{n}italic_φ start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT is equivalent to a modal formula. This follows immediately from Proposition 1 since property Cnsubscript𝐶𝑛C_{n}italic_C start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT implies that there are no paths longer than 2nsuperscript2𝑛2^{n}2 start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT.

However, no 𝖬𝖫𝖬𝖫\mathsf{ML}sansserif_ML formula smaller than 22nsuperscript2superscript2𝑛2^{2^{n}}2 start_POSTSUPERSCRIPT 2 start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT end_POSTSUPERSCRIPT can be equivalent to φnsubscript𝜑𝑛\varphi_{n}italic_φ start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT. To see this assume ψ𝖬𝖫𝜓𝖬𝖫\psi\in\mathsf{ML}italic_ψ ∈ sansserif_ML equivalent to φnsubscript𝜑𝑛\varphi_{n}italic_φ start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT. For every sequence of colors 𝖺1,𝖺2nsubscript𝖺1subscript𝖺superscript2𝑛\mathsf{a}_{1},...\mathsf{a}_{2^{n}}sansserif_a start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , … sansserif_a start_POSTSUBSCRIPT 2 start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT end_POSTSUBSCRIPT of length 2nsuperscript2𝑛2^{n}2 start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT, ψ𝜓\psiitalic_ψ must contain a sequence ξ1,,ξ2nsubscript𝜉1subscript𝜉superscript2𝑛\xi_{1},...,\xi_{2^{n}}italic_ξ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , … , italic_ξ start_POSTSUBSCRIPT 2 start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT end_POSTSUBSCRIPT of subformulae such that:

  • ξisubscript𝜉𝑖\xi_{i}italic_ξ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT is a strict subformula of ξjsubscript𝜉𝑗\xi_{j}italic_ξ start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT whenever i>j𝑖𝑗i>jitalic_i > italic_j;

  • each ξisubscript𝜉𝑖\xi_{i}italic_ξ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT begins with 𝖺idelimited-⟨⟩subscript𝖺𝑖\langle\mathsf{a}_{i}\rangle⟨ sansserif_a start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ⟩ or [𝖺i]delimited-[]subscript𝖺𝑖[\mathsf{a}_{i}][ sansserif_a start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ];

  • the least strict superformula of ξisubscript𝜉𝑖\xi_{i}italic_ξ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT beginning with a modal operator is ξi+1subscript𝜉𝑖1\xi_{i+1}italic_ξ start_POSTSUBSCRIPT italic_i + 1 end_POSTSUBSCRIPT.

If ψ𝜓\psiitalic_ψ did not contain such sequence ξ1,,ξ2nsubscript𝜉1subscript𝜉superscript2𝑛\xi_{1},...,\xi_{2^{n}}italic_ξ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , … , italic_ξ start_POSTSUBSCRIPT 2 start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT end_POSTSUBSCRIPT, it would be indifferent to what happens in points only reachable via paths labelled with 𝖺1,𝖺2nsubscript𝖺1subscript𝖺superscript2𝑛\mathsf{a}_{1},...\mathsf{a}_{2^{n}}sansserif_a start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , … sansserif_a start_POSTSUBSCRIPT 2 start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT end_POSTSUBSCRIPT. It follows that there is an embedding of the 2nsuperscript2𝑛2^{n}2 start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT-prefix of the full binary tree to the syntax tree of ψ𝜓\psiitalic_ψ and so it has size at least 22nsuperscript2superscript2𝑛2^{2^{n}}2 start_POSTSUPERSCRIPT 2 start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT end_POSTSUPERSCRIPT.

Note that the use of multiple atomic propositions in the above construction is only for convenience. With any reasonable encoding the proof could be adapted even to the setting with 𝖯𝗋𝗈𝗉=𝖯𝗋𝗈𝗉\mathsf{Prop}=\emptysetsansserif_Prop = ∅.

Appendix F Optimal Separators over Words

We prove Lemma 3. Recall that 𝒜=(Q,Σ,δ,qI,𝗋𝖺𝗇𝗄)𝒜𝑄Σ𝛿subscript𝑞𝐼𝗋𝖺𝗇𝗄\mathcal{A}=(Q,\Sigma,\delta,q_{I},\mathsf{rank})caligraphic_A = ( italic_Q , roman_Σ , italic_δ , italic_q start_POSTSUBSCRIPT italic_I end_POSTSUBSCRIPT , sansserif_rank ) is the automaton under consideration. Since we are working over words, δ(q,c)𝛿𝑞𝑐\delta(q,c)italic_δ ( italic_q , italic_c ) contains only singleton sets and possibly the emtpy set. The latter case, δ(q,c)𝛿𝑞𝑐\emptyset\in\delta(q,c)∅ ∈ italic_δ ( italic_q , italic_c ), is of particular interest because this means that the automaton in state q𝑞qitalic_q reading color c𝑐citalic_c can “accept” even if it has not finished reading the input; in particular, the automaton can accept finite words as well. Denote with AccqsubscriptAcc𝑞\textit{Acc}_{q}Acc start_POSTSUBSCRIPT italic_q end_POSTSUBSCRIPT the set of colors c𝑐citalic_c with δ(q,c)𝛿𝑞𝑐\emptyset\in\delta(q,c)∅ ∈ italic_δ ( italic_q , italic_c ). Further denote with ContqsubscriptCont𝑞\textit{Cont}_{q}Cont start_POSTSUBSCRIPT italic_q end_POSTSUBSCRIPT the set of all c𝑐citalic_c such that 𝒜[qIq]𝒜delimited-[]subscript𝑞𝐼𝑞\mathcal{A}[q_{I}\mapsfrom q]caligraphic_A [ italic_q start_POSTSUBSCRIPT italic_I end_POSTSUBSCRIPT ↤ italic_q ] accepts a word starting with color c𝑐citalic_c.

We first construct 𝖬𝖫𝖬𝖫\mathsf{ML}sansserif_ML formulae ψpqmsubscriptsuperscript𝜓𝑚𝑝𝑞\psi^{m}_{pq}italic_ψ start_POSTSUPERSCRIPT italic_m end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_p italic_q end_POSTSUBSCRIPT, for m,p,qQformulae-sequence𝑚𝑝𝑞𝑄m\in\mathbb{N},p,q\in Qitalic_m ∈ blackboard_N , italic_p , italic_q ∈ italic_Q such that for every word \mathcal{M}caligraphic_M:

  • ()(\ast)( ∗ )

    ψpqmmodelssubscriptsuperscript𝜓𝑚𝑝𝑞\mathcal{M}\models\psi^{m}_{pq}caligraphic_M ⊧ italic_ψ start_POSTSUPERSCRIPT italic_m end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_p italic_q end_POSTSUBSCRIPT iff there is a run of 𝒜𝒜\mathcal{A}caligraphic_A from p𝑝pitalic_p to q𝑞qitalic_q on the m𝑚mitalic_m-prefix of \mathcal{M}caligraphic_M.

The definition is by induction on m𝑚mitalic_m:

ψpq0subscriptsuperscript𝜓0𝑝𝑞\displaystyle\psi^{0}_{pq}italic_ψ start_POSTSUPERSCRIPT 0 end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_p italic_q end_POSTSUBSCRIPT = if pq then  else absent if pq then  else \displaystyle=\text{ if $p\neq q$ then $\bot$ else $\top$}= if italic_p ≠ italic_q then ⊥ else ⊤
ψpq1subscriptsuperscript𝜓1𝑝𝑞\displaystyle\psi^{1}_{pq}italic_ψ start_POSTSUPERSCRIPT 1 end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_p italic_q end_POSTSUBSCRIPT ={ccΣ,{q}δ(p,c)}absentconditional-set𝑐formulae-sequence𝑐Σ𝑞𝛿𝑝𝑐\displaystyle=\bigvee\{c\ \mid\ c\in\Sigma,\{q\}\in\delta(p,c)\}= ⋁ { italic_c ∣ italic_c ∈ roman_Σ , { italic_q } ∈ italic_δ ( italic_p , italic_c ) }
ψpqmsubscriptsuperscript𝜓𝑚𝑝𝑞\displaystyle\psi^{m}_{pq}italic_ψ start_POSTSUPERSCRIPT italic_m end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_p italic_q end_POSTSUBSCRIPT =qQψpqm/2ψqqm/2absentsubscriptsuperscript𝑞𝑄subscriptsuperscript𝜓𝑚2𝑝superscript𝑞subscriptsuperscript𝜓𝑚2superscript𝑞𝑞\displaystyle=\bigvee_{q^{\prime}\in Q}\psi^{\lfloor m/2\rfloor}_{pq^{\prime}}% \wedge\psi^{\lceil m/2\rceil}_{q^{\prime}q}= ⋁ start_POSTSUBSCRIPT italic_q start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ∈ italic_Q end_POSTSUBSCRIPT italic_ψ start_POSTSUPERSCRIPT ⌊ italic_m / 2 ⌋ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_p italic_q start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_POSTSUBSCRIPT ∧ italic_ψ start_POSTSUPERSCRIPT ⌈ italic_m / 2 ⌉ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_q start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT italic_q end_POSTSUBSCRIPT

It is routine to verify that ψpqmsubscriptsuperscript𝜓𝑚𝑝𝑞\psi^{m}_{pq}italic_ψ start_POSTSUPERSCRIPT italic_m end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_p italic_q end_POSTSUBSCRIPT satisfies ()(\ast)( ∗ ) and is of size |ψpqm|O(|Q|m2)subscriptsuperscript𝜓𝑚𝑝𝑞𝑂𝑄superscript𝑚2|\psi^{m}_{pq}|\in O(|Q|\cdot m^{2})| italic_ψ start_POSTSUPERSCRIPT italic_m end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_p italic_q end_POSTSUBSCRIPT | ∈ italic_O ( | italic_Q | ⋅ italic_m start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ).

We finish the construction by setting:

ψn=qQ(ψqIqnncContqc)mnqQ(ψqIqmm+1mcAccqc).\psi_{n}=\bigvee_{q\in Q}\left(\psi^{n}_{q_{I}q}\wedge\Box^{n}\bigvee_{c\in% \textit{Cont}_{q}}c\right)\vee\bigvee_{m\leq n}\bigvee_{q\in Q}\left(\psi^{m}_% {q_{I}q}\wedge\Box^{m+1}\bot\wedge\Box^{m}\bigvee_{c\in\textit{Acc}_{q}}c% \right).italic_ψ start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT = ⋁ start_POSTSUBSCRIPT italic_q ∈ italic_Q end_POSTSUBSCRIPT ( italic_ψ start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_q start_POSTSUBSCRIPT italic_I end_POSTSUBSCRIPT italic_q end_POSTSUBSCRIPT ∧ □ start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT ⋁ start_POSTSUBSCRIPT italic_c ∈ Cont start_POSTSUBSCRIPT italic_q end_POSTSUBSCRIPT end_POSTSUBSCRIPT italic_c ) ∨ ⋁ start_POSTSUBSCRIPT italic_m ≤ italic_n end_POSTSUBSCRIPT ⋁ start_POSTSUBSCRIPT italic_q ∈ italic_Q end_POSTSUBSCRIPT ( italic_ψ start_POSTSUPERSCRIPT italic_m end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_q start_POSTSUBSCRIPT italic_I end_POSTSUBSCRIPT italic_q end_POSTSUBSCRIPT ∧ □ start_POSTSUPERSCRIPT italic_m + 1 end_POSTSUPERSCRIPT ⊥ ∧ □ start_POSTSUPERSCRIPT italic_m end_POSTSUPERSCRIPT ⋁ start_POSTSUBSCRIPT italic_c ∈ Acc start_POSTSUBSCRIPT italic_q end_POSTSUBSCRIPT end_POSTSUBSCRIPT italic_c ) .

It is readily checked that ψnsubscript𝜓𝑛\psi_{n}italic_ψ start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT satisfies the required size bounds. To verify that ψnθmodelssubscript𝜓𝑛𝜃\psi_{n}\models\thetaitalic_ψ start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT ⊧ italic_θ for every θ𝖬𝖫n𝜃superscript𝖬𝖫𝑛\theta\in\mathsf{ML}^{n}italic_θ ∈ sansserif_ML start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT with 𝒜θmodels𝒜𝜃\mathcal{A}\models\thetacaligraphic_A ⊧ italic_θ, we show the following equivalence for all words \mathcal{M}caligraphic_M:

ψn  there exists 𝒩𝒜 with 𝒩n.modelssubscript𝜓𝑛  there exists 𝒩𝒜 with 𝒩n.\displaystyle\mathcal{M}\models\psi_{n}\text{\hskip 14.22636pt $\iff$ \hskip 1% 4.22636pt there exists $\mathcal{N}\models\mathcal{A}$ with $\mathcal{N}\leftrightarroweq^{n}\mathcal% {M}$.}caligraphic_M ⊧ italic_ψ start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT ⇔ there exists caligraphic_N ⊧ caligraphic_A with caligraphic_N start_ARROW start_RELOP SUPERSCRIPTOP start_ARG - end_ARG start_ARG ↔ end_ARG end_RELOP end_ARROW start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT caligraphic_M . (5)

For \Rightarrow, let \mathcal{M}caligraphic_M be a word structure with ψnmodelssubscript𝜓𝑛\mathcal{M}\models\psi_{n}caligraphic_M ⊧ italic_ψ start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT. If ψqIqnncContqcmodelssubscriptsuperscript𝜓𝑛subscript𝑞𝐼𝑞superscript𝑛subscript𝑐subscriptCont𝑞𝑐\mathcal{M}\models\psi^{n}_{q_{I}q}\wedge\Box^{n}\bigvee_{c\in\textit{Cont}_{q% }}ccaligraphic_M ⊧ italic_ψ start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_q start_POSTSUBSCRIPT italic_I end_POSTSUBSCRIPT italic_q end_POSTSUBSCRIPT ∧ □ start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT ⋁ start_POSTSUBSCRIPT italic_c ∈ Cont start_POSTSUBSCRIPT italic_q end_POSTSUBSCRIPT end_POSTSUBSCRIPT italic_c for some q𝑞qitalic_q, then by ()(\ast)( ∗ ), there is a run of 𝒜𝒜\mathcal{A}caligraphic_A from the initial state q0subscript𝑞0q_{0}italic_q start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT to some state qQ𝑞𝑄q\in Qitalic_q ∈ italic_Q when reading the n𝑛nitalic_n-prefix of \mathcal{M}caligraphic_M, and the last color in the prefix is c𝑐citalic_c. Since cContq𝑐subscriptCont𝑞c\in\textit{Cont}_{q}italic_c ∈ Cont start_POSTSUBSCRIPT italic_q end_POSTSUBSCRIPT, we can extend the n𝑛nitalic_n-prefix of \mathcal{M}caligraphic_M to a word 𝒩𝒩\mathcal{N}caligraphic_N accepted by 𝒜𝒜\mathcal{A}caligraphic_A. If ψqIqmm+1cAccqc\mathcal{M}\models\psi^{m}_{q_{I}q}\wedge\Box^{m+1}\bot\wedge\bigvee_{c\in% \textit{Acc}_{q}}ccaligraphic_M ⊧ italic_ψ start_POSTSUPERSCRIPT italic_m end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_q start_POSTSUBSCRIPT italic_I end_POSTSUBSCRIPT italic_q end_POSTSUBSCRIPT ∧ □ start_POSTSUPERSCRIPT italic_m + 1 end_POSTSUPERSCRIPT ⊥ ∧ ⋁ start_POSTSUBSCRIPT italic_c ∈ Acc start_POSTSUBSCRIPT italic_q end_POSTSUBSCRIPT end_POSTSUBSCRIPT italic_c, for some mn𝑚𝑛m\leq nitalic_m ≤ italic_n and qQ𝑞𝑄q\in Qitalic_q ∈ italic_Q, then \mathcal{M}caligraphic_M is a finite word of depth m𝑚mitalic_m that is accepted by the automaton. We can take 𝒩=𝒩\mathcal{N}=\mathcal{M}caligraphic_N = caligraphic_M in this case.

For \Leftarrow, let \mathcal{M}caligraphic_M be a word such that there is some 𝒩φmodels𝒩𝜑\mathcal{N}\models\varphicaligraphic_N ⊧ italic_φ with 𝒩nsuperscriptleftrightarroweq𝑛𝒩\mathcal{N}\leftrightarroweq^{n}\mathcal{M}caligraphic_N start_ARROW start_RELOP SUPERSCRIPTOP start_ARG - end_ARG start_ARG ↔ end_ARG end_RELOP end_ARROW start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT caligraphic_M. The former condition implies that 𝒩𝒜models𝒩𝒜\mathcal{N}\models\mathcal{A}caligraphic_N ⊧ caligraphic_A and thus there is an accepting run ρ𝜌\rhoitalic_ρ of 𝒜𝒜\mathcal{A}caligraphic_A on 𝒩𝒩\mathcal{N}caligraphic_N, and the latter implies that 𝒩𝒩\mathcal{N}caligraphic_N and \mathcal{M}caligraphic_M coincide on their n𝑛nitalic_n-prefixes. We distinguish cases. If the depth of 𝒩𝒩\mathcal{N}caligraphic_N is greater than n𝑛nitalic_n, then the n𝑛nitalic_n-prefix of ρ𝜌\rhoitalic_ρ ending in state q𝑞qitalic_q witnesses ψqIqnncContqcmodelssubscriptsuperscript𝜓𝑛subscript𝑞𝐼𝑞superscript𝑛subscript𝑐subscriptCont𝑞𝑐\mathcal{M}\models\psi^{n}_{q_{I}q}\wedge\Box^{n}\bigvee_{c\in\textit{Cont}_{q% }}ccaligraphic_M ⊧ italic_ψ start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_q start_POSTSUBSCRIPT italic_I end_POSTSUBSCRIPT italic_q end_POSTSUBSCRIPT ∧ □ start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT ⋁ start_POSTSUBSCRIPT italic_c ∈ Cont start_POSTSUBSCRIPT italic_q end_POSTSUBSCRIPT end_POSTSUBSCRIPT italic_c. Otherwise, the depth of 𝒩𝒩\mathcal{N}caligraphic_N is mn𝑚𝑛m\leq nitalic_m ≤ italic_n and the run ρ𝜌\rhoitalic_ρ ending in q𝑞qitalic_q witnesses that ψqIqmm+1cAccqc\mathcal{M}\models\psi^{m}_{q_{I}q}\wedge\Box^{m+1}\bot\wedge\bigvee_{c\in% \textit{Acc}_{q}}ccaligraphic_M ⊧ italic_ψ start_POSTSUPERSCRIPT italic_m end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_q start_POSTSUBSCRIPT italic_I end_POSTSUBSCRIPT italic_q end_POSTSUBSCRIPT ∧ □ start_POSTSUPERSCRIPT italic_m + 1 end_POSTSUPERSCRIPT ⊥ ∧ ⋁ start_POSTSUBSCRIPT italic_c ∈ Acc start_POSTSUBSCRIPT italic_q end_POSTSUBSCRIPT end_POSTSUBSCRIPT italic_c.