
Radboud University, Nijmegen, the Netherlands
{loes.kruger,sebastian.junges,jurriaan.rot}@ru.nl

State Matching and Multiple References in Adaptive Active Automata Learning

This research is partially supported by the NWO grant No. VI.Vidi.223.096.

Loes Kruger (✉), Sebastian Junges, Jurriaan Rot
Abstract

Active automata learning (AAL) is a method to infer state machines by interacting with black-box systems. Adaptive AAL aims to reduce the sample complexity of AAL by incorporating domain-specific knowledge in the form of (similar) reference models. Such reference models appear naturally when learning multiple versions or variants of a software system. In this paper, we present state matching, which allows flexible use of the structure of these reference models by the learner. State matching is the main ingredient of adaptive $L^{\#}$, a novel framework for adaptive learning, built on top of $L^{\#}$. Our empirical evaluation shows that adaptive $L^{\#}$ improves the state of the art by up to two orders of magnitude.

1 Introduction

Automata learning aims to extract state machines from observed input-output sequences of some system-under-learning (SUL). Active automata learning (AAL) assumes that one has black-box access to this SUL, allowing the learner to incrementally choose inputs and observe the outputs. The models learned by AAL can be used as a documentation effort, but are more typically used as a basis for testing, verification, conformance checking, and fingerprinting; see [22, 11] for an overview of applications. The classical algorithm for AAL is $L^{*}$, introduced by Angluin [2]; state-of-the-art algorithms are, e.g., $L^{\#}$ [23] and TTT [13], which are available in toolboxes such as LearnLib [14] and AALpy [17].

The primary challenge in AAL is to reduce the number of inputs sent to the SUL, referred to as the sample complexity. To learn a 31-state machine with 22 inputs, state-of-the-art learners may send several million inputs to the SUL [23]. This is not necessarily unexpected: the underlying space of 31-state state machines is huge and it is nontrivial how to maximise information gain. The literature has investigated several approaches to accelerate learners, see the overview of [22]. Nevertheless, scalability remains a core challenge for AAL.

We study adaptive AAL [10], which aims to improve the sample efficiency by utilizing expert knowledge already given to the learner. In (regular) AAL, a learner commonly starts learning from scratch. In adaptive AAL, however, the learner is given a reference model, which ought to be similar to the SUL. Reference models occur naturally in many applications of AAL. For instance: (1) Systems evolve over time due to, e.g., bug fixes or new functionalities, and we may have learned the previous system; (2) Standard protocols may be implemented by a variety of tools; (3) The SUL may be a variant of other systems, e.g., being the same system executing in another environment, or a system configured differently.

Several algorithms for adaptive AAL have been proposed [10, 24, 5, 6, 9]. Intuitively, the idea is that these methods try to rebuild the part of the SUL which is similar to the reference model. This is achieved by deriving suitable queries from the reference model, using so-called access sequences to reach states, and so-called separating sequences to distinguish these from other states. These algorithms rely on a rather strict notion of similarity that depends on the way we reach these states. In particular, existing rebuilding algorithms cannot effectively learn an SUL from a reference model that has a different initial state, see Sec. 2.

We propose an approach to adaptive AAL based on state matching, which allows flexibly identifying parts of the unknown SUL where the reference model may be an informative guide. More specifically, in this approach, we match states in the model that we have learned so far (captured as a tree-shaped automaton) with states in the reference model such that the outputs agree on all enabled input sequences. This matching allows for targeted re-use of separating sequences from the reference model and is independent of the access sequences. We refine the approach by using approximate state matching, where we match a current state with one from the reference model that agrees on most inputs.

Approximate state matching is the essential ingredient for the novel $AL^{\#}$ algorithm. This algorithm is a conservative extension of the recent $L^{\#}$ [23]. Along with approximate state matching, $AL^{\#}$ includes rebuilding steps, which are similar to existing methods, but tightly integrated in $L^{\#}$. Finally, $AL^{\#}$ is the first approach with dedicated support to use more than one reference model.

Contributions. We make the following contributions to the state of the art in adaptive AAL. First, we present state matching and its generalization to approximate state matching, which allows flexible re-use of separating sequences from the reference model. Second, we include state matching and rebuilding in a unifying approach, called $AL^{\#}$, which generalizes the $L^{\#}$ algorithm for non-adaptive automata learning. We analyse the resulting framework in terms of termination and complexity. This framework naturally supports using multiple reference models as well as removing and adding inputs to the alphabet. Our empirical results show the efficacy of $AL^{\#}$. In particular, $AL^{\#}$ may reduce the number of inputs to the SUL by two orders of magnitude.

Related work. Adaptive AAL goes back to [10]. That paper, and many of the follow-up approaches [5, 4, 6, 9], re-use access sequences and separating sequences from the reference model (or from the data structures constructed when learning that model). The recent approach in [6] removes redundant access sequences during rebuilding and continues learning with informative separating sequences. In [24], an $L^{*}$-based adaptive AAL approach is proposed where the algorithm starts by including all separating sequences that arise when learning the reference model with $L^{*}$, ignoring access sequences. This algorithm is used in [12] for a general study of the usefulness of adaptive AAL: among others, the authors suggest using more advanced data structures than the observation tables in $L^{*}$. Indeed, in [4] the internal data structure of the TTT algorithm [13] is used in the context of lifelong learning; the precise rebuilding approach is not described. The recent [9] proposes an adaptive AAL method based on discrimination trees as used in the Kearns-Vazirani algorithm [15]. We consider the algorithms proposed in [6, 9] the state of the art and experimentally compare $AL^{\#}$ against them in Sec. 8.

2 Overview

Figure 1: An SUL $\mathcal{S}$ and three reference models $\mathcal{R}_1$, $\mathcal{R}_2$ and $\mathcal{R}_3$. Panels: (a) $\mathcal{S}$, (b) $\mathcal{R}_1$, (c) $\mathcal{R}_2$, (d) $\mathcal{R}_3$.

We illustrate (1) how adaptive AAL uses a reference model to help learn a system and (2) how this may reduce the sample complexity of the learner.

MAT framework. We recall the standard setting for AAL: Angluin’s MAT framework, cf. [22, 11]. Here, the learner has no direct access to the SUL, but may ask output queries (OQs): these return, for a given input sequence, the sequence of outputs from the SUL; and equivalence queries (EQs): these take a Mealy machine $\mathcal{H}$ as input, and return whether or not $\mathcal{H}$ is equivalent to the SUL. In case it is not, a counterexample is provided in the form of a sequence of inputs for which $\mathcal{H}$ and the SUL return different outputs. EQs are expensive [25, 3, 8, 21]; therefore, we aim to learn the SUL using primarily OQs.
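To make the cost model concrete, the following minimal Python sketch simulates output queries against a toy SUL and counts the inputs sent, which is the quantity referred to as sample complexity. The machine `TOY_SUL`, the class `OutputOracle` and its method names are illustrative assumptions; they come neither from an AAL library nor from Fig. 1.

```python
from typing import Dict, List, Tuple

# Hypothetical toy SUL: (state, input) -> (next state, output).
# It is NOT one of the machines shown in Fig. 1.
TOY_SUL: Dict[Tuple[int, str], Tuple[int, str]] = {
    (0, "a"): (1, "x"), (0, "b"): (0, "y"),
    (1, "a"): (0, "y"), (1, "b"): (1, "x"),
}


class OutputOracle:
    """Answers output queries (OQs) and counts the inputs sent to the SUL."""

    def __init__(self, transitions: Dict[Tuple[int, str], Tuple[int, str]], initial: int = 0):
        self.transitions = transitions
        self.initial = initial
        self.inputs_sent = 0

    def output_query(self, word: str) -> List[str]:
        # Every OQ is executed from the initial state of the SUL.
        state, outputs = self.initial, []
        for symbol in word:
            state, out = self.transitions[(state, symbol)]
            outputs.append(out)
        self.inputs_sent += len(word)
        return outputs


oracle = OutputOracle(TOY_SUL)
print(oracle.output_query("aab"))  # ['x', 'y', 'y']
print(oracle.inputs_sent)          # 3
```

An equivalence query is deliberately left out of the sketch: in practice it is usually approximated by conformance testing, which is why the text treats it as the expensive operation.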

Apartness. Learning algorithms in the MAT framework typically assume that two states are equivalent as long as their known residual languages are equivalent. To discover a new state, we must therefore (1) access it by an input sequence and (2) prove this state distinct (apart) from the other states that we already know. Consider the SUL $\mathcal{S}$ in Fig. 1(a). The access sequences $c$, $ca$ access $q_4$ and $q_5$, respectively, from the initial state. These states are different because the response to executing $c$ from $q_4$ and $q_5$ is distinct: we say $c$ is a separating sequence for $q_4$ and $q_5$. This difference can be observed by posing OQs for $cc$ and $cac$, consisting of the access sequences for $q_4$ and $q_5$ followed by their separating sequence $c$.

Aim. The aim of adaptive AAL is to learn SULs with fewer inputs, using knowledge in the form of a reference model, known to the learner and preferably similar to the SUL. The discovery of states is accelerated by extracting candidates for both (1) access sequences and (2) separating sequences from the reference model.

Rebuilding. The state of the art in adaptive AAL uses access sequences and separating sequences from the reference model [6, 9] in an initial phase. Consider the Mealy machine $\mathcal{R}_1$ in Fig. 1(b) as a reference model for the SUL $\mathcal{S}$ in Fig. 1(a). The sequences $\varepsilon$, $c$, $ca$ can be used to access all orange states in both $\mathcal{S}$ and $\mathcal{R}_1$. The separating sequences $c$ and $ac$ for these states in $\mathcal{R}_1$ also separate the orange states in $\mathcal{S}$. By asking OQs combining the access sequences and separating sequences, we discover all orange states for $\mathcal{S}$.

Limits of rebuilding. However, these rebuilding approaches have limitations. Consider $\mathcal{R}_2$ in Fig. 1(c). The sequences $\varepsilon$, $b$, $bb$ and $bbb$ can be used to access all states in $\mathcal{R}_2$. Concatenating these with any separating sequences from $\mathcal{R}_2$ will not be helpful to learn SUL $\mathcal{S}$, because in $\mathcal{S}$ these sequences all access $q_0$. However, the separating sequences from $\mathcal{R}_2$ are useful if executed in the right state of $\mathcal{S}$. For instance, the sequence $bb$ separates all states in $\mathcal{R}_2$, and the blue states in $\mathcal{S}$. Thus, rebuilding does not realise the potential of reusing the separating sequences from $\mathcal{R}_2$, since the access sequences for the relevant states are different.

State Matching. We extend adaptive AAL with state matching. State matching overcomes the strong dependency on the access sequences and allows the efficient usage of reference models where the residual languages of the individual states are similar. Suppose that while learning, we have not yet separated $q_0$ and $q_1$ in $\mathcal{S}$, but we do know the output of the $b$-transition from $q_0$. We may use that output to match $q_0$ with $p_3$ in $\mathcal{R}_2$: these two states agree on input sequences where both are defined. Subsequently, we can use the separating sequence $bb$ between $p_3$ and $p_0$ to separate $q_0$ and $q_1$, through OQs $bb$ and $abb$.

Approximate State Matching. It rarely happens that states in the SUL exactly match states in the reference model: consider the scenario where we want to learn $\mathcal{S}$ with reference model $\mathcal{R}_3$ from Fig. 1(d). States $q_0$ and $s_3$ do not match because they have different outputs for input $b$, but they are still similar. This motivates an approximate version of matching, where a state is matched to the reference state which maximises the number of inputs with the same output.
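As a rough illustration of this idea, the sketch below scores each reference state by the number of inputs on which its output agrees with the outputs observed so far for a state of the SUL, and picks the best-scoring one. This one-step score is a simplifying assumption made for illustration only. The names `agreement` and `best_match` and the toy outputs are hypothetical and not read off Fig. 1.

```python
from typing import Dict, Tuple


def agreement(observed: Dict[str, str], reference: Dict[str, str]) -> int:
    """Count the inputs whose observed output equals the reference output."""
    return sum(1 for i, o in observed.items() if reference.get(i) == o)


def best_match(observed: Dict[str, str],
               reference_states: Dict[str, Dict[str, str]]) -> Tuple[str, int]:
    """Return the reference state with maximal agreement, and its score."""
    best = max(reference_states, key=lambda s: agreement(observed, reference_states[s]))
    return best, agreement(observed, reference_states[best])


# Hypothetical data: outputs observed from one SUL state, per input.
observed_q = {"a": "x", "b": "y", "c": "z"}
# Hypothetical reference states with their one-step outputs.
reference_states = {
    "s0": {"a": "x", "b": "x", "c": "z"},  # differs only on input b
    "s1": {"a": "y", "b": "y", "c": "x"},  # agrees only on input b
}
print(best_match(observed_q, reference_states))  # ('s0', 2)
```

Ties are broken here by dictionary order, which is an arbitrary choice of the sketch.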

Outline. After the preliminaries (Sec. 3), we recall the $L^{\#}$ algorithm and extend it with rebuilding (Sec. 4). We then introduce adaptive AAL with state matching and its approximate variant (Sec. 5). Together with rebuilding, this results in the $AL^{\#}$ algorithm (Sec. 6). We proceed to define a variant that allows the use of multiple reference models (Sec. 7). This is helpful already in the example discussed in this section: given both $\mathcal{R}_1$ and $\mathcal{R}_2$, $AL^{\#}$ with multiple reference models makes it possible to discover all states in $\mathcal{S}$ without any EQs, see App. 0.F.

3 Preliminaries

For a partial map $f\colon X \rightharpoonup Y$, we write $f(x)\mathord{\downarrow}$ if $f(x)$ is defined and $f(x)\mathord{\uparrow}$ otherwise.

Definition 3.1

A partial Mealy machine is a tuple $\mathcal{M}=(Q,I,O,q_0,\delta,\lambda)$, where $Q$, $I$ and $O$ are finite sets of states, inputs and outputs respectively; $q_0\in Q$ an initial state, $\delta\colon Q\times I\rightharpoonup Q$ a transition function, and $\lambda\colon Q\times I\rightharpoonup O$ an output function such that $\delta$ and $\lambda$ have the same domain. A (complete) Mealy machine is a partial Mealy machine where $\delta$ and $\lambda$ are total. If not specified otherwise, a Mealy machine is assumed to be complete.

We write $\mathcal{M}|_{I}$ to denote $\mathcal{M}$ restricted to alphabet $I$. We use the superscript $\mathcal{M}$ to indicate to which Mealy machine we refer, e.g. $Q^{\mathcal{M}}$ and $\delta^{\mathcal{M}}$. The transition and output functions are naturally extended to input sequences of length $n\in\mathbb{N}$ as functions $\delta\colon Q\times I^{n}\rightharpoonup Q$ and $\lambda\colon Q\times I^{n}\rightharpoonup O^{n}$. We abbreviate $\delta(q_0,w)$ by $\delta(w)$.
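The definitions above can be mirrored in a small data structure. The following Python sketch is one possible encoding, given as an assumption for illustration: a single dictionary stores $\delta$ and $\lambda$ together (so both automatically have the same domain), `None` plays the role of "undefined", and omitting the start state corresponds to the abbreviation $\delta(w)$. The class name `PartialMealy` and the toy machine are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Sequence, Tuple


@dataclass
class PartialMealy:
    initial: str
    # (q, i) -> (delta(q, i), lambda(q, i)); a missing key means "undefined".
    trans: Dict[Tuple[str, str], Tuple[str, str]]

    def delta(self, word: Sequence[str], q: Optional[str] = None) -> Optional[str]:
        """Extended transition function; starts in the initial state if q is None."""
        state = self.initial if q is None else q
        for i in word:
            if (state, i) not in self.trans:
                return None
            state = self.trans[(state, i)][0]
        return state

    def lam(self, word: Sequence[str], q: Optional[str] = None) -> Optional[Tuple[str, ...]]:
        """Extended output function; returns one output per input, or None."""
        state = self.initial if q is None else q
        outputs = []
        for i in word:
            if (state, i) not in self.trans:
                return None
            state, out = self.trans[(state, i)]
            outputs.append(out)
        return tuple(outputs)


# Hypothetical partial machine: q0 has no b-transition, q1 has no a-transition.
m = PartialMealy("q0", {("q0", "a"): ("q1", "x"), ("q1", "b"): ("q0", "y")})
print(m.delta("ab"))  # q0
print(m.lam("ab"))    # ('x', 'y')
print(m.delta("b"))   # None
```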

Definition 3.2

Let $\mathcal{M}_1$, $\mathcal{M}_2$ be partial Mealy machines. States $p\in Q^{\mathcal{M}_1}$ and $q\in Q^{\mathcal{M}_2}$ match, written $p \overset{\surd}{=} q$, if $\lambda(p,\sigma)=\lambda(q,\sigma)$ for all $\sigma\in(I^{\mathcal{M}_1}\cap I^{\mathcal{M}_2})^{*}$ with $\delta(p,\sigma)\mathord{\downarrow}$ and $\delta(q,\sigma)\mathord{\downarrow}$. If $p$ and $q$ do not match, they are apart, written $p\mathrel{\#}q$.

If $p\mathrel{\#}q$, then there is a separating sequence, i.e., a sequence $\sigma$ such that $\lambda(p,\sigma)\neq\lambda(q,\sigma)$; this situation is denoted by $\sigma\vdash p\mathrel{\#}q$. The definition of matching allows the input (and output) alphabets of the underlying Mealy machines to differ; it requires that they agree on all commonly defined input sequences. If $\mathcal{M}_1$ and $\mathcal{M}_2$ are complete and have the same alphabet, then the matching of states is referred to as language equivalence. Two complete Mealy machines are equivalent if their initial states are language equivalent.
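Matching and apartness can be decided by exploring pairs of states jointly. The sketch below is a minimal illustration under stated assumptions: machines are plain dictionaries, `inputs` is the shared alphabet $I^{\mathcal{M}_1}\cap I^{\mathcal{M}_2}$, and a breadth-first search only follows inputs defined in both machines. The function names and both toy machines are hypothetical.

```python
from collections import deque
from typing import Dict, Iterable, Optional, Tuple

Trans = Dict[Tuple[str, str], Tuple[str, str]]  # (state, input) -> (next state, output)


def separating_sequence(m1: Trans, p: str, m2: Trans, q: str,
                        inputs: Iterable[str]) -> Optional[Tuple[str, ...]]:
    """Return a shortest sequence on which p and q differ, or None if they match."""
    inputs = list(inputs)
    seen = {(p, q)}
    queue = deque([(p, q, ())])
    while queue:
        s1, s2, word = queue.popleft()
        for i in inputs:
            if (s1, i) not in m1 or (s2, i) not in m2:
                continue  # only sequences defined from both states are compared
            (n1, o1), (n2, o2) = m1[(s1, i)], m2[(s2, i)]
            if o1 != o2:
                return word + (i,)
            if (n1, n2) not in seen:
                seen.add((n1, n2))
                queue.append((n1, n2, word + (i,)))
    return None


def match(m1: Trans, p: str, m2: Trans, q: str, inputs: Iterable[str]) -> bool:
    return separating_sequence(m1, p, m2, q, inputs) is None


# Hypothetical partial tree and hypothetical complete reference model.
tree = {("t0", "a"): ("t1", "x"), ("t1", "b"): ("t2", "y")}
ref = {("r0", "a"): ("r1", "x"), ("r0", "b"): ("r0", "x"),
       ("r1", "a"): ("r0", "x"), ("r1", "b"): ("r1", "y")}
print(match(tree, "t0", ref, "r0", ["a", "b"]))                # True
print(separating_sequence(tree, "t0", ref, "r1", ["a", "b"]))  # ('a', 'b')
```

Because undefined transitions are skipped, a tree state with few observations matches many reference states; more observations can only turn a match into apartness, never the reverse.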

Let $\mathcal{M}$ be a partial Mealy machine. A state $q\in Q^{\mathcal{M}}$ is reachable if there exists $\sigma\in I^{*}$ such that $\delta^{\mathcal{M}}(q_0,\sigma)=q$. The reachable part of $\mathcal{M}$ contains all reachable states in $Q^{\mathcal{M}}$. A sequence $\sigma$ is an access sequence for $q\in Q^{\mathcal{M}}$ if $\delta^{\mathcal{M}}(\sigma)=q$. A set $P\subseteq I^{*}$ is a state cover for $\mathcal{M}$ if $P$ contains an access sequence for every reachable state in $\mathcal{M}$. In this paper, a tree $\mathcal{T}$ is a partial Mealy machine where every state $q$ has a unique access sequence, denoted by $\mathsf{access}(q)$.
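A state cover with exactly one access sequence per reachable state can be obtained by a breadth-first traversal. The following sketch is one way to do this, assuming the same dictionary encoding of transitions as an illustration; the function name `state_cover` and the three-state toy machine are hypothetical.

```python
from collections import deque
from typing import Dict, Iterable, Tuple

Trans = Dict[Tuple[str, str], Tuple[str, str]]  # (state, input) -> (next state, output)


def state_cover(initial: str, trans: Trans, inputs: Iterable[str]) -> Dict[str, Tuple[str, ...]]:
    """Map every reachable state to one shortest access sequence."""
    inputs = list(inputs)
    access = {initial: ()}  # the empty sequence accesses the initial state
    queue = deque([initial])
    while queue:
        q = queue.popleft()
        for i in inputs:
            if (q, i) not in trans:
                continue
            nxt = trans[(q, i)][0]
            if nxt not in access:  # first visit yields a shortest sequence
                access[nxt] = access[q] + (i,)
                queue.append(nxt)
    return access


# Hypothetical complete Mealy machine with three states.
toy = {("q0", "a"): ("q1", "x"), ("q0", "b"): ("q0", "y"),
       ("q1", "a"): ("q2", "x"), ("q1", "b"): ("q0", "y"),
       ("q2", "a"): ("q2", "y"), ("q2", "b"): ("q1", "x")}
print(state_cover("q0", toy, ["a", "b"]))
# {'q0': (), 'q1': ('a',), 'q2': ('a', 'a')}
```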

Definition 3.3

Let $\mathcal{M}$ be a complete Mealy machine. A set $W_q\subseteq(I^{\mathcal{M}})^{*}$ is a state identifier for $q\in Q^{\mathcal{M}}$ if for all $p\in Q^{\mathcal{M}}$ with $p\mathrel{\#}q$ there exists $\sigma\in W_q$ such that $\sigma\vdash p\mathrel{\#}q$. A separating family is a collection of state identifiers $\{W_p\}_{p\in Q^{\mathcal{M}}}$ such that for all $p,q\in Q^{\mathcal{M}}$ with $p\mathrel{\#}q$ there exists $\sigma\in W_p\cap W_q$ with $\sigma\vdash p\mathrel{\#}q$.

We use $P^{\mathcal{M}}$ and $\{W_q\}^{\mathcal{M}}$ to refer to a minimal state cover and a separating family for $\mathcal{M}$ respectively. State covers and separating families can be constructed for every Mealy machine, but are not necessarily unique.
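One simple (and not necessarily smallest) separating family is obtained by computing a separating sequence for every pair of apart states and adding it to the identifiers of both states, which guarantees that it lies in $W_p\cap W_q$. The sketch below follows this naive construction as an assumption for illustration; the names `sep_seq` and `separating_family` and the toy machine are hypothetical.

```python
from collections import deque
from itertools import combinations
from typing import Dict, List, Optional, Set, Tuple

Trans = Dict[Tuple[str, str], Tuple[str, str]]  # (state, input) -> (next state, output)
Word = Tuple[str, ...]


def sep_seq(trans: Trans, inputs: List[str], p: str, q: str) -> Optional[Word]:
    """Shortest separating sequence for p and q in a complete machine, or None."""
    seen = {(p, q)}
    queue = deque([(p, q, ())])
    while queue:
        s, t, word = queue.popleft()
        for i in inputs:
            (ns, out_s), (nt, out_t) = trans[(s, i)], trans[(t, i)]
            if out_s != out_t:
                return word + (i,)
            if (ns, nt) not in seen:
                seen.add((ns, nt))
                queue.append((ns, nt, word + (i,)))
    return None


def separating_family(states: List[str], trans: Trans, inputs: List[str]) -> Dict[str, Set[Word]]:
    family: Dict[str, Set[Word]] = {q: set() for q in states}
    for p, q in combinations(states, 2):
        sigma = sep_seq(trans, inputs, p, q)
        if sigma is not None:      # p and q are apart
            family[p].add(sigma)   # the same sequence goes into both identifiers
            family[q].add(sigma)
    return family


# Hypothetical complete Mealy machine with three pairwise apart states.
toy = {("q0", "a"): ("q1", "x"), ("q0", "b"): ("q0", "y"),
       ("q1", "a"): ("q2", "x"), ("q1", "b"): ("q0", "y"),
       ("q2", "a"): ("q2", "y"), ("q2", "b"): ("q1", "x")}
family = separating_family(["q0", "q1", "q2"], toy, ["a", "b"])
print(family["q2"])  # {('a',)}
```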

4 $L^{\#}$ with Rebuilding

We first recall the $L^{\#}$ algorithm for (standard) AAL [23]. Then, we consider adaptive learning by presenting an $L^{\#}$-compatible variant of rebuilding.

4.1 Observation Trees

$L^{\#}$ uses an observation tree as data structure to store the observed traces of $\mathcal{M}$.

Definition 4.1

A tree $\mathcal{T}$ is an observation tree if there exists a mapping $f\colon Q^{\mathcal{T}}\to Q^{\mathcal{M}}$ such that $f(q_0^{\mathcal{T}})=q_0^{\mathcal{M}}$ and $q\xrightarrow{i/o}q'$ implies $f(q)\xrightarrow{i/o}f(q')$.

In an observation tree, a basis is a subtree that describes unique behaviour present in the SUL. Initially, a basis $B\subseteq Q^{\mathcal{T}}$ contains the root state. All states in the basis are pairwise apart, i.e., for all $q\neq q'\in B$ it holds that $q\mathrel{\#}q'$. For a fixed basis, its frontier is the set of states $F\subseteq Q^{\mathcal{T}}$ which are immediate successors of basis states but which are not in the basis themselves.

Figure 2: Observation trees and hypotheses generated while learning $\mathcal{R}_1$ with $L^{\#}$. Basis states are displayed in pink and frontier states in yellow. Panels: (e) $\mathcal{T}$, (f) $\mathcal{H}$, (g) $\mathcal{T}'$, (h) $\mathcal{H}'$.

Example 4.1

Fig. 2 shows an observation tree $\mathcal{T}'$ for the Mealy machine $\mathcal{H}'$ from Fig. 2. The separating sequences $c$ and $ac$ show that the states in basis $B=\{t_0,t_2,t_3\}$ are all pairwise apart. The frontier $F$ is $\{t_1,t_4,t_5,t_6\}$.

We say that a frontier state is isolated if it is apart from all basis states. A frontier state is identified with a basis state $q$ if it is apart from all basis states except $q$. We say the observation tree is adequate if all frontier states are identified, no frontier states are isolated and each basis state has a transition with every input. If every frontier state is identified and each basis state has a transition for every input, the observation tree can be folded to create a complete Mealy machine (formalized in Def. 0.A.1). The Mealy machine has the same states as the basis. The transitions between basis states are the same as in the observation tree. Transitions from basis states to frontier states are folded back to the basis state the frontier state is identified with. We call the resulting complete Mealy machine a hypothesis whenever this canonical transformation is used.
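The folding step can be sketched as follows. As an assumption made for this illustration, the observation tree is encoded as a dictionary from non-empty access words to the output of their last input, so that every tree state is represented by its own access sequence. The functions `apart` and `fold` and the toy observations are hypothetical and do not correspond to Fig. 2.

```python
from typing import Dict, List, Tuple

Word = Tuple[str, ...]
Tree = Dict[Word, str]  # non-empty access word -> output of its last input


def apart(tree: Tree, inputs: List[str], u: Word, v: Word) -> bool:
    """True if some sequence defined from both u and v yields different outputs."""
    for i in inputs:
        cu, cv = u + (i,), v + (i,)
        if cu in tree and cv in tree:
            if tree[cu] != tree[cv] or apart(tree, inputs, cu, cv):
                return True
    return False


def fold(tree: Tree, inputs: List[str], basis: List[Word]):
    """Build the hypothesis; raises if the tree cannot be folded."""
    hypothesis = {}
    for q in basis:
        for i in inputs:
            child = q + (i,)
            if child not in tree:
                raise ValueError("basis state lacks a transition")
            target = child
            if child not in basis:  # frontier state: redirect to its basis state
                candidates = [b for b in basis if not apart(tree, inputs, child, b)]
                if len(candidates) != 1:
                    raise ValueError("frontier state is not identified")
                target = candidates[0]
            hypothesis[(q, i)] = (target, tree[child])
    return hypothesis


# Hypothetical observations; the basis states () and ('a',) are apart via input a.
tree = {("a",): "x", ("b",): "y", ("a", "a"): "y", ("a", "b"): "x",
        ("b", "a"): "x", ("a", "a", "a"): "x", ("a", "b", "a"): "y"}
hyp = fold(tree, ["a", "b"], [(), ("a",)])
print(hyp[((), "b")])      # ((), 'y')
print(hyp[(("a",), "b")])  # (('a',), 'x')
```

Raising an error when a frontier state has zero or several candidate basis states reflects the precondition stated above: folding is only defined once every frontier state is identified.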

Example 4.2

In $\mathcal{T}'$ (Fig. 2) the frontier states are identified as follows: $t_1\mapsto t_2$, $t_4\mapsto t_3$, $t_5\mapsto t_0$ and $t_6\mapsto t_2$. Hypothesis $\mathcal{H}'$ (Fig. 2) can be folded back from $\mathcal{T}'$. The dashed transitions in Fig. 2 represent the folded transitions.

4.2 The $L^{\#}$ Algorithm

The $L^{\#}$ algorithm maintains an observation tree $\mathcal{T}$ and a basis $B$. Initially, $\mathcal{T}$ consists of just a root node $q_0$ and $B = \{q_0\}$. We denote the frontier of $B$ by $F$. The $L^{\#}$ algorithm then repeatedly applies the following four rules.

  • The promotion rule (P) extends $B$ by $r \in F$ when $r$ is isolated.

  • The extension rule (Ex) poses OQ $\mathsf{access}(q)i$ for $q \in B$, $i \in I$ with $\delta(q,i)\mathord{\uparrow}$.

  • The separation rule (S) takes a state $r \in F$ that is not apart from $q, q' \in B$ and poses OQ $\mathsf{access}(r)\sigma$ with $\sigma \vdash q \mathrel{\#} q'$ that shows $r$ is apart from $q$ or $q'$.

  • The equivalence rule (Eq) folds $\mathcal{T}$ into hypothesis $\mathcal{H}$, checks whether $\mathcal{H}$ and $\mathcal{T}$ agree on all sequences in $\mathcal{T}$ and poses an EQ. If $\mathcal{H}$ and the SUL are not equivalent, counterexample processing isolates a frontier state.

The pre- and postconditions of the rules are summarized in (the top rows of) Table 1. A detailed account is given in the paper introducing $L^{\#}$ [23].
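To make the preconditions of the promotion and separation rules concrete, the following Python sketch checks apartness in an observation tree and classifies a frontier state accordingly. It assumes that two tree states are apart when some input word is defined from both and yields different outputs, and it reuses a hypothetical dictionary encoding `trans` of the tree; the function names are illustrative only.

```python
def apart(trans, s, t):
    """Return a witness input word if s and t are apart in the tree, else None.

    trans: dict (state, input) -> (output, successor), an assumed tree encoding.
    """
    stack = [(s, t, ())]
    while stack:
        a, b, word = stack.pop()
        for (state, i), (out_a, next_a) in trans.items():
            # Only follow inputs that are defined from both states.
            if state != a or (b, i) not in trans:
                continue
            out_b, next_b = trans[(b, i)]
            if out_a != out_b:
                return word + (i,)
            stack.append((next_a, next_b, word + (i,)))
    return None


def classify(trans, basis, r):
    """Classify frontier state r by the basis states it is not apart from."""
    candidates = [q for q in basis if apart(trans, q, r) is None]
    if not candidates:
        return "isolated"          # the promotion rule applies
    if len(candidates) == 1:
        return "identified"
    return "needs separation"      # the separation rule applies


# Hypothetical toy tree with root 0.
trans = {(0, "a"): ("x", 1), (0, "b"): ("y", 2), (1, "a"): ("y", 3)}
```

With basis `{0}`, state 1 is isolated because input `a` produces different outputs from states 0 and 1, while state 2 has no outgoing observations and is therefore identified with state 0.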

Table 1: Extended $L^{\#}$ rules with parameters, preconditions and postconditions.
\begin{tabular}{l l l l l}
Section & Rule & Parameters & Precondition & Postcondition \\
Sec. 4.2 & promotion & $r \in F$ & $\forall q \in B,\ q \mathrel{\#} r$ & $r \in B$ \\
 & extension & $q \in B$, $i \in I$ & $\delta^{\mathcal{T}}(q,i)\mathord{\uparrow}$ & $\delta^{\mathcal{T}}(q,i)\mathord{\downarrow}$ \\
 & separation & $r \in F$, $q, q' \in B$ & $\neg(r \mathrel{\#} q)$, $\neg(r \mathrel{\#} q')$, $q \neq q'$ & $r \mathrel{\#} q \lor r \mathrel{\#} q'$ \\
 & equivalence & - & $\forall q \in B.\ \forall i \in I.\ \delta^{\mathcal{T}}(q,i)\mathord{\downarrow}$, $\forall r \in F.\ \exists q \in B.\ (\neg(r \mathrel{\#} q) \land \forall q' \in B \setminus \{q\}.\ r \mathrel{\#} q')$ & $\exists r \in F$ s.t. $\forall q \in B.\ r \mathrel{\#} q$ \\
Sec. 4.3 & rebuilding & $q, q' \in B$, $i \in I$ & $\delta^{\mathcal{T}}(q,i) \notin B$, $\neg(q' \mathrel{\#} \delta^{\mathcal{T}}(q,i))$, $\mathsf{access}^{\mathcal{T}}(q)i, \mathsf{access}^{\mathcal{T}}(q') \in P^{\mathcal{R}}$, $\sigma = \mathsf{sep}(\delta^{\mathcal{R}}(\mathsf{access}^{\mathcal{T}}(q)i), \delta^{\mathcal{R}}(\mathsf{access}^{\mathcal{T}}(q')))$, $(\delta^{\mathcal{T}}(q,i\sigma)\mathord{\uparrow} \lor \delta^{\mathcal{T}}(q',\sigma)\mathord{\uparrow})$ & $\delta^{\mathcal{T}}(q,i\sigma)\mathord{\downarrow}$, $\delta^{\mathcal{T}}(q',\sigma)\mathord{\downarrow}$ \\
 & prioritized promotion & $r \in F$ & $\mathsf{access}^{\mathcal{T}}(r) \in P^{\mathcal{R}}$, $\forall q \in B.\ q \mathrel{\#} r$ & $r \in B$ \\
Sec. 5 & match separation & $q, q' \in B$, $p \in Q^{\mathcal{R}}$, $i \in I$ & $\delta^{\mathcal{T}}(q,i) = r \in F$, $\neg(r \mathrel{\#} q')$, $\delta^{\mathcal{R}}(p,i) = p'$, $\neg(\exists q'' \in B$ s.t. $p' \overset{\surd}{=} q'')$, $p \overset{\surd}{=} q$ & $r \mathrel{\#} q' \lor (p \overset{\surd}{\neq} q \land r \mathrel{\#} p')$ \\
 & match refinement & $q \in B$, $p, p' \in Q^{\mathcal{R}}$ & $p \overset{\surd}{=} q$, $p' \overset{\surd}{=} q$, $\sigma = \mathsf{sep}(p,p')$ & $p \overset{\surd}{\neq} q \lor p' \overset{\surd}{\neq} q$ \\
 & prioritized separation & $r \in F$, $q', q'' \in B$ & $\neg(r \mathrel{\#} q')$, $\neg(r \mathrel{\#} q'')$, $\exists i \in I$ s.t. $\delta^{\mathcal{T}}(q,i) = r$, $\sigma \vdash q' \mathrel{\#} q''$, $\sigma \in \cup_{p \overset{\surd}{=} q} W_{\delta^{\mathcal{R}}(p,i)}$ & $r \mathrel{\#} q'' \lor r \mathrel{\#} q'$ \\
\end{tabular}
Example 4.3

Suppose we learn $\mathcal{R}_1$ from Fig. 1. $L^{\#}$ applies the extension rule twice, resulting in $\mathcal{T}$ as in Fig. 2. States $t_1$ and $t_2$ are identified with $t_0$ because there is only one basis state. Next, $L^{\#}$ applies the equivalence rule using hypothesis $\mathcal{H}$ (Fig. 2). Counterexample $aac$ distinguishes $\mathcal{H}$ from $\mathcal{R}_1$. This sequence is added to $\mathcal{T}$ and processed further by posing OQ $ac$ in the equivalence rule. Observations $ac$ and $aac$ show that the states accessed with $\varepsilon$, $a$ and $aa$ are pairwise apart. States $t_2$ and $t_3$ are added to the basis using the promotion rule. Next, $L^{\#}$ poses OQ $aaa$ during the extension rule. To identify all frontier states, $L^{\#}$ may use $ac \vdash t_2 \mathrel{\#} t_3$, $ac \vdash t_0 \mathrel{\#} t_2$ and $c \vdash t_0 \mathrel{\#} t_3$. Fig. 2 shows one possible observation tree $\mathcal{T}'$ after applying the separation rule multiple times. Next, the equivalence rule constructs hypothesis $\mathcal{H}'$ (Fig. 2) from $\mathcal{T}'$ and $L^{\#}$ terminates because $\mathcal{H}'$ and $\mathcal{R}_1$ are equivalent.

4.3 Rebuilding in $L^{\#}$

In this subsection, we combine rebuilding from [6, 9] with $L^{\#}$ and implement this using two rules: rebuilding and prioritized promotion, see also Table 1. Both rules depend on a reference model $\mathcal{R}$, which is a complete Mealy machine, with a possibly different alphabet than the SUL $\mathcal{S}$. More precisely, these rules depend on a prefix-closed and minimal state cover $P^{\mathcal{R}}$ and a separating family $\{W_q\}^{\mathcal{R}}$ computed on $\mathcal{R}|_{I^{\mathcal{S}}}$ for maximal overlap with $\mathcal{S}$. The separating family can be computed with partition refinement [20]. We fix $\mathsf{sep}(p,p')$ with $p,p' \in Q^{\mathcal{R}}$ to be a unique sequence from $W_p \cap W_{p'}$ such that $\mathsf{sep}(p,p') \vdash p \mathrel{\#} p'$.
Below, we use $q$ for states in $B$, $r$ for states in $F$ and $p$ for states in $Q^{\mathcal{R}}$. In App. 0.A, we depict the scenarios in the observation tree and reference model required for the new rules to be applicable.
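One simple way to obtain a prefix-closed state cover with exactly one access sequence per reachable state is a breadth-first search over the reference model, restricted to the inputs shared with the SUL. The Python sketch below illustrates this; the dictionary encoding `delta` and the function name `state_cover` are assumptions made for the example, and the paper does not prescribe this particular procedure.

```python
from collections import deque


def state_cover(delta, initial, inputs):
    """Compute one shortest access sequence per reachable state via BFS.

    delta:   dict (state, input) -> (output, successor), the reference (assumed encoding)
    inputs:  the inputs to explore, e.g. only those shared with the SUL
    Returns a dict state -> access sequence (tuple of inputs).
    """
    cover = {initial: ()}
    queue = deque([initial])
    while queue:
        p = queue.popleft()
        for i in inputs:
            if (p, i) not in delta:
                continue
            _, successor = delta[(p, i)]
            if successor not in cover:
                # Extending an existing access sequence keeps the cover prefix-closed.
                cover[successor] = cover[p] + (i,)
                queue.append(successor)
    return cover


# Hypothetical three-state reference over inputs a and b.
delta = {
    ("p0", "a"): ("x", "p1"), ("p0", "b"): ("y", "p0"),
    ("p1", "a"): ("x", "p2"), ("p1", "b"): ("y", "p0"),
    ("p2", "a"): ("y", "p2"), ("p2", "b"): ("x", "p1"),
}
cover = state_cover(delta, "p0", ("a", "b"))
```

Passing only a subset of the inputs mimics the restriction of the reference to the SUL alphabet: states that are reachable only through other inputs receive no access sequence.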

Rule (R): Rebuilding. Let $q \in B$, $i \in I$ and suppose $\delta^{\mathcal{T}}(q,i) \notin B$. The aim of the rebuilding rule is to show apartness between $\delta^{\mathcal{T}}(q,i)$ and a basis state $q'$, using the state cover and separating family from $\mathcal{R}$. The rebuilding rule is applicable when $\mathsf{access}^{\mathcal{T}}(q)$ and $\mathsf{access}^{\mathcal{T}}(q)i$ are in $P^{\mathcal{R}}$. If $\mathsf{access}^{\mathcal{T}}(q') \in P^{\mathcal{R}}$ then there exists a sequence $\sigma$ such that $\sigma = \mathsf{sep}(\delta^{\mathcal{R}}(\mathsf{access}^{\mathcal{T}}(q)i), \delta^{\mathcal{R}}(\mathsf{access}^{\mathcal{T}}(q')))$. We pose OQs $\mathsf{access}^{\mathcal{T}}(q)i\sigma$ and $\mathsf{access}^{\mathcal{T}}(q')\sigma$.
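The two output queries posed by the rebuilding rule can be computed as in the following Python sketch. It assumes a hypothetical reference given as a transition dictionary `delta` and a precomputed table `sep` of separating sequences keyed by unordered state pairs; these encodings, and the choice to return `None` when both access sequences lead to the same reference state, are assumptions of the example rather than details from the paper.

```python
def run(delta, initial, word):
    """Return the reference state reached from `initial` by input word `word`."""
    state = initial
    for i in word:
        _, state = delta[(state, i)]
    return state


def rebuilding_queries(delta, initial, sep, access_q, i, access_q_prime):
    """Compute the OQs access(q) i sigma and access(q') sigma of the rebuilding rule.

    sep: dict frozenset({p, p'}) -> separating sequence (assumed encoding of sep(p, p'))
    """
    p = run(delta, initial, access_q + (i,))
    p_prime = run(delta, initial, access_q_prime)
    sigma = sep.get(frozenset((p, p_prime)))
    if sigma is None:
        # No separating sequence, e.g. both words reach the same reference state.
        return None
    return access_q + (i,) + sigma, access_q_prime + sigma


# Hypothetical reference fragment: input c leads from p0 to p1, separated by ac.
delta = {("p0", "c"): ("x", "p1")}
sep = {frozenset(("p0", "p1")): ("a", "c")}
queries = rebuilding_queries(delta, "p0", sep, (), "c", ())
```

With empty access sequences for both basis states, input `c` and separating sequence `ac`, the sketch yields the queries `cac` and `ac`.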

Lemma 1

Suppose $\mathsf{access}^{\mathcal{T}}(q') \in P^{\mathcal{R}}$ for all $q' \in B$. Consider $q \in B$, $i \in I$ such that $\delta^{\mathcal{T}}(q,i) \notin B$ and $\mathsf{access}^{\mathcal{T}}(q)i \in P^{\mathcal{R}}$. If for all $q' \in B$ it holds that $\mathsf{sep}(\delta^{\mathcal{R}}(\mathsf{access}^{\mathcal{T}}(q)i), \delta^{\mathcal{R}}(\mathsf{access}^{\mathcal{T}}(q'))) \vdash \delta^{\mathcal{S}}(\mathsf{access}^{\mathcal{T}}(q)i) \mathrel{\#} \delta^{\mathcal{S}}(\mathsf{access}^{\mathcal{T}}(q'))$, then after applying the rebuilding rule for $q$, $i$ and all $q' \in B$ with $\neg(q' \mathrel{\#} \delta^{\mathcal{T}}(q,i))$, state $\delta^{\mathcal{T}}(q,i)$ is isolated.

If a state is isolated, it can be added to the basis using the promotion rule.

Rule (PP): Prioritized promotion. Like (regular) promotion, prioritized promotion extends the basis. However, prioritized promotion only applies to states $r$ with $\mathsf{access}^{\mathcal{T}}(r) \in P^{\mathcal{R}}$. This enforces that the access sequences for basis states are in $P^{\mathcal{R}}$ as often as possible, enabling the use of the rebuilding rule.

Example 4.4

Consider reference $\mathcal{R}_1$ and SUL $\mathcal{S}$ from Fig. 1. We learn the orange states similarly as described in Sec. 2: We apply the rebuilding rule with $\mathsf{access}^{\mathcal{T}}(q) = \varepsilon$, $\mathsf{access}^{\mathcal{T}}(q') = \varepsilon$, $i = c$ which results in OQs $cac$ and $ac$. Next, we promote $\delta^{\mathcal{T}}(c)$ with the prioritized promotion rule. We apply the rebuilding rule with $\mathsf{access}^{\mathcal{T}}(q) = c$, $\mathsf{access}^{\mathcal{T}}(q') = c$ and $i = a$ which results in OQs $cac$ (already present in $\mathcal{T}$) and $cc$. Lastly, we promote $\delta^{\mathcal{T}}(ca)$ with prioritized promotion.

The overlap between $\mathcal{S}$ and $P^{\mathcal{R}}$ and $\{W_q\}^{\mathcal{R}}$ determines how many states of $\mathcal{S}$ can be discovered via rebuilding. The following statement follows from Lemma 1 above.

Theorem 4.1

If $q_0^{\mathcal{R}}$ matches $q_0^{\mathcal{S}}$ and $\mathcal{T}$ only contains a root $q_0^{\mathcal{T}}$, then after applying only the rebuilding and prioritized promotion rules until they are no longer applicable, the basis consists of $n$ states where $n$ is the number of equivalence classes (w.r.t. language equivalence) in the reachable part of $\mathcal{S}|_{I^{\mathcal{R}}}$.

Corollary 1

Suppose we learn SUL $\mathcal{S}$ with reference $\mathcal{S}$. Using the rebuilding and prioritized promotion rules, we can add all reachable states in $\mathcal{S}$ to the basis.

5 $L^{\#}$ using State Matching

In this section, we describe another way to reuse information from references, called state matching, which is independent of the state cover. First, we present a version of state matching using the matching relation ($\overset{\surd}{=}$) from Def. 3.2 and then we weaken this notion to approximate state matching.

5.1 State Matching

Figure 3: Observation trees generated while learning $\mathcal{S}$ with $\mathcal{R}_2$. Panels: (a) observation tree $\mathcal{T}_0$, (b) observation tree $\mathcal{T}_1$, (c) observation tree $\mathcal{T}_2$.

With state matching, the learner maintains the matching relation $\overset{\surd}{=}$ between basis states and reference model states during learning. In the implementation, before applying a matching rule, the matching is updated based on the OQs asked since the previous match computation. We present two key rules here and an optimisation in the next subsection.
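As an illustration of how a match between a tree state and a reference state might be checked, the Python sketch below assumes that a tree state matches a reference state when their outputs coincide on every input word over the shared alphabet that has been observed from the tree state. This reading of Def. 3.2 is an assumption based on the examples in this section, and the encodings `trans` and `delta` and the name `matches` are hypothetical.

```python
def matches(trans, delta, q, p, shared_inputs):
    """Check whether tree state q matches reference state p on observed words.

    trans: dict (state, input) -> (output, successor), the partial observation tree
    delta: dict (state, input) -> (output, successor), the complete reference model
    shared_inputs: inputs occurring in both alphabets
    """
    stack = [(q, p)]
    while stack:
        t, ref = stack.pop()
        for i in shared_inputs:
            if (t, i) not in trans:
                continue  # nothing observed here, so no evidence against the match
            out_t, next_t = trans[(t, i)]
            out_r, next_r = delta[(ref, i)]
            if out_t != out_r:
                return False
            stack.append((next_t, next_r))
    return True


# Hypothetical tree (input c is not in the reference alphabet) and two-state reference.
trans = {(0, "a"): ("x", 1), (0, "c"): ("z", 2)}
delta = {
    ("p0", "a"): ("x", "p0"), ("p0", "b"): ("y", "p0"),
    ("p1", "a"): ("y", "p1"), ("p1", "b"): ("y", "p1"),
}
```

A state without observations, such as state 1 here, matches every reference state under this reading, which is why further output queries are needed to refine the matching.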

Rule (MS): Match separation. This rule aims to show apartness between a frontier state and a basis state using separating sequences from the reference separating family. Let $q, q' \in B$, $r \in F$ with $\delta^{\mathcal{T}}(q,i) = r$ for some $i \in I$, and $p, p' \in Q^{\mathcal{R}}$. Suppose that $\delta^{\mathcal{R}}(p,i) = p'$, $\neg(r \mathrel{\#} q')$, $p \overset{\surd}{=} q$ and $p'$ does not match any basis state. In particular, there exists some separating sequence $\sigma$ for $p' \mathrel{\#} q'$. The match separation rule poses OQ $\mathsf{access}(q)i\sigma$ to either show $r \mathrel{\#} q'$ or $q \overset{\surd}{\neq} p$ and $r \mathrel{\#} p'$.

Example 5.1

Suppose we learn $\mathcal{S}$ using $\mathcal{R}_2$ from Fig. 1. After applying the extension rule three times, we get $\mathcal{T}_0$ (Fig. 3). State $t_0$ matches $p_3$ as their outputs coincide on sequences from alphabet $I^{\mathcal{S}} \cap I^{\mathcal{R}_2} = \{a, b\}$. State $p_3$ transitions to the unmatched state $p_0$ with input $a$. The match separation rule conjectures $t_1$ may match $p_0$ which implies $t_1 \mathrel{\#} t_0$. We use OQ $\mathsf{access}(t_1)a$ to test this conjecture and indeed find that $t_1$ can be added to the basis using promotion.

Lemma 2

We fix $p \in Q^{\mathcal{R}}$, $q \in B$, $i \in I$ and $\delta^{\mathcal{T}}(q,i) = r \in F$. Suppose $\delta^{\mathcal{S}}(\mathsf{access}^{\mathcal{T}}(q)) \overset{\surd}{=} p$. If $\delta^{\mathcal{R}}(p,i) \overset{\surd}{\neq} q'$ for all $q' \in B$, then after applying the match separation rule with $q, p, i$ for all $q' \in B$ with $\neg(q' \mathrel{\#} r)$, state $r$ is isolated.

Rule (MR): Match refinement. Let $q \in B$ and $p, p' \in Q^{\mathcal{R}}$. Suppose $q$ matches both $p$ and $p'$ and let $\sigma = \mathsf{sep}(p,p')$. The match refinement rule poses OQ $\mathsf{access}(q)\sigma$ resulting in $q$ no longer being matched to $p$ or $p'$.

Example 5.2

Suppose we continue learning $\mathcal{S}$ using $\mathcal{R}_2$ from observation tree $\mathcal{T}_1$ (Fig. 3). State $t_1$ matches both $p_0$ and $p_1$. After posing OQ $\mathsf{access}(t_1)bb$ where $bb \vdash p_0 \mathrel{\#} p_1$, $t_1$ no longer matches $p_1$.

If the initial state of SUL $\mathcal{S}$ is language equivalent to some state in the reference model, then we can discover all reachable states in $\mathcal{S}$ via state matching and $L^{\#}$ rules. The statement uses Lemma 2 above.

Theorem 5.1

Suppose we have reference $\mathcal{R}$ and SUL $\mathcal{S}$ equivalent to $\mathcal{R}$ but with a possibly different initial state. Using only the match refinement, match separation, promotion and extension rules, we can add $n$ states to the basis where $n$ is the number of equivalence classes (w.r.t. language equivalence) in the reachable part of $\mathcal{S}$.

5.2 Optimised Separation using State Matching

In this subsection, we add an optimisation rule, prioritized separation, that uses the matching to guide the identification of frontier states. First, we highlight the differences between prioritized separation and the previous separation rules. Both match separation and prioritized separation require that $r \overset{\surd}{=} p$ for $r \in F$ and $p \in Q^{\mathcal{R}}$. The aim of match separation is to isolate $r$ and requires that $p$ does not match any basis state. Instead, the aim of prioritized separation is to guide the identification of $r$ using the state identifier for a $p$ matched with a basis state. The prioritized separation rule is also different from the separation rule (Sec. 4.2) which randomly selects $q, q' \in B$ to separate $r$ from $q$ or $q'$.

Rule (PS): Prioritized separation. The prioritized separation rule uses the matching to find a separating sequence from the reference model that is expected to separate a frontier state from a basis state. Let $q', q'' \in B$ and $r \in F$. Suppose $r$ is not apart from $q'$ and $q''$ and $\sigma \vdash q' \mathrel{\#} q''$. If $\sigma$ is in $\{W_p\}^{\mathcal{R}}$ of a reference model state $p$ that matches $r$, the prioritized separation rule poses OQ $\mathsf{access}(r)\sigma$ resulting in $r$ being apart from $q'$ or $q''$. (Footnote 1: The precise specification is more involved, as the learner only keeps track of the match relation on $B \times Q^{\mathcal{R}}$.)

Example 5.3

Suppose we learn $\mathcal{S}$ using $\mathcal{R}_1$ from Fig. 1. Assume we have discovered all states in $\mathcal{S}$ and want to identify $\delta^{\mathcal{T}}(ca,c) \in F$, which is currently not apart from any basis state. The prioritized separation rule can only be applied with basis states $q', q'' \in B$ such that $c \vdash q' \mathrel{\#} q''$, as $c$ is the only sequence in the state identifier of $r_2$, which is the state that matches $\delta^{\mathcal{T}}(ca,c)$. From the sequences $\{bb, ac, c\}$ possibly used by $L^{\#}$, only $c$ immediately identifies $\delta^{\mathcal{T}}(ca,c)$.

5.3 Approximate State Matching

In this subsection, we introduce an approximate version of matching, by quantifying matching via a matching degree. Let $\mathcal{T}$ be a tree and $\mathcal{R}$ be a (partial) Mealy machine. Let $I = I^{\mathcal{T}} \cap I^{\mathcal{R}}$. We define $\mathsf{WI}(q) = \{(w,i) \in I^{*} \times I \mid \delta^{\mathcal{T}}(q,wi){\downarrow}\}$ as prefix-suffix pairs that are defined from $q \in Q^{\mathcal{T}}$ onwards. Then, we define the matching degree $\mathsf{mdeg} : Q^{\mathcal{T}} \times Q^{\mathcal{R}} \to \mathbb{R}$ as

$$\mathsf{mdeg}(q,p) = \frac{\left|\{(w,i) \in \mathsf{WI}(q) \mid \lambda^{\mathcal{T}}\big(\delta^{\mathcal{T}}(q,w),i\big) = \lambda^{\mathcal{R}}\big(\delta^{\mathcal{R}}(p,w),i\big)\}\right|}{\left|\mathsf{WI}(q)\right|}.$$
Example 5.4

Consider $t_1$ from $\mathcal{T}_2$ (Fig. 3) and $p_0$, $p_1$ from $\mathcal{R}_2$ (Fig. 1). We derive $\mathsf{WI}(t_1) = \{(\varepsilon,a), (\varepsilon,b), (b,a), (b,b), (bb,b)\}$ from $\mathcal{T}_2$ where $I = I^{\mathcal{T}_2} \cap I^{\mathcal{R}_2} = \{a,b\}$. On these pairs, all the suffix outputs for $p_0$ and $t_1$ are equal, so $\mathsf{mdeg}(t_1,p_0) = \nicefrac{5}{5} = 1$.
The matching degree between $t_1$ and $p_1$ is only $\nicefrac{3}{5}$ because $\lambda^{\mathcal{R}_2}(p_1,bbb) = 120 \neq 112 = \lambda^{\mathcal{T}}(t_1,bbb)$, which impacts pairs $(b,b)$ and $(bb,b)$.
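The matching degree can be sketched in a few lines of Python. The encoding below (nested dictionaries for the observation tree, a transition dictionary for the reference) and the toy machine `toy_ref` are assumptions made for illustration; `toy_ref` is not $\mathcal{R}_2$ from Fig. 1 and only mirrors the outputs on $bbb$ quoted in Example 5.4, with all other outputs invented. Returning 1 for an empty $\mathsf{WI}(q)$ is likewise a convention chosen here.

```python
from typing import Dict, Set, Tuple

# Hypothetical encodings (not from the paper):
# - a tree node maps each observed input to (output, child node);
# - a Mealy machine maps each state to {input: (output, next state)}.
Tree = Dict[str, Tuple[str, "Tree"]]
Mealy = Dict[str, Dict[str, Tuple[str, str]]]


def mdeg(node: Tree, machine: Mealy, p: str, alphabet: Set[str]) -> float:
    """Fraction of pairs (w, i) in WI(node) on which tree and reference agree."""
    total = 0
    agree = 0
    # Each entry pairs a tree node with the reference state reached by the
    # same word w (None once the reference is undefined on w).
    stack = [(node, p)]
    while stack:
        t, s = stack.pop()
        for i, (out, child) in t.items():
            if i not in alphabet:
                continue  # only words over the common alphabet count
            total += 1
            ref = machine.get(s, {}).get(i) if s is not None else None
            if ref is not None and ref[0] == out:
                agree += 1
            stack.append((child, ref[1] if ref is not None else None))
    # Assumed convention: an empty WI(node) yields degree 1.
    return agree / total if total else 1.0


# Subtree below a state with WI = {(eps,a), (eps,b), (b,a), (b,b), (bb,b)}.
tree_t1: Tree = {
    "a": ("0", {}),
    "b": ("1", {"a": ("0", {}), "b": ("1", {"b": ("2", {})})}),
}

# Invented reference: from p0 the word bbb yields 112, from p1 it yields 120.
toy_ref: Mealy = {
    "p0": {"a": ("0", "p0"), "b": ("1", "s1")},
    "p1": {"a": ("0", "p1"), "b": ("1", "s2")},
    "s1": {"a": ("0", "s1"), "b": ("1", "s2")},
    "s2": {"a": ("0", "s2"), "b": ("2", "s3")},
    "s3": {"a": ("0", "s3"), "b": ("0", "s3")},
}

print(mdeg(tree_t1, toy_ref, "p0", {"a", "b"}))  # 5 of 5 pairs agree
print(mdeg(tree_t1, toy_ref, "p1", {"a", "b"}))  # 3 of 5 pairs agree
```

The traversal visits every defined pair $(w,i)$ exactly once, so the cost is linear in the size of the subtree below the tree state.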

A state $q$ in an observation tree $\mathcal{T}$ approximately matches a state $p \in Q^{\mathcal{R}}$, written $q \overset{\surd}{\simeq} p$, if there does not exist a $p' \in Q^{\mathcal{R}}$ such that $\mathsf{mdeg}(q,p') > \mathsf{mdeg}(q,p)$.
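Read operationally, the approximate matches of a tree state are the reference states that maximise the matching degree, so every tree state has at least one approximate match and ties are possible. A minimal sketch, assuming the degrees have already been computed and are passed in as a dictionary (the function name `approx_matches` and the example values other than those of Example 5.4 are hypothetical):

```python
from typing import Dict, List


def approx_matches(degrees: Dict[str, float]) -> List[str]:
    """Reference states p for which no p' has a strictly larger degree."""
    if not degrees:
        return []
    best = max(degrees.values())
    return sorted(p for p, d in degrees.items() if d == best)


# Degrees of t1 against p0 and p1 as reported in Example 5.4; the degrees
# for the remaining reference states are not given there and are omitted.
print(approx_matches({"p0": 1.0, "p1": 0.6}))

# Invented values showing a tie: both maximal states are approximate matches.
print(approx_matches({"p0": 0.5, "p1": 0.5, "p2": 0.25}))
```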

Lemma 3

For any $q \in Q^{\mathcal{T}}$, $p \in Q^{\mathcal{R}}$: $\mathsf{mdeg}(q,p) = 1$ implies $q \overset{\surd}{=} p$.

We define the rules approximate match separation (AMS), approximate match refinement (AMR) and approximate prioritized separation (APS), which are the approximate matching variations of match separation, match refinement and prioritized separation respectively. These rules have weaker preconditions and postconditions, see Table 3 in App. 0.A.

6 Adaptive $L^{\#}$

The rebuilding, state matching and $L^{\#}$ rules described in Table 1 are ordered and combined into one adaptive learning algorithm called adaptive $L^{\#}$ (written $AL^{\#}$). A non-ordered listing of the rules can be found in Algorithm 1 in App. 0.A. We use the abbreviations for the rules defined in previous sections.

Definition 6.1

The $AL^{\#}$ algorithm repeatedly applies the rules from Table 1 (see Algorithm 1), with the following ordering: Ex, APS (or S if APS was not applicable), P and, if the observation tree is adequate, AMR, AMS, Eq. The algorithm starts by applying R and PP until they are no longer applicable; these rules are not applied anymore afterwards.

Similar to $L^{\#}$, the correctness of $AL^{\#}$ amounts to showing termination because the algorithm can only terminate when the teacher indicates that the SUL and hypothesis are equivalent. We prove termination of $AL^{\#}$ by proving that each rule application lowers a ranking function. The necessary ingredients for the ranking function are derived from the post-conditions of Table 1.

Theorem 6.1

$AL^{\#}$ learns the correct Mealy machine within $\mathcal{O}(kn^{2} + kno + no^{2} + n \log m)$ output queries and at most $n-1$ equivalence queries where $n$ is the number of equivalence classes for $\mathcal{S}$, $o$ is the number of equivalence classes for $\mathcal{R}$, $k$ is the number of input symbols and $m$ the length of the longest counterexample.
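To get a feeling for the bound, it can be instantiated for the special case in which the reference has as many equivalence classes as the SUL, i.e. $o = n$. This case is an assumption made purely for illustration and is not singled out in the theorem.

```latex
\[
  kn^{2} + kno + no^{2} + n \log m
  \;\overset{o = n}{=}\;
  kn^{2} + kn^{2} + n^{3} + n \log m
  \;=\; 2kn^{2} + n^{3} + n \log m ,
\]
```

so in this case the number of output queries is in $\mathcal{O}(kn^{2} + n^{3} + n \log m)$, where the cubic term stems from $no^{2}$.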

7 Adaptive Learning with Multiple References

Let $\mathcal{X}$ be a finite set of complete reference models with possibly different alphabets. Assume each reference model $\mathcal{R} \in \mathcal{X}$ has a state cover $P^{\mathcal{R}}$ and separating family $\{W_q\}^{\mathcal{R}}$. We adapt the arguments for the $AL^{\#}$ algorithm to represent the state cover and separating family for the set of reference models.

State cover. We initialize the $AL^{\#}$ algorithm with the union of the state covers of the reference models, i.e., $P^{\mathcal{X}} = \bigcup_{\mathcal{R} \in \mathcal{X}} P^{\mathcal{R}}$. To reduce the size of $P^{\mathcal{X}}$, the state cover for each reference model is computed using a fixed ordering on inputs.

Separating family. We combine the separating families for multiple reference models using a stronger notion of apartness, called total apartness, which also separates states based on whether inputs are defined. When changing the alphabet of a reference model to the alphabet of the SUL, as is done when computing the separating family, the reference model may become partial. If states from different reference models behave the same on their common alphabet but their alphabets contain different inputs from the SUL, we still want to distinguish the reference models based on which inputs they enable.

Definition 7.1

Let $\mathcal{M}_1, \mathcal{M}_2$ be partial Mealy machines and $p \in Q^{\mathcal{M}_1}$, $q \in Q^{\mathcal{M}_2}$. We say $p$ and $q$ are total apart, written $p \mathrel{\#}_{\uparrow} q$, if $p \mathrel{\#} q$ or there exists $\sigma \in (I^{\mathcal{M}_1} \cap I^{\mathcal{M}_2})^{*}$ such that either $\delta^{\mathcal{M}_1}(p,\sigma){\uparrow}$ or $\delta^{\mathcal{M}_2}(q,\sigma){\uparrow}$ but not both.
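One way to check total apartness is a breadth-first search over pairs of states that stops at the first input on which the outputs differ or on which exactly one machine is defined. The sketch below is an illustration under assumptions: the dictionary encoding, the function name `total_apart_witness` and the two toy machines are hypothetical, and the search alphabet is passed in explicitly.

```python
from collections import deque
from typing import Dict, Optional, Set, Tuple

# Hypothetical encoding: state -> {input: (output, next state)}; a missing
# input means the transition is undefined (partial Mealy machine).
Mealy = Dict[str, Dict[str, Tuple[str, str]]]


def total_apart_witness(m1: Mealy, p: str, m2: Mealy, q: str,
                        alphabet: Set[str]) -> Optional[Tuple[str, ...]]:
    """Return a shortest word witnessing total apartness of p and q, else None."""
    seen = {(p, q)}
    queue = deque([(p, q, ())])
    while queue:
        s1, s2, word = queue.popleft()
        for i in sorted(alphabet):
            t1 = m1[s1].get(i)
            t2 = m2[s2].get(i)
            if t1 is None and t2 is None:
                continue  # undefined in both machines: no evidence
            if t1 is None or t2 is None:
                return word + (i,)  # defined in exactly one machine
            if t1[0] != t2[0]:
                return word + (i,)  # ordinary apartness: outputs differ
            nxt = (t1[1], t2[1])
            if nxt not in seen:
                seen.add(nxt)
                queue.append((t1[1], t2[1], word + (i,)))
    return None


# Toy machines: they agree on input a, but only m1 enables input c.
m1: Mealy = {"r": {"a": ("0", "r"), "c": ("1", "r")}}
m2: Mealy = {"p": {"a": ("0", "p")}}

print(total_apart_witness(m1, "r", m2, "p", {"a", "c"}))
print(total_apart_witness(m1, "r", m2, "p", {"a"}))
```

In the toy example the two states cannot be told apart by outputs on $a$ alone, yet the word $c$ separates them because only one machine enables it.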

We use total apartness to define a total state identifier and a total separating family. This definition is similar to Def. 3.3 but $\mathrel{\#}$ is replaced by $\mathrel{\#}_{\uparrow}$. We combine the multiple reference models into a single one with an arbitrary initial state, compute the total separating family and use this to initialize $AL^{\#}$.

Example 7.1

A total separating family for $\mathcal{X} = \{\mathcal{R}_1, \mathcal{R}_2\}$ and alphabet $I^{\mathcal{S}}$ is $W_{p_0} = W_{p_1} = \{c, b, bb\}$, $W_{p_2} = W_{p_3} = \{c, b\}$, $W_{r_0} = W_{r_1} = \{c, ac\}$, $W_{r_2} = \{c\}$.

We add an optimisation to $AL^{\#}$ that only chooses $p$ and $p'$ from the same reference model during rebuilding. Theorem 6.1 can be generalized to this setting where $o$ represents the number of equivalence classes across the reference models.

8 Experimental Evaluation

In this section, we empirically investigate the performance of our implementation of $AL^{\#}$. The source code and all benchmarks are available online [16] (footnote 2: https://gitlab.science.ru.nl/lkruger/adaptive-lsharp-learnlib/). We present four experiments to answer the following research questions:

R1

What is the performance of adaptive AAL algorithms, when …

Exp 1

…learning models from a similar reference model?

Exp 2

…applied to benchmarks from the literature?

R2

Can multiple references help $AL^{\#}$, when learning …

Exp 3

…a model from similar reference models?

Exp 4

…a protocol implementation from reference implementations?

Setup. We implement $AL^{\#}$ on top of the $L^{\#}$ LearnLib implementation [9] (footnote 3: obtained from https://github.com/UCL-PPLV/learnlib.git). We invoke conformance testing for the EQs, using the random Wp method from LearnLib with minimal size${}=3$ and random length${}=3$ (footnote 4: these hyperparameters are discussed in the LearnLib documentation, learnlib.de). We run all experiments with 30 seeds. We measure the performance of the algorithms based on the number of inputs sent to the SUL during both OQs and EQs: fewer is better.

Table 2: Summed inputs in millions for learning the mutated models with the original models.
Algorithm $\textit{mut}_{1}$ $\textit{mut}_{2}$ $\textit{mut}_{3}$ $\textit{mut}_{4}$ $\textit{mut}_{5}$ $\textit{mut}_{6}$ $\textit{mut}_{7}$ $\textit{mut}_{8}$ $\textit{mut}_{9}$ $\textit{mut}_{10}$ $\textit{mut}_{11}$ $\textit{mut}_{12}$ $\textit{mut}_{13}$ $\textit{mut}_{14}$
$L^{*}$ 115.2 24.2 49.4 69.7 78.7 60.5 50.7 132.9 294.2 36.8 52.5 38.0 18.3 301.9
KV 123.5 17.8 49.6 60.1 68.9 58.7 44.9 103.7 244.3 25.5 28.7 28.0 7.5 253.6
$L^{\#}$ 101.7 14.3 50.0 49.2 73.0 58.7 39.9 100.1 313.9 25.4 38.9 28.0 8.0 234.9
$\partial L^{*}_{M}$ [6] 132.7 19.8 22.5 25.0 32.7 26.0 - 178.0 375.0 24.7 25.4 44.1 8.9 256.3
IKV [9] 114.8 18.6 1.6 2.4 0.9 0.8 - 56.6 373.9 11.0 2.1 1.1 5.8 7.0
$AL^{\#}$ (new!) 1.2 0.5 1.5 0.8 0.8 0.8 0.6 68.1 141.1 1.4 1.3 0.8 1.9 7.2
$L^{\#}_{R}$ (new!) 101.7 12.3 1.7 9.4 1.1 7.9 0.7 68.2 306.1 12.6 2.8 1.7 6.4 7.9
$L^{\#}_{\overset{\surd}{=}}$ (new!) 1.2 0.5 3.5 5.2 9.1 7.2 0.7 63.0 36.8 8.7 9.8 10.8 5.7 7.1
$L^{\#}_{\overset{\surd}{\simeq}}$ (new!) 1.2 0.5 1.7 2.7 2.0 2.1 0.7 70.6 186.5 6.0 6.1 1.7 4.8 7.4
$L^{\#}_{R,\overset{\surd}{=}}$ (new!) 1.2 0.5 1.5 0.8 1.0 0.8 0.6 69.3 38.7 3.1 2.0 1.0 4.5 7.3

Experiment 1. We evaluate the performance of $AL^{\#}$ against non-adaptive and adaptive algorithms from the literature, in particular $L^{*}$ [2], KV [15], and $L^{\#}$ [23] as well as $\partial L^{*}_{M}$ [6] and (a Mealy machine adaptation of) IKV [9]. As part of an ablation study, we compare $AL^{\#}$ with simpler variations which we refer to as $L^{\#}_{R}$, $L^{\#}_{\overset{\surd}{=}}$, $L^{\#}_{\overset{\surd}{\simeq}}$, $L^{\#}_{R,\overset{\surd}{=}}$. The subscripts indicate which rules are added:
$R$: R + PP, $\overset{\surd}{=}$: MS + MR + PS, $\overset{\surd}{\simeq}$: AMS + AMR + APS.

We learn six models from the AutomataWiki benchmarks [18] also used in [23]. We limit ourselves to six models because we mutate every model in 14 different ways (and for 30 seeds). The chosen models represent different types of protocols with varying numbers of states. We learn the mutated models using the original models, referred to as $\mathcal{S}$, as a reference. The mutations may add states, divert transitions, remove inputs, perform multiple mutations, or compose the model with a mutated version of the model. We provide details on the used models and mutations in App. 0.E.

Results. Table 2 shows for an algorithm (rows) and a mutation (columns) the total number of inputs ($\cdot 10^{6}$) necessary to learn all models, summed over all seeds (footnote 5: $\partial L^{*}_{M}$ and IKV do not support removing inputs, relevant for mutation $\textit{mut}_{7}$). The highlighted values indicate the best performing algorithm. We provide detailed pairwise comparisons between algorithms in App. 0.E.

Discussion. First, we observe that $AL^{\#}$ always outperforms non-adaptive learning algorithms, as is expected. By combining state matching and rebuilding, $AL^{\#}$ mostly outperforms algorithms from the literature, with IKV being competitive on some types of mutations. In $\textit{mut}_{9}(\mathcal{S})$, where we append $\mathcal{S}$ to $\textit{mut}_{13}(\mathcal{S})$, $L^{\#}_{\overset{\surd}{=}}$ outperforms $L^{\#}_{\overset{\surd}{\simeq}}$ because $L^{\#}_{\overset{\surd}{\simeq}}$ incorrectly matches $\textit{mut}_{13}(\mathcal{S})$ states with states in $\mathcal{S}$, making it harder to learn the $\mathcal{S}$ fragment. In the pairwise comparisons in App. 0.E, we see that $AL^{\#}$ performs much better on the GnuTLS and OpenSSH models than other adaptive approaches. We conjecture that this effect occurs because these models are hard to learn in general (high number of total inputs) and thus the potential benefit of $AL^{\#}$ is higher.

Experiment 2. We evaluate $L^{\#}$, $\partial L^{*}_{M}$, IKV and $AL^{\#}$ on benchmarks that contain reference models. Adaptive-OpenSSL [7], used in [6], contains models learned from different git development branches for the OpenSSL server side. Adaptive-Philips [19] contains models representing some legacy code which evolved over time due to bug fixes and allowing more inputs.

(a) Averaged inputs for learning Adaptive-Philips (starting with m) and Adaptive-OpenSSL.

(b) Summed inputs in millions for learning some $\mathcal{S}$.
{$\textit{mut}_{10}(\mathcal{S})$, {$\textit{mut}_{12}(\mathcal{S})$, $\textit{mut}_{10}(\mathcal{S})$, $\textit{mut}_{12}(\mathcal{S})$,
SUL {$\textit{mut}_{10}(\mathcal{S})$} {$\textit{mut}_{12}(\mathcal{S})$} $\textit{mut}_{10}(\mathcal{S})$} $\textit{mut}_{12}(\mathcal{S})$}
$\mathcal{S}$ 33.1 52.7 17.4 22.9
(c) Summed inputs in millions for learning some mutated $\mathcal{S}$.
\begin{tabular}{lrrr}
SUL & $\{\mathcal{S}\}$ & $\{\textit{mut}_{13}(\mathcal{S})\}$ & $\{\mathcal{S}, \textit{mut}_{13}(\mathcal{S})\}$ \\
\hline
$\textit{mut}_{8}(\mathcal{S})$ & 68.1 & 96.3 & \textbf{25.7} \\
$\textit{mut}_{9}(\mathcal{S})$ & 141.1 & 263.0 & \textbf{35.3} \\
$\textit{mut}_{14}(\mathcal{S})$ & 7.2 & 212.1 & \textbf{3.2} \\
\end{tabular}

Figure 4: Results of Experiments 2 and 3.

Results. Fig. 4(a) shows the mean total number of inputs required for learning a model from the associated reference model, depicting the $5^{\text{th}}$-$95^{\text{th}}$ percentile (line) and average (mark) over the seeds.

Discussion. We observe that $L^{\#}$ and $\partial L^{*}_{M}$ perform worse than $AL^{\#}$. $AL^{\#}$ often outperforms IKV by a factor of 2-4, even though these models are relatively small and thus easy to learn.

Experiment 3. We evaluate $AL^{\#}$ with one or multiple references on the models used in Experiment 1. We either (1) learn $\mathcal{S}$ using several mutations of $\mathcal{S}$ or (2) learn a mutation that represents a combination of $\mathcal{S}$ and $\textit{mut}_{13}(\mathcal{S})$.

Results. Tables 4(b) and 4(c) show, for every type of SUL (rows) and every set of references (columns), the total number of inputs ($\cdot 10^{6}$) necessary to learn all models, summed over all seeds. Highlighted values indicate the best performing set of references. Column $\{\mathcal{S}\}$ in Table 4(c) corresponds to values in row $AL^{\#}$ of Table 2; they are added in Table 4(c) for clarity.

Discussion. We observe that using multiple references outperforms using one reference, as is expected. We hypothesize that learning with reference $\textit{mut}_{13}(\mathcal{S})$ instead of $\mathcal{S}$ often leads to an increase in total inputs because $\textit{mut}_{13}(\mathcal{S})$ is less complex due to the random transitions. Therefore, discovering states belonging to the $\mathcal{S}$ fragment in $\textit{mut}_{8}(\mathcal{S})$, $\textit{mut}_{9}(\mathcal{S})$ and $\textit{mut}_{14}(\mathcal{S})$ becomes more difficult.
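To make the gain from multiple references concrete, the following Python sketch computes, for each SUL in the table on learning a mutated $\mathcal{S}$, how many times fewer inputs the two-reference set needs than the best single reference. The numbers are copied from that table; the notion of a "reduction factor" and all identifiers are assumptions introduced only for this illustration.

```python
# Summed inputs in millions, copied from the table on learning a mutated S.
# Keys "S", "mut13" and "S+mut13" name the reference sets {S}, {mut13(S)}
# and {S, mut13(S)}; these labels are illustrative only.
inputs_millions = {
    "mut8": {"S": 68.1, "mut13": 96.3, "S+mut13": 25.7},
    "mut9": {"S": 141.1, "mut13": 263.0, "S+mut13": 35.3},
    "mut14": {"S": 7.2, "mut13": 212.1, "S+mut13": 3.2},
}


def reduction_factor(row):
    """Best single-reference cost divided by the two-reference cost."""
    best_single = min(row["S"], row["mut13"])
    return best_single / row["S+mut13"]


factors = {sul: round(reduction_factor(row), 2) for sul, row in inputs_millions.items()}
print(factors)
```

Under this reading, combining both references reduces the summed inputs by a factor of roughly 2 to 4 relative to the better of the two single references.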

Experiment 4. We evaluate the performance of $AL^{\#}$ with one or multiple references on learning DTLS and TCP models from AutomataWiki. We consider seven DTLS implementations selected to have the same key exchange algorithm and certification requirement. We consider three TCP client implementations. [Footnote 6: References represent related models instead of previous models as in Experiment 2.]

Figure 5: Averaged inputs for learning $\mathcal{S}$ with multiple references.

Results. Fig. 5 shows the required inputs for learning $\mathcal{S}$ (x-axis) with only the reference model indicated by the colored data point, averaged over the seeds. For each DTLS model, we included learning $\mathcal{S}$ with $\mathcal{S}$ itself as a reference model. The $*$ mark indicates using all models except $\mathcal{S}$ as references; the $\times$ mark indicates using no references, i.e., non-adaptive $L^{\#}$.

Discussion. We observe that using all references except $\mathcal{S}$ usually performs as well as the best performing reference model that is distinct from $\mathcal{S}$. In scand-lat, using a set of references outperforms single reference models, almost matching the performance of learning $\mathcal{S}$ with $\mathcal{S}$ as a reference.

9 Conclusion

We introduced the adaptive $L^{\#}$ algorithm ($AL^{\#}$), a new algorithm for adaptive active automata learning that allows flexible use of domain knowledge in the form of (preferably similar) reference models and thereby aims to reduce the sample complexity for learning new models. Experiments show that the algorithm can lead to significant improvements over the state-of-the-art (Sec. 8).

9.0.1 Future work.

Approximate state matching is sometimes too eager and may mislead the learner, as happens for $\textit{mut}_{9}$ in Experiment 1 (Sec. 8). This may be addressed by only applying matching rules when the matching degree is above some threshold. It is currently unclear how to determine an appropriate threshold.
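The thresholding idea can be sketched as follows. This is a hypothetical illustration only: it assumes a matching degree in $[0,1]$ computed as the fraction of shared observations on which a tree state and a reference state agree, and it assumes that approximate matching selects the reference states with the highest degree. The functions `matching_degree` and `thresholded_matches` are invented for this example and are not part of $AL^{\#}$.

```python
def matching_degree(tree_obs, ref_obs):
    """Fraction of shared input words on which both states give the same output.

    Both arguments map input words (tuples) to the last output observed.
    This definition is an assumption made for the sketch.
    """
    shared = [w for w in tree_obs if w in ref_obs]
    if not shared:
        return 0.0
    agreeing = sum(tree_obs[w] == ref_obs[w] for w in shared)
    return agreeing / len(shared)


def thresholded_matches(tree_obs, reference_states, threshold):
    """Return the best-matching reference states, or nothing below the threshold."""
    degrees = {p: matching_degree(tree_obs, obs) for p, obs in reference_states.items()}
    best = max(degrees.values(), default=0.0)
    if best < threshold:
        return set()  # too weak a match: do not apply the matching rules
    return {p for p, d in degrees.items() if d == best}


# Toy data: one tree state and two reference states p0 and p1.
tree_obs = {("a",): 0, ("a", "b"): 1, ("b",): 0, ("b", "b"): 1}
reference_states = {
    "p0": {("a",): 0, ("a", "b"): 1, ("b",): 1, ("b", "b"): 1},  # degree 0.75
    "p1": {("a",): 1, ("a", "b"): 0, ("b",): 0, ("b", "b"): 0},  # degree 0.25
}

accepted = thresholded_matches(tree_obs, reference_states, threshold=0.5)
rejected = thresholded_matches(tree_obs, reference_states, threshold=0.9)
```

With a threshold of 0.5 the match with `p0` is kept, while a threshold of 0.9 suppresses matching altogether. How to pick such a value in practice is exactly the open question raised above.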

Further, adaptive methods typically perform well when the reference model and SUL are similar [12]. We would like to dynamically determine which (parts of) reference models are similar, and incorporate this in the rebuilding rule.

Adaptive AAL allows the re-use of information in the form of a Mealy machine. Other sources of information that can be re-used in AAL are, for instance, system logs, realised by combining active and passive learning [25, 1]. An interesting direction of research is the development of a more general methodology that allows the re-use of various forms of previous knowledge.

References

  • [1] Bernhard K. Aichernig, Edi Muskardin, and Andrea Pferscher. Active vs. passive: A comparison of automata learning paradigms for network protocols. In FMAS/ASYDE@SEFM, volume 371 of EPTCS, pages 1–19, 2022.
  • [2] Dana Angluin. Learning regular sets from queries and counterexamples. Inf. Comput., 75(2):87–106, 1987.
  • [3] Kousar Aslam, Loek Cleophas, Ramon R. H. Schiffelers, and Mark van den Brand. Interface protocol inference to aid understanding legacy software components. Softw. Syst. Model., 19(6):1519–1540, 2020.
  • [4] Alexander Bainczyk, Bernhard Steffen, and Falk Howar. Lifelong learning of reactive systems in practice. In The Logic of Software. A Tasting Menu of Formal Methods, volume 13360 of LNCS, pages 38–53. Springer, 2022.
  • [5] Sagar Chaki, Edmund M. Clarke, Natasha Sharygina, and Nishant Sinha. Verification of evolving software via component substitutability analysis. Formal Methods Syst. Des., 32(3):235–266, 2008.
  • [6] Carlos Diego Nascimento Damasceno, Mohammad Reza Mousavi, and Adenilso da Silva Simão. Learning to reuse: Adaptive model learning for evolving systems. In IFM, volume 11918 of LNCS, pages 138–156. Springer, 2019.
  • [7] Joeri de Ruiter. A tale of the OpenSSL state machine: A large-scale black-box analysis. In NordSec, volume 10014 of LNCS, pages 169–184, 2016.
  • [8] Joeri de Ruiter and Erik Poll. Protocol state fuzzing of TLS implementations. In USENIX Security Symposium, pages 193–206. USENIX Association, 2015.
  • [9] Tiago Ferreira, Gerco van Heerdt, and Alexandra Silva. Tree-based adaptive model learning. In A Journey from Process Algebra via Timed Automata to Model Learning, volume 13560 of LNCS, pages 164–179. Springer, 2022.
  • [10] Alex Groce, Doron A. Peled, and Mihalis Yannakakis. Adaptive model checking. Log. J. IGPL, 14(5):729–744, 2006.
  • [11] Falk Howar and Bernhard Steffen. Active automata learning in practice - an annotated bibliography of the years 2011 to 2016. In Machine Learning for Dynamic Software Analysis, volume 11026 of LNCS, pages 123–148. Springer, 2018.
  • [12] David Huistra, Jeroen Meijer, and Jaco van de Pol. Adaptive learning for learn-based regression testing. In FMICS, volume 11119 of LNCS, pages 162–177. Springer, 2018.
  • [13] Malte Isberner, Falk Howar, and Bernhard Steffen. The TTT algorithm: A redundancy-free approach to active automata learning. In RV, volume 8734 of LNCS, pages 307–322. Springer, 2014.
  • [14] Malte Isberner, Falk Howar, and Bernhard Steffen. The open-source learnlib - A framework for active automata learning. In CAV (1), volume 9206 of LNCS, pages 487–495. Springer, 2015.
  • [15] Michael J. Kearns and Umesh V. Vazirani. An Introduction to Computational Learning Theory. MIT Press, 1994. URL: https://mitpress.mit.edu/books/introduction-computational-learning-theory.
  • [16] Loes Kruger, Sebastian Junges, and Jurriaan Rot. State Matching and Multiple References in Adaptive Active Automata Learning: Supplementary Material, June 2024. doi:10.5281/zenodo.12517574.
  • [17] Edi Muskardin, Bernhard K. Aichernig, Ingo Pill, Andrea Pferscher, and Martin Tappler. AALpy: An active automata learning library. In ATVA, volume 12971 of LNCS, pages 67–73. Springer, 2021.
  • [18] Daniel Neider, Rick Smetsers, Frits W. Vaandrager, and Harco Kuppens. Benchmarks for automata learning and conformance testing. In Models, Mindsets, Meta, volume 11200 of LNCS, pages 390–416. Springer, 2018.
  • [19] Mathijs Schuts, Jozef Hooman, and Frits W. Vaandrager. Refactoring of legacy software using model learning and equivalence checking: An industrial experience report. In IFM, volume 9681 of LNCS, pages 311–325. Springer, 2016.
  • [20] Rick Smetsers, Joshua Moerman, and David N. Jansen. Minimal separating sequences for all pairs of states. In LATA, volume 9618 of LNCS, pages 181–193. Springer, 2016.
  • [21] Martin Tappler, Bernhard K. Aichernig, and Roderick Bloem. Model-based testing IoT communication via active automata learning. CoRR, abs/1904.07075, 2019.
  • [22] Frits W. Vaandrager. Model learning. Commun. ACM, 60(2):86–95, 2017.
  • [23] Frits W. Vaandrager, Bharat Garhewal, Jurriaan Rot, and Thorsten Wißmann. A new approach for active automata learning based on apartness. In TACAS (1), volume 13243 of LNCS, pages 223–243. Springer, 2022.
  • [24] Stephan Windmüller, Johannes Neubauer, Bernhard Steffen, Falk Howar, and Oliver Bauer. Active continuous quality control. In CBSE, pages 111–120. ACM, 2013.
  • [25] Nan Yang, Kousar Aslam, Ramon R. H. Schiffelers, Leonard Lensink, Dennis Hendriks, Loek Cleophas, and Alexander Serebrenik. Improving model inference in industry by combining active and passive learning. In SANER, pages 253–263. IEEE, 2019.

Appendix 0.A Additional Definition, Figure, Table and Algorithm

We define how to fold back an observation tree to a complete Mealy machine.

Definition 0.A.1

Let $\mathcal{T}$ be an observation tree for SUL $\mathcal{S}$. If each basis state has a transition for every input and each frontier state is identified with a basis state, then $\mathcal{T}$ is folded back to complete Mealy machine $\mathcal{H} = (B, I, O, q_0^{\mathcal{T}}, \delta^{\mathcal{H}}, \lambda^{\mathcal{T}})$ where for all $q \in B$ and $i \in I$:

\[
\delta^{\mathcal{H}}(q,i) =
\begin{cases}
\delta^{\mathcal{T}}(q,i) & \text{if } \delta^{\mathcal{T}}(q,i) \in B \\
q' & \text{if } \delta^{\mathcal{T}}(q,i) = r \in F \text{ and } r \text{ is identified with } q' \in B
\end{cases}
\]
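The folding step of Definition 0.A.1 can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: the dictionary encoding of $\delta^{\mathcal{T}}$ and $\lambda^{\mathcal{T}}$, the `identified_with` map, and the name `fold_back` are assumptions made for the example.

```python
def fold_back(basis, inputs, delta_t, lambda_t, identified_with):
    """Fold an observation tree back into a complete Mealy machine.

    basis:            set of basis states B
    inputs:           iterable of input symbols I
    delta_t/lambda_t: tree transition/output functions, dicts keyed by (state, input)
    identified_with:  hypothetical map from each frontier state to its basis state
    """
    delta_h, lambda_h = {}, {}
    for q in basis:
        for i in inputs:
            if (q, i) not in delta_t:
                raise ValueError(f"basis state {q!r} lacks a transition for {i!r}")
            target = delta_t[(q, i)]
            if target not in basis:
                # Frontier state: redirect to the basis state it is identified with.
                target = identified_with[target]
            delta_h[(q, i)] = target
            lambda_h[(q, i)] = lambda_t[(q, i)]
    return delta_h, lambda_h


# Toy observation tree: basis {q0, q1}, frontier {f0, f1, f2}.
basis = {"q0", "q1"}
delta_t = {("q0", "a"): "q1", ("q0", "b"): "f0", ("q1", "a"): "f1", ("q1", "b"): "f2"}
lambda_t = {("q0", "a"): 0, ("q0", "b"): 0, ("q1", "a"): 1, ("q1", "b"): 1}
identified_with = {"f0": "q0", "f1": "q0", "f2": "q1"}

delta_h, lambda_h = fold_back(basis, ["a", "b"], delta_t, lambda_t, identified_with)
```

In the toy example, every transition that leaves the basis is redirected to the identified basis state, while outputs are taken unchanged from the tree, mirroring the use of $\lambda^{\mathcal{T}}$ in the definition.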

In Fig. 6, we show the scenarios in the observation tree and the reference model necessary to apply the rebuilding, match refinement, match separation and prioritized separation rules.

(a) Rebuilding
(b) Match refinement
(c) Match separation
(d) Prioritized separation
Figure 6: Scenario in the observation tree (left) and reference model (right) required to apply the rule; additional preconditions are written below the scenario. The dashed green lines indicate that two states are matched.

In Algorithm 1, we list the rules used for $AL^{\#}$ in a non-deterministic ordering.

procedure ExtendedLSharp($P^{\mathcal{R}}, \{W_q\}^{\mathcal{R}}$)
     $\mathcal{T} \leftarrow \{q_0\}$ s.t. $\delta^{\mathcal{T}}(\varepsilon) = q_0$
     $B \leftarrow \{q_0\}$
     do $\delta^{\mathcal{T}}(q,i) \notin B$ and $\neg(q' \mathrel{\#} \delta^{\mathcal{T}}(q,i))$ for $q, q' \in B$, $i \in I$ s.t.
        $\mathsf{access}(q)i, \mathsf{access}(q') \in P^{\mathcal{R}}$ and ($\delta^{\mathcal{T}}(q, i\sigma){\uparrow}$ or $\delta^{\mathcal{T}}(q', \sigma){\uparrow}$),
        $\sigma = \mathsf{sep}(\delta^{\mathcal{R}}(\mathsf{access}^{\mathcal{T}}(q)i), \delta^{\mathcal{R}}(\mathsf{access}^{\mathcal{T}}(q')))$ $\rightarrow$   $\triangleright$ rebuilding
          OutputQuery($\mathsf{access}(q)i\sigma$)
          OutputQuery($\mathsf{access}(q')\sigma$)
     [] $r \in F$ is isolated and $\mathsf{access}(r) \in P^{\mathcal{R}}$ $\rightarrow$   $\triangleright$ prefix promotion
          $B \leftarrow B \cup \{r\}$
     [] $r \in F$ is isolated $\rightarrow$   $\triangleright$ promotion
          $B \leftarrow B \cup \{r\}$
     [] $\delta^{\mathcal{T}}(q,i){\uparrow}$, for some $q \in B$, $i \in I$ $\rightarrow$   $\triangleright$ extension
          OutputQuery($\mathsf{access}(q)i$)
     [] $\neg(r \mathrel{\#} q)$, $\neg(r \mathrel{\#} q')$, for some $r \in F$, $q, q' \in B$, $q \neq q'$ $\rightarrow$   $\triangleright$ separation
          $\sigma \leftarrow$ witness of $q \mathrel{\#} q'$
          OutputQuery($\mathsf{access}(r)\sigma$)
     [] $p \overset{\surd}{\simeq} q$ for some $q \in B$ and there is some $i \in I$ s.t.
        $\delta^{\mathcal{T}}(q,i) = r \in F$, $\neg(r \mathrel{\#} q')$ for some $q' \in B$ and
        $\neg(r \mathrel{\#} p')$ for $\delta^{\mathcal{R}}(p,i) = p'$ and
        $p' \overset{\surd}{\not\simeq} q''$ for any $q'' \in B$ $\rightarrow$   $\triangleright$ match separation
          $\sigma \leftarrow$ witness for $q' \mathrel{\#} p'$
          OutputQuery($\mathsf{access}(q)i\sigma$)
     [] $p \overset{\surd}{\simeq} q$ and $p' \overset{\surd}{\simeq} q$ for some $q \in B$ and $p, p' \in Q^{\mathcal{R}}$ with
        $\sigma = \mathsf{sep}(p, p')$ and $\delta^{\mathcal{T}}(q, \sigma){\uparrow}$ $\rightarrow$   $\triangleright$ match refinement
          OutputQuery($\mathsf{access}(q)\sigma$)
     [] $\neg(r \mathrel{\#} q')$, $\neg(r \mathrel{\#} q'')$, for some $r \in F$, $q, q', q'' \in B$ s.t.
        $\delta^{\mathcal{T}}(q,i) = r$ for some $i \in I$, $\sigma \vdash q' \mathrel{\#} q''$,
        $\sigma \in \bigcup_{p \overset{\surd}{\simeq} q} W_{\delta^{\mathcal{R}}(p,i)}$ $\rightarrow$   $\triangleright$ prioritized separation
          OutputQuery($\mathsf{access}(r)\sigma$)
     [] All $r \in F$ are identified and $\delta^{\mathcal{T}}(q,i){\downarrow}$ for all $q \in B$, $i \in I$ $\rightarrow$   $\triangleright$ equivalence
          $\mathcal{H} \leftarrow$ BuildHypothesis
          $(b, \sigma) \leftarrow$ CheckConsistency($\mathcal{H}$)
          if $b = \texttt{yes}$ then
               $(b, \rho) \leftarrow$ EquivalenceQuery($\mathcal{H}$)
               if $b = \texttt{yes}$ then: return $\mathcal{H}$
               else: $\sigma \leftarrow$ shortest prefix of $\rho$ such that $\delta^{\mathcal{H}}(q_0^{\mathcal{H}}, \sigma) \mathrel{\#} \delta^{\mathcal{T}}(q_0^{\mathcal{T}}, \sigma)$ (in $\mathcal{T}$)
          ProcCounterEx($\mathcal{H}, \sigma$)
     end do
Algorithm 1 Extended $L^{\#}$ algorithm
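As an illustration of how a single rule of Algorithm 1 acts on the observation tree, the following Python sketch implements the plain separation rule on a toy system. It is a simplified sketch under stated assumptions: the `Node` class, the helper functions, and the two-state `sul` are hypothetical and stand in for the observation tree, apartness witnesses, and output queries.

```python
class Node:
    """Observation tree state; children maps input -> (output, child node)."""

    def __init__(self):
        self.children = {}


def insert(root, inputs, outputs):
    """Record the result of an output query in the tree."""
    node = root
    for i, o in zip(inputs, outputs):
        if i not in node.children:
            node.children[i] = (o, Node())
        node = node.children[i][1]


def state_after(root, word):
    """Follow an access sequence from the root."""
    node = root
    for i in word:
        node = node.children[i][1]
    return node


def witness(a, b):
    """Return a word showing that a and b are apart, or None if there is none yet."""
    for i, (out_a, child_a) in a.children.items():
        if i in b.children:
            out_b, child_b = b.children[i]
            if out_a != out_b:
                return [i]
            rest = witness(child_a, child_b)
            if rest is not None:
                return [i] + rest
    return None


def separation_step(root, sul, access_r, access_q, access_q2):
    """Separation rule: query access(r) + sigma, where sigma witnesses q # q'."""
    q, q2 = state_after(root, access_q), state_after(root, access_q2)
    sigma = witness(q, q2)
    if sigma is None:
        raise ValueError("the two basis states are not apart")
    word = access_r + sigma
    insert(root, word, sul(word))  # the output query
    r = state_after(root, access_r)
    return witness(r, q) is not None, witness(r, q2) is not None


def sul(word):
    """Toy SUL: every input outputs the current state (0 or 1); 'a' toggles it."""
    state, outputs = 0, []
    for i in word:
        outputs.append(state)
        if i == "a":
            state = 1 - state
    return outputs


root = Node()
insert(root, ["a", "a"], sul(["a", "a"]))
# Basis: root (access []) and the state after "a"; frontier state r after "aa".
result = separation_step(root, sul, access_r=["a", "a"], access_q=[], access_q2=["a"])
```

After the query, the frontier state is apart from the second basis state but not from the root, so `result` is `(False, True)` and the frontier state can be identified with the root.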

Table 3 shows the pre- and postconditions of the approximate matching variations of the state matching rules.

Table 3: Approximate state matching rules with parameters, preconditions and postconditions.
\begin{tabular}{lllll}
 & Rule & Parameters & Precondition & Postcondition \\
\hline
Sec 5.3 & approximate match separation & $q, q' \in B$, $p \in Q^{\mathcal{R}}$, $i \in I$ & $\delta^{\mathcal{T}}(q,i) = r \in F$, $\neg(r \mathrel{\#} q')$, $\neg(r \mathrel{\#} p')$, $p \overset{\surd}{\simeq} q$, $\delta^{\mathcal{R}}(p,i) = p'$, $\neg(\exists q'' \in B \text{ s.t. } p' \overset{\surd}{\simeq} q'')$ & $r \mathrel{\#} q' \lor r \mathrel{\#} p'$ \\
 & approximate match refinement & $q \in B$, $p, p' \in Q^{\mathcal{R}}$ & $p \overset{\surd}{\simeq} q$, $p' \overset{\surd}{\simeq} q$, $\sigma = \mathsf{sep}(p, p')$, $\delta^{\mathcal{T}}(q, \sigma){\uparrow}$ & $\delta^{\mathcal{T}}(q, \sigma){\downarrow}$ \\
 & approximate prioritized separation & $r \in F$, $q', q'' \in B$ & $\neg(r \mathrel{\#} q')$, $\neg(r \mathrel{\#} q'')$, $\exists i \in I \text{ s.t. } \delta^{\mathcal{T}}(q,i) = r$, $\sigma \vdash q' \mathrel{\#} q''$, $\sigma \in \bigcup_{p \overset{\surd}{\simeq} q} W_{\delta^{\mathcal{R}}(p,i)}$ & $r \mathrel{\#} q'' \lor r \mathrel{\#} q'$ \\
\end{tabular}

Appendix 0.B Proofs of Section 4

Proof of Lemma 1

Proof

Let $q \in B$, $i \in I$ and $\sigma \in I^{*}$. Suppose

  (1) $\delta^{\mathcal{T}}(q,i) \notin B$,

  (2) $\mathsf{access}^{\mathcal{T}}(q)i \in P^{\mathcal{R}}$,

  (3) For all $q' \in B$, $\mathsf{access}^{\mathcal{T}}(q') \in P^{\mathcal{R}}$,

  (4) For all $q' \in B$, $\sigma \vdash \delta^{\mathcal{S}}(\mathsf{access}^{\mathcal{T}}(q)i) \mathrel{\#} \delta^{\mathcal{S}}(\mathsf{access}^{\mathcal{T}}(q'))$, where we write $\sigma = \mathsf{sep}\bm{(}\delta^{\mathcal{R}}(\mathsf{access}^{\mathcal{T}}(q)i), \delta^{\mathcal{R}}(\mathsf{access}^{\mathcal{T}}(q'))\bm{)}$ for conciseness.

We prove that for all $q' \in B$, $\delta^{\mathcal{T}}(q,i) \mathrel{\#} q'$ holds, either directly from assumptions (1)-(4) or because these assumptions establish the preconditions of the rebuilding rule, whose application yields the required result. Fix a specific $q' \in B$. If $\delta^{\mathcal{T}}(q,i) \mathrel{\#} q'$ already holds, we are done. From now on, assume (5) $\neg(\delta^{\mathcal{T}}(q,i) \mathrel{\#} q')$.

From (4) and (5) we derive that (6) $\delta^{\mathcal{T}}(q,i\sigma){\uparrow}$ or $\delta^{\mathcal{T}}(q',\sigma){\uparrow}$. Otherwise, $\delta^{\mathcal{T}}(q,i\sigma){\downarrow}$ and $\delta^{\mathcal{T}}(q',\sigma){\downarrow}$, which, under the assumption $\sigma \vdash \delta^{\mathcal{S}}(\mathsf{access}^{\mathcal{T}}(q)i) \mathrel{\#} \delta^{\mathcal{S}}(\mathsf{access}^{\mathcal{T}}(q'))$, implies $\delta^{\mathcal{T}}(q,i) \mathrel{\#} q'$. However, $\delta^{\mathcal{T}}(q,i) \mathrel{\#} q'$ contradicts (5).

From assumptions (1)-(3), (5) and (6), we know that rebuilding can be applied, which leads to the OQs $\mathsf{access}^{\mathcal{T}}(q)i\sigma$ and $\mathsf{access}^{\mathcal{T}}(q')\sigma$. After the OQs, we know $\delta^{\mathcal{T}}(q,i\sigma){\downarrow}$ and $\delta^{\mathcal{T}}(q',\sigma){\downarrow}$; combining this with $\sigma \vdash \delta^{\mathcal{S}}(\mathsf{access}^{\mathcal{T}}(q)i) \mathrel{\#} \delta^{\mathcal{S}}(\mathsf{access}^{\mathcal{T}}(q'))$ proves that $\delta^{\mathcal{T}}(q,i) \mathrel{\#} q'$.
Thus, for every $q' \in B$, $\delta^{\mathcal{T}}(q,i) \mathrel{\#} q'$, which is exactly the definition of isolated.
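
The role of step (6) and of the two OQs can be illustrated with a minimal Python sketch. It is not the implementation accompanying the paper: the dictionary encoding of the observation tree and the names `run` and `is_witness` are assumptions made for this illustration, and the tree and outputs are invented. The sketch shows that a separating sequence only witnesses apartness in the tree once it is defined from both states.

```python
from typing import Dict, List, Optional, Tuple

# Assumed encoding of an observation tree (a partial Mealy machine):
# state -> input symbol -> (output, successor state).
Tree = Dict[int, Dict[str, Tuple[str, int]]]


def run(tree: Tree, state: int, word: str) -> Optional[List[str]]:
    """Return the outputs produced by `word` from `state`, or None if undefined."""
    outputs = []
    for symbol in word:
        step = tree.get(state, {}).get(symbol)
        if step is None:
            return None  # the run is undefined (written with an up-arrow in the proof)
        out, state = step
        outputs.append(out)
    return outputs


def is_witness(tree: Tree, q1: int, q2: int, sigma: str) -> bool:
    """sigma witnesses q1 # q2: defined from both states, with different outputs."""
    o1, o2 = run(tree, q1, sigma), run(tree, q2, sigma)
    return o1 is not None and o2 is not None and o1 != o2


# Hypothetical tree: state 0 is a basis state (playing both q and q'),
# state 1 = delta(0, "a") is the state to be isolated.
tree: Tree = {0: {"a": ("x", 1)}, 1: {}}
sigma = "b"  # assumed separating sequence taken from the reference
before = is_witness(tree, 1, 0, sigma)  # sigma is not yet defined in the tree

# Simulate the OQs access(q) i sigma = "ab" and access(q') sigma = "b".
tree[1]["b"] = ("y", 2)
tree[0]["b"] = ("x", 3)
tree[2], tree[3] = {}, {}
after = is_witness(tree, 1, 0, sigma)  # both runs defined, outputs differ
```

Before the simulated queries, `before` is false because the runs are undefined; afterwards, `after` is true, which mirrors the final step of the proof.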

Proof of Theorem 4.1

Proof

Let $n$ be the number of equivalence classes (w.r.t. language equivalence) in the reachable part of $\mathcal{S}|_{I^{\mathcal{R}}}$. We prove that whenever the basis does not contain $n$ elements, there exists an access sequence in $P^{\mathcal{R}}$ that leads to a state that can be isolated using the rebuilding rule. Applying this argument repeatedly proves that the basis contains $n$ states once prioritized promotion and rebuilding are no longer applicable. Let $B, F, \mathcal{T}$ denote the current basis, frontier and observation tree. From the theorem statement we know:

  (1) $q_0^{\mathcal{R}}$ matches $q_0^{\mathcal{S}}$,

  (2) States can only be promoted using prioritized promotion.

We also use the following general assumptions from the paper:

  (3) $P^{\mathcal{R}}$ is minimal,

  (4) $P^{\mathcal{R}}$ is prefix-closed,

  (5) $\mathcal{R}$ and $\mathcal{S}$ are complete.

First, we note that the state cover and separating family are computed on $\mathcal{R}|_{I^{\mathcal{S}}}$, which means that both only contain sequences over the alphabet $I^{\mathcal{R}} \cap I^{\mathcal{S}}$. Because of (3), we know there are $|P^{\mathcal{R}}|$ equivalence classes in the reachable part of $\mathcal{R}|_{I^{\mathcal{S}}}$. From (1) and (5), we derive that for all $w \in (I^{\mathcal{R}} \cap I^{\mathcal{S}})^{*}$, $\lambda^{\mathcal{R}}(w) = \lambda^{\mathcal{S}}(w)$. This implies that $|P^{\mathcal{R}}| = n$.

If $|B| = n$, we are done. Otherwise, $|B| < n$. From (1), (3) and $|B| < n$, we know that some state in $Q^{\mathcal{S}}$, reachable with a sequence from $P^{\mathcal{R}}$, has not been discovered yet. Because of (4), this state must be reachable from the basis with one input symbol. In other words, there must exist a basis state $q \in B$ and $i \in I$ such that $\mathsf{access}^{\mathcal{T}}(q)i \in P^{\mathcal{R}}$ and $\delta^{\mathcal{T}}(q,i){\uparrow}$.

From (1), we know that for $\sigma = \mathsf{sep}\bm{(}\delta^{\mathcal{R}}(\mathsf{access}^{\mathcal{T}}(q)i), \delta^{\mathcal{R}}(\mathsf{access}^{\mathcal{T}}(q'))\bm{)}$ it must hold that $\sigma \vdash \delta^{\mathcal{S}}(\mathsf{access}^{\mathcal{T}}(q)i) \mathrel{\#} \delta^{\mathcal{S}}(\mathsf{access}^{\mathcal{T}}(q'))$ because $P^{\mathcal{R}}$ and $\{W_q\}^{\mathcal{R}}$ are computed on $\mathcal{R}|_{I^{\mathcal{S}}}$.
From (2), we know that for each $q' \in B$, $\mathsf{access}^{\mathcal{T}}(q') \in P^{\mathcal{R}}$.

Therefore, we can apply Lemma 1, which leads to $\delta^{\mathcal{T}}(q,i)$ being isolated. Using the prioritized promotion rule, we can add $\delta^{\mathcal{T}}(q,i)$ to the basis, increasing $|B|$ by one, and we can repeat the argument to find a new state to promote or to terminate with the required result.

Note that the precise ordering of prioritized promotion and rebuilding is irrelevant. We can never promote states that we do not want to promote. Moreover, when a state is isolated, it can never be un-isolated. Therefore, applying the rebuilding rule while prioritized promotion can be applied never leads to problems. Finally, rebuilding cannot be applied forever (see termination proof 6.1); therefore, we have to use prioritized promotion at some point.

Appendix C Proofs of Section 5

Proof of Lemma 2

Proof

Let $p \in Q^{\mathcal{R}}$, $q \in B$, $i \in I$ and $\sigma \in I^{*}$. Suppose

  (1) $\delta^{\mathcal{T}}(q,i) = r \in F$,

  (2) $\delta^{\mathcal{S}}(\mathsf{access}^{\mathcal{T}}(q)) \overset{\surd}{=} p$,

  (3) For all $q' \in B$, $\delta^{\mathcal{R}}(p,i) \overset{\surd}{\neq} q'$.

We prove that for all $q' \in B$, $r \mathrel{\#} q'$ holds. Fix a specific $q' \in B$. If $r \mathrel{\#} q'$ already holds, we are done. From now on, assume (4) $\neg(r \mathrel{\#} q')$. In general, $\overset{\surd}{=}$ is not a transitive relation; however, because $\mathcal{T}$ is an observation tree for $\mathcal{S}$, $\delta^{\mathcal{T}}(q,w) = \delta^{\mathcal{S}}(\mathsf{access}^{\mathcal{T}}(q),w)$ for all $w \in I^{*}$. Therefore, we can derive $\delta^{\mathcal{T}}(q) \overset{\surd}{=} p$ from (2). From (1)-(4), we know all preconditions required for match separation hold.
We apply the rule and execute the OQ $\mathsf{access}^{\mathcal{T}}(q)i\sigma$ with $\sigma \vdash q' \mathrel{\#} \delta^{\mathcal{R}}(p,i)$. Note that a $\sigma$ with $\sigma \vdash q' \mathrel{\#} \delta^{\mathcal{R}}(p,i)$ must exist due to (3). After the OQ, we have (5) $\delta^{\mathcal{T}}(q,i\sigma){\downarrow}$ and from (3) we derive (6) $\delta^{\mathcal{T}}(q',\sigma){\downarrow}$.
Because $\sigma \vdash q' \mathrel{\#} \delta^{\mathcal{R}}(p,i)$ and $\delta^{\mathcal{S}}(\mathsf{access}^{\mathcal{T}}(q)) \overset{\surd}{=} p$, it must be that (7) $\sigma \vdash q' \mathrel{\#} \delta^{\mathcal{S}}(\mathsf{access}^{\mathcal{T}}(q),i)$. Combining (5), (6) and (7) proves $r \mathrel{\#} q'$.

Proof of Theorem 5.1

Proof

Let $\mathcal{S}$ be the SUL and $\mathcal{R}$ the reference, with $\mathcal{S}$ and $\mathcal{R}$ both complete Mealy machines. Moreover, let $\mathcal{S}$ be equivalent to $\mathcal{R}$, except that $\mathcal{S}$ possibly has a different initial state. From this, we derive that (1) there exists a state $p \in Q^{\mathcal{R}}$ such that $q_0^{\mathcal{S}}$ is language equivalent to $p$. Let $n$ be the number of states in the reachable part of $\mathcal{S}$. Let $B, F, \mathcal{T}$ denote the current basis, frontier and observation tree.

We prove that if $|B| < n$, then we can add some state to the basis after applying match refinement, match separation, promotion and extension until none of them are applicable anymore. This trivially terminates when we reach $|B| = n$.

Suppose $|B| < n$. There must be some $q \in B$ and $i \in I$ such that (2) $\delta^{\mathcal{S}}(\mathsf{access}^{\mathcal{T}}(q),i)$ represents an equivalence class that is different from the equivalence classes $\delta^{\mathcal{S}}(\mathsf{access}^{\mathcal{T}}(q'))$ for all $q' \in B$. We perform a case distinction on the location of $\delta^{\mathcal{T}}(q,i)$ in the current observation tree.

  • Suppose $\delta^{\mathcal{T}}(q,i) \in B$; this immediately contradicts (2).

  • Suppose $\delta^{\mathcal{T}}(q,i){\uparrow}$; then we can apply the extension rule, resulting in $\delta^{\mathcal{T}}(q,i){\downarrow}$.

  • Suppose $\delta^{\mathcal{T}}(q,i){\downarrow}$ and $\delta^{\mathcal{T}}(q,i)$ is isolated; then we can apply promotion.

  • Suppose (3) $\delta^{\mathcal{T}}(q,i){\downarrow}$ and $\delta^{\mathcal{T}}(q,i)$ is not isolated. Moreover, from (1) we derive that there exists a state $p' \in Q^{\mathcal{R}}$ such that (4) $\delta^{\mathcal{R}}(p,\mathsf{access}^{\mathcal{T}}(q)) = p'$ and (5) $p'$ is language equivalent to $\delta^{\mathcal{S}}(\mathsf{access}^{\mathcal{T}}(q))$. We perform a case distinction based on whether $\delta^{\mathcal{R}}(p',i) \overset{\surd}{=} q'$ for some $q' \in B$ and show that in each case we can derive a contradiction or apply a rule to make progress.

    • Suppose (6) there exists a $q' \in B$ such that $\delta^{\mathcal{R}}(p',i) \overset{\surd}{=} q'$. From (1) we derive that (7) there must exist some state $p'' \in Q^{\mathcal{R}}$ that is language equivalent to $\delta^{\mathcal{S}}(\mathsf{access}^{\mathcal{T}}(q'))$.

      We derive that (8) state $\delta^{\mathcal{R}}(p',i)$ represents an equivalence class that is different from the equivalence classes $\delta^{\mathcal{S}}(\mathsf{access}^{\mathcal{T}}(q'))$ for all $q' \in B$, because $\delta^{\mathcal{R}}(p',i)$ is language equivalent to $\delta^{\mathcal{S}}(\mathsf{access}^{\mathcal{T}}(q),i)$ (derived from (5)) and because of (2). Moreover, (9) $p''$ is language equivalent to a state already in the basis, by (7). By combining (8) and (9), we find that $\delta^{\mathcal{R}}(p',i) \mathrel{\#} p''$.
      Moreover, because $\delta^{\mathcal{R}}(p',i)$ and $p''$ are in $Q^{\mathcal{R}}$ and they represent different equivalence classes, the sequence $\sigma = \mathsf{sep}(\delta^{\mathcal{R}}(p',i),p'')$ exists. This means we can apply match refinement with $\delta^{\mathcal{R}}(p',i)$ and $p''$, resulting in $\delta^{\mathcal{R}}(p',i) \overset{\surd}{\neq} q'$ because otherwise (7) leads to a contradiction.
      This reasoning can be applied for any $q' \in B$ such that $\delta^{\mathcal{R}}(p',i) \overset{\surd}{=} q'$, resulting in $\delta^{\mathcal{R}}(p',i) \overset{\surd}{\neq} q'$ for all $q' \in B$ after multiple applications of match refinement. In this case, we can continue with the case below.

    • Suppose there does not exist a $q' \in B$ such that $\delta^{\mathcal{R}}(p',i) \overset{\surd}{=} q'$. In this case, we can apply Lemma 2 with $p', q, i$. This results in $\delta^{\mathcal{T}}(q,i)$ being isolated. We can apply promotion, which increases the size of the basis.

  • Suppose $\delta^{\mathcal{T}}(q,i) \notin B$ and $\delta^{\mathcal{T}}(q,i) \notin F$; this contradicts the assumption that $q \in B$ and $i \in I$.
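
The case distinction above can be read as a small decision procedure. The following Python sketch makes that reading explicit. It only mirrors the structure of the argument and is not the scheduling used by any actual implementation; the function `next_rule` and its three boolean parameters are hypothetical names introduced for this illustration.

```python
from enum import Enum, auto


class Rule(Enum):
    EXTENSION = auto()
    PROMOTION = auto()
    MATCH_REFINEMENT = auto()
    MATCH_SEPARATION = auto()


def next_rule(defined: bool, isolated: bool, successor_matches_basis: bool) -> Rule:
    """Pick a rule that makes progress for the successor delta^T(q, i).

    defined: whether delta^T(q, i) is defined in the observation tree.
    isolated: whether delta^T(q, i) is apart from every basis state.
    successor_matches_basis: whether delta^R(p', i) still matches some basis state.
    """
    if not defined:
        return Rule.EXTENSION  # second case: make delta^T(q, i) defined
    if isolated:
        return Rule.PROMOTION  # third case: the basis grows
    if successor_matches_basis:
        return Rule.MATCH_REFINEMENT  # first sub-case: rule out the spurious match
    return Rule.MATCH_SEPARATION  # second sub-case: Lemma 2, then promotion
```

The first and last bullets do not appear in the sketch because they end in a contradiction rather than in a rule application.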

Note that the precise ordering of the promotion, extension, match refinement and match separation is irrelevant. We discuss the reasoning for each rule.

Promotion

States that are isolated can never become un-isolated, therefore, applying other rules before promotion can never lead to problems.

Extension

If we apply other rules before applying extension, then either extension is no longer necessary or we can still apply it. Both cases lead to $\delta^{\mathcal{T}}(q,i){\downarrow}$.

Match refinement

The only goal of match refinement is to refine the matching. If two reference states match a basis state, we can perform an OQ that leads to one of the reference states no longer being a match. If we apply one of the other rules before match refinement which already leads to this result, then we do not have to perform match refinement but obtain the same result.

Match separation

In this theorem, the match separation rule always leads to a new apartness pair between the frontier state and the basis. If some other rule already shows the required apartness pair, we do not have to apply match separation but obtain the same result.

Proof of Lemma 3

Proof

Let $\mathcal{T}$ be an observation tree, $\mathcal{R}$ a reference model, $q \in Q^{\mathcal{T}}$ and $p \in Q^{\mathcal{R}}$. Suppose $\mathsf{mdeg}(q,p) = 1$. In particular, for all $w \in (I^{\mathcal{T}} \cap I^{\mathcal{R}})^{*}$ and $i \in I^{\mathcal{R}} \cap I^{\mathcal{T}}$ such that $\delta^{\mathcal{T}}(q, wi){\downarrow}$,

\[
\lambda^{\mathcal{T}}(\delta^{\mathcal{T}}(q,w), i) = \lambda^{\mathcal{R}}(\delta^{\mathcal{R}}(p,w), i).
\]

This is equivalent to the following: for all $v \in (I^{\mathcal{T}} \cap I^{\mathcal{R}})^{*}$ such that $\delta^{\mathcal{T}}(q,v){\downarrow}$,

\[
\lambda^{\mathcal{T}}(q,v) = \lambda^{\mathcal{R}}(p,v).
\]

Because we assume reference models are complete w.r.t. their own alphabet $I^{\mathcal{R}}$, the reference is complete w.r.t. $I^{\mathcal{R}} \cap I^{\mathcal{T}}$. This implies that for all $v \in (I^{\mathcal{T}} \cap I^{\mathcal{R}})^{*}$ such that $\delta^{\mathcal{T}}(q,v){\downarrow}$ and $\delta^{\mathcal{R}}(p,v){\downarrow}$,

\[
\lambda^{\mathcal{T}}(q,v) = \lambda^{\mathcal{R}}(p,v),
\]

which is precisely $q \overset{\surd}{=} p$.

Appendix 0.D Proofs of Section 6

Before we prove the complexity of $AL^{\#}$, we state and prove an additional termination theorem. Termination could be shown by proving that each rule lowers a ranking function. To stay consistent with the $L^{\#}$ complexity proof [23], we instead prove that each rule increases a norm and that this norm is bounded by the SUL. Specifically, we use the norm $N(\mathcal{T})$:

\[
N(\mathcal{T}) = N_{L^{\#}}(\mathcal{T}) + |N_{(B \times Q^{\mathcal{R}} \times Q^{\mathcal{R}})\downarrow}(\mathcal{T})| + |N_{(B \cup F) \mathrel{\#} Q^{\mathcal{R}}}(\mathcal{T})| + |N_{(B \times F)\downarrow}(\mathcal{T})| \tag{1}
\]

where $N_{L^{\#}}(\mathcal{T})$ indicates the slightly adapted norm from [23]. The abbreviations for the summands are defined as follows.

\begin{align*}
N_{L^{\#}}(\mathcal{T}) &= |B|(|B|+1) + |\{(q,i) \in B \times I \mid \delta(q,i){\downarrow}\}| + |\{(q,r) \in B \times F \mid q \mathrel{\#} r\}| \\
N_{(B \times Q^{\mathcal{R}} \times Q^{\mathcal{R}})\downarrow}(\mathcal{T}) &= \{(q,p,p') \in B \times Q^{\mathcal{R}} \times Q^{\mathcal{R}} \mid \delta(q,\sigma){\downarrow} \text{ with } \sigma = \mathsf{sep}(p,p')\} \\
N_{(B \cup F) \mathrel{\#} Q^{\mathcal{R}}}(\mathcal{T}) &= \{(q,p) \in (B \cup F) \times Q^{\mathcal{R}} \mid q \mathrel{\#} p\} \\
N_{(B \times F)\downarrow}(\mathcal{T}) &= \{(q,r) \in B \times F \mid \delta(q,\sigma){\downarrow} \land \delta(r,\sigma){\downarrow} \text{ with } \\
&\qquad \sigma = \mathsf{sep}\bigl(\delta^{\mathcal{R}}(\mathsf{access}^{\mathcal{T}}(q)), \delta^{\mathcal{R}}(\mathsf{access}^{\mathcal{T}}(r))\bigr)\}
\end{align*}

The summand $N_{(B \times Q^{\mathcal{R}} \times Q^{\mathcal{R}})\downarrow}(\mathcal{T})$ keeps track of which separating sequences of the reference model have been applied to basis states in the new observation tree. The summand $N_{(B \cup F) \mathrel{\#} Q^{\mathcal{R}}}(\mathcal{T})$ keeps track of the pairs of a basis or frontier state and a reference model state that are known to be apart, i.e., that do not match. The summand $N_{(B \times F)\downarrow}(\mathcal{T})$ keeps track of separating sequences from the reference model applied to pairs of basis and frontier states. These summands are motivated by the postconditions in Table 1.
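
As a reading aid, the norm (1) can be evaluated mechanically once the relevant sets are given. The Python sketch below does this for a toy observation tree. The function names and the encoding of $\delta$, the apartness relation and the three additional sets as plain Python containers are assumptions made for illustration; in particular, the three new sets are passed in explicitly rather than derived from a reference model.

```python
from itertools import product


def l_sharp_part(basis, frontier, inputs, delta, apart):
    """N_{L#}(T): |B|(|B|+1), plus the number of defined basis transitions,
    plus the number of (basis, frontier) pairs known to be apart."""
    n_q = len(basis) * (len(basis) + 1)
    n_defined = sum(1 for q, i in product(basis, inputs) if (q, i) in delta)
    n_apart = sum(1 for q, r in product(basis, frontier) if (q, r) in apart)
    return n_q + n_defined + n_apart


def norm(basis, frontier, inputs, delta, apart,
         ref_seps_applied, ref_apart, pair_seps_applied):
    """N(T) as in (1): the L# part plus the cardinalities of the three
    additional sets (given explicitly here; an assumption of this sketch)."""
    return (l_sharp_part(basis, frontier, inputs, delta, apart)
            + len(ref_seps_applied) + len(ref_apart) + len(pair_seps_applied))


# Toy tree: basis {0}, frontier {1}, one defined transition, one apartness pair.
basis, frontier, inputs = {0}, {1}, {"a", "b"}
delta = {(0, "a"): 1}   # (state, input) -> successor
apart = {(0, 1)}        # basis state 0 is apart from frontier state 1
ref_apart = {(1, "p0")}  # frontier state 1 is apart from a hypothetical reference state p0

print(norm(basis, frontier, inputs, delta, apart, set(), ref_apart, set()))
```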

Theorem 0.D.1

Every rule application in $AL^{\#}$ increases norm $N(\mathcal{T})$.

Proof

Let $B, F, \mathcal{T}$ denote the values before and $B', F', \mathcal{T}'$ denote the values after the respective rule application. Let $\mathcal{R}$ denote the reference model. We reuse abbreviations from [23] and the norm definition above:

\begin{align*}
N_{Q}(\mathcal{T}) &= |B| \cdot (|B|+1) \\
N_{\downarrow}(\mathcal{T}) &= \{(q,i) \in B \times I \mid \delta(q,i){\downarrow}\} \\
N_{\#}(\mathcal{T}) &= \{(q,r) \in B \times F \mid q \mathrel{\#} r\}
\end{align*}

The proof that the rules promotion, extension, separation and equivalence increase $N_{Q}(\mathcal{T}) + |N_{\downarrow}(\mathcal{T})| + |N_{\#}(\mathcal{T})|$ is similar to the proof in [23]. However, we slightly adapted $N_{Q}(\mathcal{T})$, which influences the proofs for promotion and separation, and we assume stronger guarantees for equivalence. It remains to show that, combined with the new summands, the total norm still increases. Therefore, we include the proofs for the $L^{\#}$ rules here.

Rebuilding

Let $q, q' \in B$, $i \in I$ and $\sigma \in I^{*}$. We assume

  1. $\delta^{\mathcal{T}}(q,i) \notin B$,

  2. $\neg(q' \mathrel{\#} \delta^{\mathcal{T}}(q,i))$,

  3. $\mathsf{access}^{\mathcal{T}}(q), \mathsf{access}^{\mathcal{T}}(q)i \in P^{\mathcal{R}}$,

  4. $\delta^{\mathcal{T}}(q,i\sigma){\uparrow}$ or $\delta^{\mathcal{T}}(q',\sigma){\uparrow}$,

  5. $\sigma = \mathsf{sep}\bigl(\delta^{\mathcal{R}}(\mathsf{access}^{\mathcal{T}}(q)i), \delta^{\mathcal{R}}(\mathsf{access}^{\mathcal{T}}(q'))\bigr)$.

The algorithm performs the two OQs $\mathsf{access}^{\mathcal{T}}(q)i\sigma$ and $\mathsf{access}^{\mathcal{T}}(q')\sigma$. After these OQs, $\delta^{\mathcal{T}}(q,i\sigma)$ and $\delta^{\mathcal{T}}(q',\sigma)$ are defined. In particular, because of assumption (1) and $q \in B$, we know $\delta^{\mathcal{T}}(q,i) \in F$ after the OQs. Combining this with (4), we find

\[
N_{(B \times F)\downarrow}(\mathcal{T}') \supseteq N_{(B \times F)\downarrow}(\mathcal{T}) \cup \{(q', \delta^{\mathcal{T}}(q,i))\}
\]

Note that we implicitly use (3) and (5) to ensure that $\sigma$ exists. In some cases we might find that $\delta^{\mathcal{T}}(q,i) \mathrel{\#} q'$ which, together with (2), indicates

\[
N_{\#}(\mathcal{T}') \supseteq N_{\#}(\mathcal{T}) \cup \{(q', \delta^{\mathcal{T}}(q,i))\}
\]

Otherwise $N_{\#}(\mathcal{T}') \supseteq N_{\#}(\mathcal{T})$. Additionally,

\begin{align*}
N_{Q}(\mathcal{T}') &= N_{Q}(\mathcal{T}) \\
N_{\downarrow}(\mathcal{T}') &\supseteq N_{\downarrow}(\mathcal{T}) \\
N_{(B \cup F) \mathrel{\#} Q^{\mathcal{R}}}(\mathcal{T}') &\supseteq N_{(B \cup F) \mathrel{\#} Q^{\mathcal{R}}}(\mathcal{T}) \\
N_{(B \times Q^{\mathcal{R}} \times Q^{\mathcal{R}})\downarrow}(\mathcal{T}') &\supseteq N_{(B \times Q^{\mathcal{R}} \times Q^{\mathcal{R}})\downarrow}(\mathcal{T})
\end{align*}

Thus, $N(\mathcal{T}') \geq N(\mathcal{T}) + 1$.

Prioritized promotion

Let $r \in F$. Suppose (1) $r$ is isolated and suppose (2) $\mathsf{access}^{\mathcal{T}}(r) \in P^{\mathcal{R}}$. State $r$ is moved from $F$ to $B$, i.e. $B' := B \cup \{r\}$. Then we have

\begin{align*}
N_{Q}(\mathcal{T}') &= |B'| \cdot (|B'|+1) = (|B|+1) \cdot (|B|+1+1) \\
&= (|B|+1) \cdot |B| + 2(|B|+1) = N_{Q}(\mathcal{T}) + 2|B| + 2
\end{align*}

Because we move $r$ from the frontier to the basis, we find

\begin{align*}
N_{\#}(\mathcal{T}') &\supseteq N_{\#}(\mathcal{T}) \setminus (B \times \{r\}) \\
N_{(B \times F)\downarrow}(\mathcal{T}') &\supseteq N_{(B \times F)\downarrow}(\mathcal{T}) \setminus (B \times \{r\})
\end{align*}

and thus

\begin{align*}
|N_{\#}(\mathcal{T}')| &\geq |N_{\#}(\mathcal{T})| - |B| \\
|N_{(B \times F)\downarrow}(\mathcal{T}')| &\geq |N_{(B \times F)\downarrow}(\mathcal{T})| - |B|
\end{align*}

Finally,

\begin{align*}
N_{\downarrow}(\mathcal{T}') &\supseteq N_{\downarrow}(\mathcal{T}) \\
N_{(B \times Q^{\mathcal{R}} \times Q^{\mathcal{R}})\downarrow}(\mathcal{T}') &\supseteq N_{(B \times Q^{\mathcal{R}} \times Q^{\mathcal{R}})\downarrow}(\mathcal{T}) \\
N_{(B \cup F) \mathrel{\#} Q^{\mathcal{R}}}(\mathcal{T}') &\supseteq N_{(B \cup F) \mathrel{\#} Q^{\mathcal{R}}}(\mathcal{T})
\end{align*}

The total norm increases because

\[
N(\mathcal{T}') \geq N(\mathcal{T}) + 2|B| + 2 - |B| - |B| \geq N(\mathcal{T}) + 2
\]

Promotion

Analogous to the proof for prioritized promotion.
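
The arithmetic behind the promotion bound can be checked numerically. The following Python sketch uses hypothetical helper names (`n_q`, `worst_case_gain`) and assumes the worst case from the proof above, in which exactly $|B|$ pairs are lost from each of $N_{\#}$ and $N_{(B \times F)\downarrow}$ and no other summand grows.

```python
def n_q(basis_size):
    """N_Q as defined above: |B| * (|B| + 1)."""
    return basis_size * (basis_size + 1)


def worst_case_gain(basis_size):
    """Lower bound on N(T') - N(T) for a promotion with |B| = basis_size."""
    gain = n_q(basis_size + 1) - n_q(basis_size)  # growth of N_Q: 2|B| + 2
    loss = 2 * basis_size  # at most |B| pairs leave N_# and |B| leave N_(BxF)-down
    return gain - loss


for b in range(1, 6):
    print(b, n_q(b + 1) - n_q(b), worst_case_gain(b))
```

The factor $|B|(|B|+1)$ in $N_Q$ is what makes the gain $2|B|+2$ large enough to absorb both losses of size $|B|$.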

Extension

Let $\delta^{\mathcal{T}}(q,i){\uparrow}$ for some $q \in B$, $i \in I$. After OQ $\mathsf{access}^{\mathcal{T}}(q)i$, we get $N_{L^{\#}}(\mathcal{T}') \geq N_{L^{\#}}(\mathcal{T}) + 1$ from [23]. Additionally,

\begin{align*}
N_{(B \times Q^{\mathcal{R}} \times Q^{\mathcal{R}})\downarrow}(\mathcal{T}') &\supseteq N_{(B \times Q^{\mathcal{R}} \times Q^{\mathcal{R}})\downarrow}(\mathcal{T}) \\
N_{(B \times F)\downarrow}(\mathcal{T}') &\supseteq N_{(B \times F)\downarrow}(\mathcal{T}) \\
N_{(B \cup F) \mathrel{\#} Q^{\mathcal{R}}}(\mathcal{T}') &\supseteq N_{(B \cup F) \mathrel{\#} Q^{\mathcal{R}}}(\mathcal{T})
\end{align*}

and thus $N(\mathcal{T}') \geq N(\mathcal{T}) + 1$.

Separation

Consider a state $r \in F$ and distinct $q, q' \in B$ with $\neg(r \mathrel{\#} q)$ and $\neg(r \mathrel{\#} q')$. After OQ $\mathsf{access}^{\mathcal{T}}(q)\sigma$, we have $N_{L^{\#}}(\mathcal{T}') \geq N_{L^{\#}}(\mathcal{T}) + 1$ from [23]. Additionally,

\begin{align*}
N_{(B \times Q^{\mathcal{R}} \times Q^{\mathcal{R}})\downarrow}(\mathcal{T}') &\supseteq N_{(B \times Q^{\mathcal{R}} \times Q^{\mathcal{R}})\downarrow}(\mathcal{T}) \\
N_{(B \times F)\downarrow}(\mathcal{T}') &\supseteq N_{(B \times F)\downarrow}(\mathcal{T}) \\
N_{(B \cup F) \mathrel{\#} Q^{\mathcal{R}}}(\mathcal{T}') &\supseteq N_{(B \cup F) \mathrel{\#} Q^{\mathcal{R}}}(\mathcal{T})
\end{align*}

Thus, $N(\mathcal{T}') \geq N(\mathcal{T}) + 1$.

Match separation

Let $q \in B$, $p \in Q^{\mathcal{R}}$, $i \in I$, $\sigma \in I^{*}$, $\delta^{\mathcal{T}}(q,i) = r \in F$ and $\delta^{\mathcal{R}}(p,i) = p'$. Suppose

  1. $q \overset{\surd}{\simeq} p$,

  2. $\neg(r \mathrel{\#} p')$,

  3. There is no $q'' \in B$ such that $p' \overset{\surd}{\simeq} q''$,

  4. There exists $q' \in B$ such that $\neg(r \mathrel{\#} q')$ and $\sigma \vdash p' \mathrel{\#} q'$.

After OQ $\mathsf{access}^{\mathcal{T}}(q)\, i\, \sigma$ we find either

$r \mathrel{\#} q' \quad \text{or} \quad r \mathrel{\#} p'$

If $r \mathrel{\#} q'$, then

$N_{\#}(\mathcal{T}') \supseteq N_{\#}(\mathcal{T}) \cup \{(q', r)\} \qquad N_{(B \cup F) \mathrel{\#} Q^{\mathcal{R}}}(\mathcal{T}') \supseteq N_{(B \cup F) \mathrel{\#} Q^{\mathcal{R}}}(\mathcal{T})$

If $r \mathrel{\#} p'$, then

$N_{\#}(\mathcal{T}') \supseteq N_{\#}(\mathcal{T}) \qquad N_{(B \cup F) \mathrel{\#} Q^{\mathcal{R}}}(\mathcal{T}') \supseteq N_{(B \cup F) \mathrel{\#} Q^{\mathcal{R}}}(\mathcal{T}) \cup \{(r, p')\}$

Additionally,

$N_{Q}(\mathcal{T}') = N_{Q}(\mathcal{T})$
$N_{\downarrow}(\mathcal{T}') \supseteq N_{\downarrow}(\mathcal{T})$
$N_{(B \times F)\mathord{\downarrow}}(\mathcal{T}') \supseteq N_{(B \times F)\mathord{\downarrow}}(\mathcal{T})$
$N_{B \times Q^{\mathcal{R}} \times Q^{\mathcal{R}}}(\mathcal{T}') \supseteq N_{B \times Q^{\mathcal{R}} \times Q^{\mathcal{R}}}(\mathcal{T})$

Thus, $N(\mathcal{T}') \geq N(\mathcal{T}) + 1$.

Match refinement

Let $q \in B$ and $p, p' \in Q^{\mathcal{R}}$. Suppose $q \overset{\surd}{\simeq} p$ and $q \overset{\surd}{\simeq} p'$. Note that when using approximate matching this does not imply that $\neg(q \mathrel{\#} p)$ and $\neg(q \mathrel{\#} p')$. After OQ $\mathsf{access}^{\mathcal{T}}(q)\, \sigma$ with $\sigma = \mathsf{sep}(p, p')$, we find

$N_{B \times Q^{\mathcal{R}} \times Q^{\mathcal{R}}}(\mathcal{T}') \supseteq N_{B \times Q^{\mathcal{R}} \times Q^{\mathcal{R}}}(\mathcal{T}) \cup \{(q, p, p')\}$

Additionally,

$N_{\downarrow}(\mathcal{T}') \supseteq N_{\downarrow}(\mathcal{T})$
$N_{\#}(\mathcal{T}') \supseteq N_{\#}(\mathcal{T})$
$N_{(B \times F)\mathord{\downarrow}}(\mathcal{T}') \supseteq N_{(B \times F)\mathord{\downarrow}}(\mathcal{T})$
$N_{(B \cup F) \mathrel{\#} Q^{\mathcal{R}}}(\mathcal{T}') \supseteq N_{(B \cup F) \mathrel{\#} Q^{\mathcal{R}}}(\mathcal{T})$

Because $N_{Q}$ remains unchanged, we have $N(\mathcal{T}') \geq N(\mathcal{T}) + 1$.

Prioritized separation

The proof is analogous to the proof for separation: the additional condition on $\sigma$ does not change the postcondition.

Equivalence

Suppose all $r \in F$ are identified and for all $q \in B$ and $i \in I$, $\delta^{\mathcal{T}}(q,i){\downarrow}$. These conditions are stronger than the conditions from [23]. Therefore, we know at least $N_{L^{\#}}(\mathcal{T}') \geq N_{L^{\#}}(\mathcal{T}) + 1$ holds. Additionally,

$N_{(B \times Q^{\mathcal{R}} \times Q^{\mathcal{R}})\mathord{\downarrow}}(\mathcal{T}') \supseteq N_{(B \times Q^{\mathcal{R}} \times Q^{\mathcal{R}})\mathord{\downarrow}}(\mathcal{T})$
$N_{(B \times F)\mathord{\downarrow}}(\mathcal{T}') \supseteq N_{(B \times F)\mathord{\downarrow}}(\mathcal{T})$
$N_{(B \cup F) \mathrel{\#} Q^{\mathcal{R}}}(\mathcal{T}') \supseteq N_{(B \cup F) \mathrel{\#} Q^{\mathcal{R}}}(\mathcal{T})$

Thus, $N(\mathcal{T}') \geq N(\mathcal{T}) + 1$.

Proof of Theorem 6.1

Proof

First, we prove that if $\mathcal{T}$ is an observation tree for $\mathcal{S}$, then

$$\begin{aligned}
N(\mathcal{T}) &\leq n(n+1) + kn + (n-1)(kn+1) + no^{2} + (kn+1)o + n(kn+1) \\
&\in \mathcal{O}(kn^{2} + kno + no^{2})
\end{aligned}$$

The first part $n(n+1) + kn + (n-1)(kn+1)$ follows from Theorem 3.9 in [23] with some minor adjustments. The set $B$ contains at most $n$ elements and $Q^{\mathcal{R}}$ contains at most $o$ elements. Each state in $Q^{\mathcal{R}}$ can be apart from at most $|Q^{\mathcal{R}}| - 1$ other states in $Q^{\mathcal{R}}$. Therefore,

$|\{(q, p, p') \in B \times Q^{\mathcal{R}} \times Q^{\mathcal{R}} \mid \delta(q, \sigma)\mathord{\downarrow} \text{ with } \sigma = \mathsf{sep}(p, p')\}| \leq no(o-1) \leq no^{2}$

Since the set $B \cup F$ contains at most $kn+1$ elements and each state in $B \cup F$ can be apart from at most $o$ states from $Q^{\mathcal{R}}$, we have

$|\{(q, q') \in (B \cup F) \times Q^{\mathcal{R}} \mid q \mathrel{\#} q'\}| \leq o(kn+1)$

The set $F$ contains at most $kn$ elements and each pair $(q, r) \in B \times F$ has at most one $\sigma = \mathsf{sep}(\delta^{\mathcal{R}}(\mathsf{access}^{\mathcal{T}}(q)), \delta^{\mathcal{R}}(\mathsf{access}^{\mathcal{T}}(r)))$, thus we have

$|N_{(B \times F)\mathord{\downarrow}}| \leq kn^{2}$

Combining everything and simplifying leads to

$N(\mathcal{T}) \in \mathcal{O}(kn^{2} + kno + no^{2})$

The ordering on the rules never blocks the algorithm, and when the norm $N(\mathcal{T})$ cannot be increased further, the only applicable rule is the equivalence rule, which is guaranteed to lead to the teacher accepting the hypothesis. Therefore, the correct Mealy machine is learned within $\mathcal{O}(kn^{2} + kno + no^{2})$ rule applications.

In $AL^{\#}$, every (non-terminating) application of the equivalence rule leads to a new basis state. Since the basis is bounded by the number of states in the SUL, which is $n$, there can be at most $n-1$ applications of the equivalence rule. Each call to ProcCounterEx requires at most $\log m$ output queries (see Theorem 3.11 of [23]).

All rules, except for the equivalence rule, require at most two OQs per rule application. Therefore, the application of these rules requires $\mathcal{O}(kn^{2} + kno + no^{2})$ OQs. Combining everything, we find that $AL^{\#}$ requires $\mathcal{O}(kn^{2} + kno + no^{2} + n \log m)$ OQs and at most $n-1$ EQs.

Appendix 0.E Additional Experiment Information

0.E.1 Experiment Models

In Experiments 1 and 3, we use the following six models, available here under Mealy machine benchmarks.

  • learnresult_fix

  • DropBear

  • OpenSSH

  • model1

  • NSS_3.17.4_server_regular

  • GnuTLS_3.3.8_client_full

Due to the mutations, this means that the largest model that we can learn has 62 states ($\textit{mut}_{8}(\mathcal{S})$). In Experiment 2, we use the ordering for the Adaptive-OpenSSL models as implied by Fig. 5 in [6]. The ordering taken for the Adaptive-Philips is chronological. In Experiment 4, we use the client TCP models. Additionally, we use the following DTLS models.

  • ctinydtls_ecdhe_cert_req.dot

  • etinydtls_ecdhe_cert_req.dot

  • gnutls-3.6.7_all_cert_req.dot

  • mbedtls_all_cert_req.dot

  • scandium-2.0.0_ecdhe_cert_req.dot

  • scandium_latest_ecdhe_cert_req.dot

  • wolfssl-4.0.0_dhe_ecdhe_rsa_cert_req.dot

0.E.2 Mutation Explanations

In this section, we call the input Mealy machine $\mathcal{S} = (Q, I, O, q_0, \delta, \lambda)$. Every mutation is applied exactly once to generate the mutated model.

$\textit{mut}_{1}$: New initial state. This mutation adds a new initial state called dummy as well as a fresh symbol $i$ to $\mathcal{S}$. From state dummy, all $i \in I$ self loop with the output from $q_0$ (the previous initial state). The fresh symbol transitions from dummy to $q_0$. The fresh symbol self loops in all (other) states $q \in Q$ with the output of $\lambda(q_0, i_0)$ where $i_0$ is the first input in the alphabet.
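A minimal sketch of $\textit{mut}_{1}$ under stated assumptions: the Mealy machine is represented by plain dictionaries keyed on `(state, input)`, the names `mut1`, `fresh` and `dummy` are hypothetical, and the output on the fresh transition from dummy to $q_0$ (not specified above) is assumed to be $\lambda(q_0, i_0)$ as well.

```python
def mut1(states, inputs, q0, delta, lam, fresh="fresh", dummy="dummy"):
    """Add a dummy initial state and a fresh input symbol (sketch of mut_1)."""
    if fresh in inputs or dummy in states:
        raise ValueError("fresh symbol and dummy state must be new")
    default_out = lam[(q0, inputs[0])]  # lambda(q0, i0), i0 = first input
    new_delta, new_lam = dict(delta), dict(lam)
    for i in inputs:
        # Original inputs self loop in dummy with the output they have in q0.
        new_delta[(dummy, i)] = dummy
        new_lam[(dummy, i)] = lam[(q0, i)]
    # The fresh symbol leads from dummy to the previous initial state.
    new_delta[(dummy, fresh)] = q0
    new_lam[(dummy, fresh)] = default_out  # assumption: output not given in the text
    for q in states:
        # The fresh symbol self loops in all original states.
        new_delta[(q, fresh)] = q
        new_lam[(q, fresh)] = default_out
    return [dummy] + list(states), list(inputs) + [fresh], dummy, new_delta, new_lam
```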

$\textit{mut}_{2}$: Change the initial state. This mutation randomly selects one of the states in $Q$ as the new initial state. Because $\mathcal{S}$ is not necessarily strongly connected, the number of states in the resulting Mealy machine might be lower.
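The drop in the number of states can be illustrated with a small sketch. It assumes, as one reading of the description above, that states unreachable from the new initial state are discarded; the dictionary representation and the names `reachable_part` and `mut2` are hypothetical.

```python
import random


def reachable_part(new_q0, inputs, delta, lam):
    """Restrict the machine to the states reachable from new_q0."""
    seen = [new_q0]
    stack = [new_q0]
    while stack:
        q = stack.pop()
        for i in inputs:
            target = delta[(q, i)]
            if target not in seen:
                seen.append(target)
                stack.append(target)
    new_delta = {(q, i): delta[(q, i)] for q in seen for i in inputs}
    new_lam = {(q, i): lam[(q, i)] for q in seen for i in inputs}
    return seen, new_delta, new_lam


def mut2(states, inputs, delta, lam, rng):
    """Pick a random new initial state and keep only what it can reach."""
    new_q0 = rng.choice(states)
    return (new_q0,) + reachable_part(new_q0, inputs, delta, lam)
```

For example, if the new initial state is a sink state, the resulting machine has a single state.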

$\textit{mut}_{3}$: Add a state. This mutation adds a new state $q$ to $Q$. We randomly select a state $q' \in Q$ and an input $i \in I$ and change the destination of this transition to $q$; this ensures that $q$ is reachable. For all $i \in I$, we randomly select a destination state $p$ and use the output $\lambda(p, i)$ $80\%$ of the time or a random output $20\%$ of the time.

$\textit{mut}_{4}$: Remove a state. This mutation removes a non-initial state $q$ from $\mathcal{S}$. All transitions that lead from $p$ to $q$ with input $i$ are redirected to $\delta(q, i)$ with output $\lambda(p, i)$. If $\delta(q, i) = q$, we self loop in $p$.
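The redirection step can be sketched as follows. This is an illustration under assumptions: the machine is stored in dictionaries keyed on `(state, input)`, the function name `mut4` is hypothetical, and the state to remove is passed in explicitly rather than selected by the function.

```python
def mut4(states, inputs, q0, delta, lam, q):
    """Remove the non-initial state q and redirect its incoming transitions."""
    if q == q0:
        raise ValueError("the initial state cannot be removed")
    new_states = [s for s in states if s != q]
    new_delta, new_lam = {}, {}
    for p in new_states:
        for i in inputs:
            target = delta[(p, i)]
            if target == q:
                # Skip over q by following its own transition on the same input.
                target = delta[(q, i)]
                if target == q:
                    # q loops on i, so the only sensible target left is p itself.
                    target = p
            new_delta[(p, i)] = target
            new_lam[(p, i)] = lam[(p, i)]  # outputs of remaining states are kept
    return new_states, new_delta, new_lam
```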

$\textit{mut}_{5}$: Divert a transition. This mutation randomly selects $q, q' \in Q$ and $i \in I$. We set $\delta(q, i) = q'$. While $\mathcal{S}$ is equivalent to the resulting Mealy machine, we choose a new $q, q', i$ and set $\delta(q, i) = q'$.
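The retry loop requires an equivalence check between the original and the mutated machine. A minimal sketch, assuming complete deterministic machines stored in dictionaries; the names `equivalent` and `mut5` and the `max_tries` cut-off are hypothetical, and each retry is assumed to start again from the original transition function rather than accumulating changes.

```python
import random
from collections import deque


def equivalent(inputs, q0_a, delta_a, lam_a, q0_b, delta_b, lam_b):
    """Breadth-first search over state pairs; differing outputs refute equivalence."""
    seen = {(q0_a, q0_b)}
    queue = deque([(q0_a, q0_b)])
    while queue:
        a, b = queue.popleft()
        for i in inputs:
            if lam_a[(a, i)] != lam_b[(b, i)]:
                return False
            nxt = (delta_a[(a, i)], delta_b[(b, i)])
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return True


def mut5(states, inputs, q0, delta, lam, rng, max_tries=1000):
    """Divert one random transition, retrying while behaviour is unchanged."""
    for _ in range(max_tries):
        q, q_new, i = rng.choice(states), rng.choice(states), rng.choice(inputs)
        new_delta = dict(delta)
        new_delta[(q, i)] = q_new
        if not equivalent(inputs, q0, delta, lam, q0, new_delta, lam):
            return new_delta
    raise RuntimeError("no behaviour-changing diversion found")
```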

$\textit{mut}_{6}$: Change transition output. This mutation randomly selects $q \in Q$, $i \in I$ and $o \in O$. We set $\lambda(q, i) = o$ such that $o$ is distinct from the original $\lambda(q, i)$.
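A short sketch of $\textit{mut}_{6}$, assuming the output function is a dictionary keyed on `(state, input)`; the name `mut6` is hypothetical. The distinctness requirement is enforced here by filtering the candidate outputs before sampling, which is one possible way to realise it.

```python
import random


def mut6(states, inputs, outputs, lam, rng):
    """Change the output of one random transition to a different output."""
    q, i = rng.choice(states), rng.choice(inputs)
    candidates = [o for o in outputs if o != lam[(q, i)]]
    if not candidates:
        raise ValueError("need at least two outputs to change a transition output")
    new_lam = dict(lam)
    new_lam[(q, i)] = rng.choice(candidates)
    return new_lam
```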

$\textit{mut}_{7}$: Remove a symbol. This mutation removes a symbol $i$ from the input alphabet. Consequently, all the transitions with $i$ are not contained in the resulting Mealy machine.

$\textit{mut}_{8}$: Appending a mutated model. This mutation takes $\mathcal{S}$ and a natural number $n$. It first makes a second Mealy machine $\mathcal{S}'$ by applying $\textit{mut}_{13}$ to $\mathcal{S}$. Then it appends $\mathcal{S}'$ to $\mathcal{S}$ at the $n^{\text{th}}$ state of $\mathcal{S}$, which we call $q$, i.e., $\delta(q, i) = q_0^{\mathcal{S}'}$ for some random $i$. The natural numbers are chosen based on visual inspection of the models: we consistently choose a state that represents the end of a model. This end state is either the sink state or a state at the end of a very long trace in the model which transitions to the sink state.

$\textit{mut}_{9}$: Prepending a mutated model. This mutation takes $\mathcal{S}$ and a natural number $n$. It first makes a second Mealy machine $\mathcal{S}'$ by applying $\textit{mut}_{13}$ to $\mathcal{S}$. Then it appends $\mathcal{S}$ to $\mathcal{S}'$ at the $n^{\text{th}}$ state of $\mathcal{S}'$, which we call $q$, i.e., $\delta(q, i) = q_0^{\mathcal{S}}$ for some random $i$.

$\textit{mut}_{10}$: Several mutations. This mutation applies mutations $\textit{mut}_{3}$, $\textit{mut}_{4}$, $\textit{mut}_{5}$ and $\textit{mut}_{6}$ to $\mathcal{S}$ in this particular order.

$\textit{mut}_{11}$: Several mutations with different initial state. This mutation applies $\textit{mut}_{2}$, $\textit{mut}_{3}$, $\textit{mut}_{4}$, $\textit{mut}_{5}$ and $\textit{mut}_{6}$ to $\mathcal{S}$ in this particular order.

$\textit{mut}_{12}$: Changing many transitions. This mutation applies $\textit{mut}_{5}$, $\textit{mut}_{6}$, $\textit{mut}_{5}$, $\textit{mut}_{6}$, $\textit{mut}_{5}$, $\textit{mut}_{6}$ to $\mathcal{S}$.

$\textit{mut}_{13}$: Many mutations. This mutation applies $\textit{mut}_{10}$ three times to $\mathcal{S}$.

$\textit{mut}_{14}$: Union. This mutation takes $\mathcal{S}$ and makes a second Mealy machine $\mathcal{S}'$ by applying $\textit{mut}_{13}$ to $\mathcal{S}$. We combine $\mathcal{S}$ and $\mathcal{S}'$ by creating one new dummy initial state with two fresh symbols for which one goes to $q_0^{\mathcal{S}}$ and the other to $q_0^{\mathcal{S}'}$. The fresh symbols and other transitions are handled in the same way as in $\textit{mut}_{1}$.

0.E.3 Additional Figure Experiment 1

In Fig. 7, we show additional pairwise comparison plots from Experiment 1. Each plot compares a pair of algorithms per model and mutation, where a point $(x, y)$ represents that the algorithm on the x-axis required $x$ symbols over all seeds and the algorithm on the y-axis required $y$ symbols. Points below the diagonal indicate that the y-algorithm outperforms the x-algorithm; points below the dashed (dotted) line indicate a factor two (ten) improvement.

Figure 7: Pairwise comparisons between algorithms.
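The reading of a point in these plots can be summarized in a small helper. This is a sketch only: the function name and the returned labels are hypothetical, and it assumes the dashed and dotted lines are $y = x/2$ and $y = x/10$, which is how the factor two and factor ten improvements are interpreted here.

```python
def classify_point(x: float, y: float) -> str:
    """Classify a point (x, y) of symbol counts relative to the reference lines."""
    if y < x / 10:
        return "factor10"   # below the dotted line
    if y < x / 2:
        return "factor2"    # below the dashed line
    if y < x:
        return "better"     # below the diagonal
    return "not better"     # on or above the diagonal
```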

0.E.4 Additional Tables Experiment 2

Tables 4 and 5 display the mean number of inputs per model and algorithm in the same style as Table 2. The reference row indicates which reference model was used for the adaptive algorithms. The teal values indicate the lowest and, therefore, best score.

Table 4: Mean inputs for learning a Philips model with a reference.

| Algorithm | model2 | model3 | model4 | model5 | model6 |
|---|---|---|---|---|---|
| Reference | model1 | model2 | model3 | model4 | model5 |
| $L^{*}$ | 657 | 2196 | 2196 | 5340 | 5340 |
| KV | 256 | 1671 | 1672 | 2128 | 2128 |
| $L^{\#}$ | 212 | 862 | 862 | 1730 | 1730 |
| $\partial L^{*}_{M}$ | 657 | 2325 | 2650 | 2997 | 4520 |
| IKV | 160 | 814 | 458 | 770 | 841 |
| $AL^{\#}$ | 146 | 918 | 580 | 1592 | 1043 |
| $L^{\#}_{R}$ | 161 | 954 | 749 | 1765 | 1382 |
| $L^{\#}_{\overset{\surd}{=}}$ | 167 | 956 | 590 | 1775 | 1178 |
| $L^{\#}_{\overset{\surd}{\simeq}}$ | 152 | 920 | 590 | 1602 | 1178 |
| $L^{\#}_{R,\overset{\surd}{=}}$ | 161 | 954 | 580 | 1765 | 1043 |

Table 5: Mean inputs for learning an OpenSSL model with a reference.
Algorithm 097c 097e 098f 098l 098m 098s 098u 098za 100 100f 100h 100m 101 101h 101k 102 110pre1
Reference 097 097c 097e 098f 098l 098m 098s 098u 098m 100 100f 100h 100h 100 101h 101k 102
$L^{*}$ 21273 21273 31608 2408 2065 2065 2506 1820 2065 2065 2506 1820 3143 1820 1477 1281 1134
KV 22434 19376 24754 6103 4267 4504 5634 3686 4267 4659 6492 3545 8423 3545 2764 2239 1178
$L^{\#}$ 23512 23305 25412 6786 5145 5934 6562 3684 5145 5819 6283 3983 8938 3983 3075 2293 1452
$\partial L^{*}_{M}$ 5155 5155 5155 3203 2065 2065 2317 2363 2065 2065 2317 2430 3820 2430 2115 1281 1134
IKV 3290 4872 1506 2945 2326 2875 3789 792 876 3153 3398 792 2033 831 636 977 376
$AL^{\#}$ 3808 1391 1391 861 756 737 2100 638 751 737 1953 638 1778 638 514 461 665
$L^{\#}_{\mathrm{R}}$ 19632 1397 1397 843 756 737 2109 642 751 737 1963 642 1791 642 518 1545 1538
$L^{\#}_{\overset{\surd}{=}}$ 23346 17503 1417 7358 5458 6437 6441 5081 768 6619 7054 4804 1800 4804 532 2606 601
$L^{\#}_{\overset{\surd}{\simeq}}$ 3558 17201 1417 2764 3856 2641 3365 657 768 3110 3974 657 1800 664 532 464 667
$L^{\#}_{\mathrm{R},\overset{\surd}{=}}$ 19683 1397 1391 843 756 737 2109 638 751 737 1963 638 1782 638 520 937 663

Appendix F Detailed Example Run of $AL^{\#}$

In this section, we give a detailed explanation of how $\mathcal{S}$ can be learned with references $\mathcal{R}_1$ and $\mathcal{R}_3$ using $AL^{\#}$. From the references, we derive the following state cover and separating family:

$P = P^{\mathcal{R}_1} \cup P^{\mathcal{R}_3} = \{\varepsilon, c, ca\} \cup \{\varepsilon, b, bb, bbb\} = \{\varepsilon, c, ca, b, bb, bbb\}$
$W_{r_0} = W_{r_1} = \{c, ac\}, \quad W_{r_2} = \{c\}, \quad W_{s_0} = W_{s_1} = W_{s_2} = W_{s_3} = \{c, b, bb\}$
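
As a minimal sketch (not the authors' implementation), the combined state cover and the separating family above can be written down as plain Python data. The function name `merge_state_covers` and the encoding of $\varepsilon$ as the empty string are assumptions made only for illustration.

```python
def merge_state_covers(*covers):
    """Union of access-sequence lists, keeping first-occurrence order."""
    merged = []
    for cover in covers:
        for word in cover:
            if word not in merged:
                merged.append(word)
    return merged


# State covers of the two references; "" encodes the empty word ε.
P_R1 = ["", "c", "ca"]
P_R3 = ["", "b", "bb", "bbb"]
P = merge_state_covers(P_R1, P_R3)

# Separating family, indexed by reference state (values copied from the text).
W = {s: ["c", "ac"] for s in ("r0", "r1")}
W["r2"] = ["c"]
W.update({s: ["c", "b", "bb"] for s in ("s0", "s1", "s2", "s3")})

print(P)
```

The shared prefix $\varepsilon$ appears only once in the merged cover, which is why $P$ has six elements rather than seven.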
  1. $AL^{\#}$ always starts with an observation tree containing only the root node.

  2. The first rule we apply is the rebuilding rule. We apply this rule with $q = q' = q_0$ (the root state) and $i = c$ because the conditions hold:

    • $\delta^{\mathcal{T}}(q_0, c) \notin B$ because $\delta^{\mathcal{T}}(q_0, c){\uparrow}$,

    • $\neg(q' \mathrel{\#} \delta^{\mathcal{T}}(q, i)) = \neg(q_0 \mathrel{\#} \delta^{\mathcal{T}}(q_0, c))$ because $\delta^{\mathcal{T}}(q_0, c){\uparrow}$,

    • $\mathsf{access}^{\mathcal{T}}(q, i) = \mathsf{access}^{\mathcal{T}}(q_0, c) = c \in P$,

    • $\mathsf{access}^{\mathcal{T}}(q') = \mathsf{access}^{\mathcal{T}}(q_0) = \varepsilon \in P$,

    • $\delta^{\mathcal{T}}(q_0, cac){\uparrow} \land \delta^{\mathcal{T}}(q_0, ac){\uparrow}$ with $\mathsf{sep}(\delta^{\mathcal{R}}(\mathsf{access}^{\mathcal{T}}(q)i), \delta^{\mathcal{R}}(\mathsf{access}^{\mathcal{T}}(q'))) = \mathsf{sep}(\delta^{\mathcal{R}}(c), \delta^{\mathcal{R}}(\varepsilon)) = ac = \sigma$.

    We execute $OQ(cac)$ and $OQ(ac)$.

  3. We can now apply prioritized promotion with $q_1$ because $ac \vdash q_0 \mathrel{\#} q_1$. The resulting observation tree looks as follows:

    [Uncaptioned image]
  4. Next, we try to promote the state reached by $ca$ in $\mathcal{R}_1$. Note that we cannot apply rebuilding with $q = q_1$, $q' = q_0$ and $i = a$ because $\mathsf{sep}(\delta^{\mathcal{R}}(ca), \delta^{\mathcal{R}}(\varepsilon)) = c$ and both $\delta^{\mathcal{T}}(c){\downarrow}$ and $\delta^{\mathcal{T}}(cac){\downarrow}$, so the tree already contains both observations.
    We do apply rebuilding with $q = q_1$, $q' = q_1$ and $i = a$. All the conditions hold and $\sigma = c$. This leads to output queries $OQ(cac)$ and $OQ(cc)$.
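
The query construction in steps 2 and 4 can be sketched as follows. This is an illustrative reading of the walkthrough rather than the paper's implementation: the helper names are hypothetical, words are plain strings, and the observation tree is approximated by the set of input words that already have a defined path.

```python
def defined_words(queries):
    """All input words with a defined path after asking the given output queries."""
    words = {""}
    for q in queries:
        for k in range(1, len(q) + 1):
            words.add(q[:k])
    return words


def rebuilding_queries(access_q, i, access_q_prime, sigma):
    """Output queries of one rebuilding step: access(q)·i·σ and access(q')·σ."""
    return [access_q + i + sigma, access_q_prime + sigma]


def new_queries(candidates, defined):
    """Candidates whose outcome is not yet recorded in the tree."""
    return [w for w in candidates if w not in defined]


step2 = rebuilding_queries("", "c", "", "ac")      # q = q' = q0, i = c, σ = ac
tree = defined_words(step2)

rejected = rebuilding_queries("c", "a", "", "c")   # q = q1, q' = q0, i = a, σ = c
step4 = rebuilding_queries("c", "a", "c", "c")     # q = q' = q1, i = a, σ = c

print(step2, new_queries(rejected, tree), new_queries(step4, tree))
```

With $q' = q_0$ both candidate words are already defined, so nothing new would be observed; with $q' = q_1$ the word $cc$ is new, which matches the choice made in step 4.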

  5. We can apply prioritized promotion with $q_2$ because $c \vdash q_1 \mathrel{\#} q_2$ and $c \vdash q_0 \mathrel{\#} q_2$.

  6. We again use the rebuilding rule for $q = q_0$, $q' = q_0$ and $i = b$. All the conditions hold and we use $\sigma = \mathsf{sep}(\delta^{\mathcal{R}}(\mathsf{access}^{\mathcal{T}}(q)i), \delta^{\mathcal{R}}(\mathsf{access}^{\mathcal{T}}(q'))) = \mathsf{sep}(\delta^{\mathcal{R}}(b), \delta^{\mathcal{R}}(\varepsilon)) = bb$. We execute the queries $OQ(bbb)$ and $OQ(bb)$. This leads to the following observation tree. Note that we cannot promote $q_7$.

    [Uncaptioned image]

    As stated in Section 7, we only apply the rebuilding rule with $\mathsf{access}^{\mathcal{T}}(q, i)$ and $\mathsf{access}^{\mathcal{T}}(q')$ from the same reference model. We have not explored $bb, bbb \in P$ yet, but because these access sequences do not reach frontier states, we cannot apply the rebuilding rule any further. The prioritized promotion rule cannot be applied either. Therefore, we move on to the other $AL^{\#}$ rules.

  7. Next, we apply the extension rule for all the current basis states. This means the following output queries are performed: $OQ(cb)$, $OQ(caa)$, $OQ(cab)$. This results in the following observation tree.

    [Uncaptioned image]
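
The three extension queries of step 7 follow mechanically from the basis and the words already observed. The sketch below is only an illustration under stated assumptions: the input alphabet is taken to be $\{a, b, c\}$ (the inputs used in this run), the helper names are hypothetical, and the tree is again approximated by its set of defined input words.

```python
def defined_words(queries):
    """All input words with a defined path after asking the given output queries."""
    words = {""}
    for q in queries:
        for k in range(1, len(q) + 1):
            words.add(q[:k])
    return words


def extension_queries(basis, inputs, defined):
    """One query access(q)·i per basis state q and input i without a transition."""
    return [acc + i for acc in basis for i in inputs if acc + i not in defined]


asked = ["cac", "ac", "cc", "bbb", "bb"]   # output queries of steps 2, 4 and 6
basis = ["", "c", "ca"]                    # access sequences of q0, q1, q2
queries = extension_queries(basis, "abc", defined_words(asked))

print(queries)
```

The root $q_0$ contributes no query because all three of its outgoing transitions were already observed during rebuilding.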
  8. We compute the matching table and then apply prioritized separation:

    state match $\mathsf{mdeg}$ $r_0$ $r_1$ $r_2$ $s_0$ $s_1$ $s_2$ $s_3$
    $q_0$ $r_0$ 1.0 7/7 5/7 4/7 0/4 0/4 0/4 1/4
    $q_1$ $r_1$ 1.0 3/4 4/4 2/4 0/4 0/4 0/4 1/4
    $q_2$ $r_2$ 1.0 1/2 1/2 2/2 0/2 0/2 0/2 1/2
    • To separate $q_4$ from the basis, we can use the separating sequence $ac$ because $ac \in W_{r_1}$ and $r_1$ is the expected matching reference state of $q_4$, since $\delta^{\mathcal{T}}(q_0, a) = q_4$, $q_0 \overset{\surd}{\simeq} r_0$ and $\delta^{\mathcal{R}}(r_0, a) = r_1$. Therefore, we execute $OQ(aac)$.

    • To separate $q_6$ from the basis, we can use the separating sequence $ac$ because $ac \in W_{r_0}$ and $r_0$ is the expected matching reference state of $q_6$, since $\delta^{\mathcal{T}}(q_1, c) = q_6$, $q_1 \overset{\surd}{\simeq} r_1$ and $\delta^{\mathcal{R}}(r_1, c) = r_0$. Therefore, we execute $OQ(ccac)$.

    • To separate $q_{11}$ from the basis, we can use the separating sequence $ac$ because $ac \in W_{r_1}$ and $r_1$ is the expected matching reference state of $q_{11}$, since $\delta^{\mathcal{T}}(q_2, a) = q_{11}$, $q_2 \overset{\surd}{\simeq} r_2$ and $\delta^{\mathcal{R}}(r_2, a) = r_1$. Therefore, we execute $OQ(caaac)$.

    • To separate $q_3$ from the basis, we can use the separating sequence $c$ because $c \in W_{r_2}$ and $r_2$ is the expected matching reference state of $q_3$, since $\delta^{\mathcal{T}}(q_2, c) = q_3$, $q_2 \overset{\surd}{\simeq} r_2$ and $\delta^{\mathcal{R}}(r_2, c) = r_2$. Therefore, we execute $OQ(cacc)$.
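
The reasoning behind the first two bullets of step 8 can be sketched as a lookup: follow the matched reference state along the input, then append each of its separating sequences to the frontier state's access sequence. The sketch below is an assumption-laden illustration, not the paper's algorithm: `REF_DELTA` contains only the two reference transitions quoted above, `MATCH` encodes the matching table of step 8, and how a single sequence is chosen among the candidates is not modelled.

```python
def defined_words(queries):
    """All input words with a defined path after asking the given output queries."""
    words = {""}
    for q in queries:
        for k in range(1, len(q) + 1):
            words.add(q[:k])
    return words


# Fragments copied from the walkthrough; everything else is left out on purpose.
REF_DELTA = {("r0", "a"): "r1", ("r1", "c"): "r0"}
W = {"r0": ["c", "ac"], "r1": ["c", "ac"], "r2": ["c"]}
MATCH = {"": "r0", "c": "r1", "ca": "r2"}   # basis access sequence -> matched state


def candidate_queries(basis_access, i, defined):
    """Expected match of the frontier state and the not-yet-asked queries from its W set."""
    expected = REF_DELTA[(MATCH[basis_access], i)]
    frontier = basis_access + i
    fresh = [frontier + w for w in W[expected] if frontier + w not in defined]
    return expected, fresh


asked = ["cac", "ac", "cc", "bbb", "bb", "cb", "caa", "cab"]   # steps 2 to 7
tree = defined_words(asked)

print(candidate_queries("", "a", tree))    # frontier state q4
print(candidate_queries("c", "c", tree))   # frontier state q6
```

For $q_4$ only $aac$ remains, since $ac$ was already observed in step 2. For $q_6$ both $ccc$ and $ccac$ are candidates, and the run above asks $ccac$ first.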

  9. Next, we apply the promotion rule for $q_4$ because $a \vdash q_4 \mathrel{\#} q_0$, $a \vdash q_4 \mathrel{\#} q_1$ and $a \vdash q_4 \mathrel{\#} q_2$. The resulting observation tree looks as follows:

    [Uncaptioned image]
  10. Next, we apply extension with $q_4$ and input $b$, resulting in $OQ(ab)$.

  11. We perform another round of prioritized separation with the following matching table:

    state match $\mathsf{mdeg}$ $r_0$ $r_1$ $r_2$ $s_0$ $s_1$ $s_2$ $s_3$
    $q_0$ $r_0$ 0.857 12/14 8/14 7/14 2/6 2/6 1/6 3/6
    $q_1$ $r_1$ 1.0 5/9 9/9 5/9 0/5 0/5 0/5 1/5
    $q_2$ $r_2$ 1.0 3/5 2/5 5/5 0/3 0/3 0/3 1/3
    $q_4$ $s_0, s_1$ 1.0 2/3 1/3 1/3 2/2 2/2 1/2 0/2
    • To separate $q_6$ further, we execute $OQ(ccc)$.

    • To separate $q_{11}$ further, we execute $OQ(caac)$.

    • To separate $q_{20}$, we can use the separating sequence $b$ because $b \in W_{s_1} \cup W_{s_2}$ and $s_1$ and $s_2$ are the expected matching reference states of $q_{20}$, since $\delta^{\mathcal{T}}(q_4, b) = q_{20}$, $q_4 \overset{\surd}{\simeq} s_0$ and $\delta^{\mathcal{R}}(s_0, b) = s_1$, $q_4 \overset{\surd}{\simeq} s_1$ and $\delta^{\mathcal{R}}(s_1, b) = s_2$. Therefore, we execute $OQ(abb)$.

    • To separate $q_{13}$, we can use the separating sequence $b$ because $b \in W_{s_0} \cup W_{s_1}$ and $s_0$ and $s_1$ are the expected matching reference states of $q_{13}$, since $\delta^{\mathcal{T}}(q_4, a) = q_{13}$, $q_4 \overset{\surd}{\simeq} s_0$ and $\delta^{\mathcal{R}}(s_0, a) = s_0$, $q_4 \overset{\surd}{\simeq} s_1$ and $\delta^{\mathcal{R}}(s_1, a) = s_1$. Therefore, we execute $OQ(aab)$.

    The resulting observation tree looks as follows:

    [Uncaptioned image]
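
The match and $\mathsf{mdeg}$ columns of the matching table in step 11 are consistent with taking, per basis state, the reference states with the highest score ratio. The sketch below reproduces two rows under that reading; it is an assumption drawn from the table itself, not the paper's formal definition of the matching degree, and the function name `best_matches` is hypothetical.

```python
from fractions import Fraction

# Scores (agreeing / compared) copied from the step 11 matching table.
scores = {
    "q0": {"r0": (12, 14), "r1": (8, 14), "r2": (7, 14),
           "s0": (2, 6), "s1": (2, 6), "s2": (1, 6), "s3": (3, 6)},
    "q4": {"r0": (2, 3), "r1": (1, 3), "r2": (1, 3),
           "s0": (2, 2), "s1": (2, 2), "s2": (1, 2), "s3": (0, 2)},
}


def best_matches(row):
    """Reference states with the maximal score ratio, and that ratio."""
    ratios = {p: Fraction(n, d) for p, (n, d) in row.items()}
    top = max(ratios.values())
    return [p for p, r in ratios.items() if r == top], top


for state, row in scores.items():
    matches, degree = best_matches(row)
    print(state, matches, round(float(degree), 3))
```

Exact fractions avoid floating-point ties being missed: $q_4$ scores $2/2$ against both $s_0$ and $s_1$, so both are reported as matches, as in the table.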
  12. We can no longer apply prioritized separation, so we continue with standard separation. We use the separating sequences $b$, $c$ and $ac$ to separate states. Specifically, we perform the following output queries:

    • $OQ(bac)$ and $OQ(bc)$ to separate $q_7$ from the basis.

    • $OQ(cbac)$ to separate $q_{10}$ from the basis.

    • $OQ(acac)$ to separate $q_5$ from the basis.

    • $OQ(cabac)$ and $OQ(cabc)$ to separate $q_{12}$ from the basis.

    The resulting observation tree looks as follows:

    [Uncaptioned image]
  13. All frontier states are identified and no frontier states are isolated. This means we can possibly perform match refinement or match separation. The matching table looks as follows:

    state match $\mathsf{mdeg}$ $r_0$ $r_1$ $r_2$ $s_0$ $s_1$ $s_2$ $s_3$
    $q_0$ $r_0$ 0.834 15/18 10/18 8/18 4/9 3/9 2/9 6/9
    $q_1$ $r_1$ 1.0 6/11 11/11 5/11 0/7 0/7 2/7 2/7
    $q_2$ $r_2$ 1.0 4/6 2/6 6/6 0/4 0/4 1/4 2/4
    $q_4$ $s_0$ 1.0 2/5 2/5 2/5 4/4 3/4 1/4 1/4

    Since every basis state is matched with exactly one reference state, match refinement is not applicable. However, we can apply match separation with $q = q_4$, $q' = q_4$, $p = s_0$ and $i = b$ because $\delta^{\mathcal{T}}(q_4, b) = q_{20} \in F$, $\neg(q_{20} \mathrel{\#} q_4)$, $\delta^{\mathcal{R}}(s_0, b) = s_1$, $s_0 \overset{\surd}{\simeq} q_4$, and for all $q \in B$, $s_1$ is not the match. We use $\sigma = bb$ because $bb \vdash s_1 \mathrel{\#} q_4$ and execute $OQ(abbb)$. This leads to the following observation tree:

    [Figure: observation tree after $OQ(abbb)$]
  14.

    The match separation led to the isolation of $q_{20}$, which we can now add to the basis with the promotion rule.

  15.

    Additionally, we can apply promotion for state $q_{23}$ because it is the only state with output 2 for input $b$. Notice that we have found all the states in $\mathcal{S}$ after promoting $q_{23}$.

  16.

    Next, we apply the extension rule several times:

    • $OQ(aba)$ and $OQ(abc)$ for $q_{20}$,

    • $OQ(abba)$ and $OQ(abbc)$ for $q_{23}$.
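The extension queries in this step can be reproduced with a small sketch. It assumes that the input alphabet is {a, b, c} and that the access sequences of $q_{20}$ and $q_{23}$ are $ab$ and $abb$, as read off from the queries listed for them. Only the word $abbb$ from the preceding match separation is treated as observed here, whereas the actual observation tree contains more words. The function name is illustrative and not taken from any tool.

```python
def extension_queries(access_sequences, observed_words, alphabet):
    """Return access + i for every basis state and input i whose successor is unobserved."""
    # The observation tree is prefix-closed, so every prefix of an observed word is observed.
    observed = {word[:k] for word in observed_words for k in range(len(word) + 1)}
    return [
        access + i
        for access in access_sequences
        for i in alphabet
        if access + i not in observed
    ]


# Assumed access sequences of q20 and q23, and the single observed word abbb.
queries = extension_queries(["ab", "abb"], ["abbb"], "abc")
print(queries)  # ['aba', 'abc', 'abba', 'abbc']
```

The $b$-successors of both states are skipped because $abb$ and $abbb$ are already prefixes of the observed word, which leaves the four queries given for this step.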

  17.

    We apply the prioritized separation rule several times:

    • $OQ(abac)$ and $OQ(ababb)$ for $q_{36}$,

    • $OQ(aabb)$ for $q_{13}$,

    • $OQ(abbbc)$ and $OQ(abbbb)$ for $q_{34}$,

    • $OQ(abbac)$ and $OQ(abbab)$ for $q_{38}$.

  18.

    Next, we apply separation a few more times:

    • $OQ(abcac)$ and $OQ(abcabb)$ for $q_{37}$,

    • $OQ(abbcac)$ and $OQ(abbcabb)$ for $q_{39}$,

    • $OQ(abbbac)$ for $q_{35}$,

    • $OQ(acabb)$ for $q_5$.

    This leads to the following final observation tree and matching table:

    [Figure: final observation tree]
    state     match  $\mathsf{mdeg}$  $r_0$  $r_1$  $r_2$  $s_0$  $s_1$  $s_2$  $s_3$
    $q_0$     $r_0$  0.834            15/18  10/18  8/18   12/18  7/18   5/18   14/18
    $q_1$     $r_1$  1.0              6/11   11/11  5/11   0/7    0/7    2/7    2/7
    $q_2$     $r_2$  1.0              4/6    2/6    6/6    0/4    0/4    1/4    2/4
    $q_4$     $s_0$  0.924            2/5    2/5    2/5    12/13  7/13   4/13   5/13
    $q_{20}$  $s_1$  0.889            2/5    2/5    2/5    4/9    8/9    4/9    3/9
    $q_{23}$  $s_2$  0.8              2/5    2/5    2/5    1/5    1/5    4/5    2/5
  19.

    Next, the equivalence rule is used to construct a hypothesis from the observation tree. This hypothesis is correct so the algorithm terminates.
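
The match column of the final matching table in step 18 can be recomputed from the listed ratios. The sketch below is a minimal illustration under the assumption that the match of a basis state is the reference state with the largest ratio in its row, which agrees with every row of that table. It does not compute the ratios themselves from the observation tree and the reference models; they are copied from the table. All identifiers are illustrative.

```python
from fractions import Fraction

# Column order of the reference states in the final matching table.
REFERENCE_STATES = ["r0", "r1", "r2", "s0", "s1", "s2", "s3"]

# Ratios copied verbatim from the final matching table (one row per basis state).
FINAL_TABLE = {
    "q0": ["15/18", "10/18", "8/18", "12/18", "7/18", "5/18", "14/18"],
    "q1": ["6/11", "11/11", "5/11", "0/7", "0/7", "2/7", "2/7"],
    "q2": ["4/6", "2/6", "6/6", "0/4", "0/4", "1/4", "2/4"],
    "q4": ["2/5", "2/5", "2/5", "12/13", "7/13", "4/13", "5/13"],
    "q20": ["2/5", "2/5", "2/5", "4/9", "8/9", "4/9", "3/9"],
    "q23": ["2/5", "2/5", "2/5", "1/5", "1/5", "4/5", "2/5"],
}


def best_match(row):
    """Return (reference state, ratio) for the largest ratio in a table row."""
    # Assumption: the match is the reference state with the highest ratio.
    degrees = [Fraction(cell) for cell in row]
    best = max(range(len(degrees)), key=degrees.__getitem__)
    return REFERENCE_STATES[best], degrees[best]


matches = {state: best_match(row) for state, row in FINAL_TABLE.items()}
for state, (reference, degree) in matches.items():
    print(f"{state} -> {reference} ({degree})")
```

Exact fractions are used so that rows with different denominators, such as the $r$ and $s$ columns of $q_1$, are compared without floating-point error.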