qtree/.style=for tree=parent anchor=south, child anchor=north,align=center,inner sep=0pt \contourlength1.2pt
11email: {loes.kruger,sebastian.junges,jurriaan.rot}@ru.nl
State Matching and Multiple References in Adaptive Active Automata Learning††thanks: This research is partially supported by the NWO grant No. VI.Vidi.223.096.
Abstract
Active automata learning (AAL) is a method to infer state machines by interacting with black-box systems. Adaptive AAL aims to reduce the sample complexity of AAL by incorporating domain specific knowledge in the form of (similar) reference models. Such reference models appear naturally when learning multiple versions or variants of a software system. In this paper, we present state matching, which allows flexible use of the structure of these reference models by the learner. State matching is the main ingredient of adaptive , a novel framework for adaptive learning, built on top of . Our empirical evaluation shows that adaptive improves the state of the art by up to two orders of magnitude.
1 Introduction
Automata learning aims to extract state machines from observed input-output sequences of some system-under-learning (SUL). Active automata learning (AAL) assumes that one has black-box access to this SUL, allowing the learner to incrementally choose inputs and observe the outputs. The models learned by AAL can be used as a documentation effort, but are more typically used as basis for testing, verification, conformance checking, fingerprinting—see [22, 11] for an overview of applications. The classical algorithm for AAL is , introduced by Angluin [2]; state-of-the-art algorithms are, e.g., [23] and TTT [13], which are available in toolboxes such as LearnLib [14] and AALpy [17].
The primary challenge in AAL is to reduce the number of inputs sent to the SUL, referred to as the sample complexity. To learn a 31-state machine with 22 inputs, state-of-the-art learners may send several million inputs to the SUL [23]. This is not necessarily unexpected: the underlying space of 31-state state machines is huge and it is nontrivial how to maximise information gain. The literature has investigated several approaches to accelerate learners, see the overview of [22]. Nevertheless, scalability remains a core challenge for AAL.
We study adaptive AAL [10], which aims to improve the sample efficiency by utilizing expert knowledge already given to the learner. In (regular) AAL, a learner commonly starts learning from scratch. In adaptive AAL, however, the learner is given a reference model, which ought to be similar to the SUL. Reference models occur naturally in many applications of AAL. For instance:linecolor=orange,backgroundcolor=orange!25,bordercolor=orange,]SJ:shortened these a bit (1) Systems evolve over time due to, e.g., bug fixes or new functionalities—and we may have learned the previous system; (2) Standard protocols may be implemented by a variety of tools; (3) The SUL may be a variant of other systems, e.g., being the same system executing in another environment, or a system configured differently.
Several algorithms for adaptive AAL have been proposed [10, 24, 5, 6, 9]. Intuitively, the idea is that these methods try to rebuild the part of the SUL which is similar to the reference model. This is achieved by deriving suitable queries from the reference model, using so-called access sequences to reach states, and so-called separating sequences to distinguish these from other states.linecolor=green,backgroundcolor=green!25,bordercolor=green,]LK:I would prefer to remove so-called and put emph around access and sep seq linecolor=orange,backgroundcolor=orange!25,bordercolor=orange,]SJ:ok with the emph, but i would keep the so-called as we have not introduced them and do not explain them here? It indicates that a reader does not need the definition These algorithms rely on a rather strict notion of similarity that depends on the way we reach these stateslinecolor=orange,backgroundcolor=orange!25,bordercolor=orange,]SJ:pls check. In particular, existing rebuilding algorithms cannot effectively learn an SUL from a reference model that has a different initial state, see Sec. 2. linecolor=green,backgroundcolor=green!25,bordercolor=green,]LK:Should be merged more with the related work possibly. Also feels a bit incomplete
We propose an approach to adaptive AAL based on state matching, which allows flexibly identifying parts of the unknown SUL where the reference model may be an informative guidelinecolor=orange,backgroundcolor=orange!25,bordercolor=orange,]SJ:removed: to search for queries. More specifically, in this approach, we match states in the model that we have learned so far (captured as a tree-shaped automaton) with states in the reference model such that the outputs agree on all enabled input sequences. This matching allows for targeted re-use of separating sequences from the reference model and is independent of the access sequences. We refine the approach by using approximate state matching, where we match a current state with one from the reference model that agrees on most inputs.
Approximate state matching is the essential ingredient for the novel algorithm. This algorithm is a conservative extension of the recent [23]. Along with approximate state matching, includes rebuilding steps, which are similar to existing methods, but tightly integrated in . Finally, is the first approach with dedicated support to use more than one reference model.
Contributions. We make the following contributions to the state-of-the-art in adaptive AAL. First, we present state matching and its generalization to approximate state matching which allows flexible re-use of separating sequences from the reference model. Second, we include state matching and rebuilding in an unifying approach, called , which generalizes the algorithm for non-adaptive automata learning. We analyse the resulting framework in terms of termination and complexity. This framework naturally supports using multiple reference models as well as removing and adding inputs to the alphabet. Our empirical results show the efficacy of . In particular, may reduce the number of inputs to the SUL by two orders of magnitude. linecolor=green,backgroundcolor=green!25,bordercolor=green,]LK:Was in rel work not in intro: we consider pdlstar and IKV state of the art. Maybe put in experiments?
Related work. Adaptive AAL goes back to [10]. That paper, and many of the follow-up approaches [5, 4, 6, 9] re-use access sequences and separating sequences from the reference model (or from the data structures constructed when learning that model). The recent approach in [6] removes redundant access sequences during rebuilding and continues learning with informative separating sequences. In [24], an -based adaptive AAL approach is proposed where the algorithm starts by including all separating sequences that arise when learning the reference model with , ignoring access sequences. This algorithm is used in [12] for a general study of the usefulness of adaptive AAL: Among others, the authors suggest using more advanced data structures than the observation tables in . Indeed, in [4] the internal data structure of the TTT algorithm is used [13] in the context of lifelong learning; the precise rebuilding approach is not describedlinecolor=green,backgroundcolor=green!25,bordercolor=green,]LK:Too defensive?. The recent [9] proposes an adaptive AAL method based on discrimination trees as used in the Kearns-Vazirani algorithm [15]. We consider the algorithms proposed in [6, 9] the state-of-the-art and have experimentally compared in Sec. 8.
2 Overview
![Refer to caption](x2.png)
![Refer to caption](x3.png)
![Refer to caption](x4.png)
![Refer to caption](x5.png)
We illustrate (1) how adaptive AAL uses a reference model to help learn a system and (2) how this may reduce the sample complexity of the learner.
MAT framework. We recall the standard setting for AAL: Angluin’s MAT framework, cf. [22, 11]. Here, the learner has no direct access to the SUL, but may ask output queries (OQs): these return, for a given input sequence, the sequence of outputs from the SUL; and equivalence queries (EQs): these take a Mealy machine as input, and return whether or not is equivalent to the SUL. In case it is not, a counterexample is provided in the form of a sequence of inputs for which and the SUL return different outputs. EQs are expensive [25, 3, 8, 21], therefore, we aim to learn the SUL using primarily OQs.
Apartness. Learning algorithms in the MAT framework typically assume that two states are equivalent as long as their known residual languages are equivalent. To discover a new state, we must therefore (1) access it by an input sequence and (2) prove this state distinct (apart) from the other states that we already know. Consider the SUL in Fig. 1(a). The access sequences , access and , respectively, from the initial state. These states are different because the response to executing from and is distinct: We say is a separating sequence for and . This difference can be observed by posing OQs for and , consisting of the access sequences for and followed by their separating sequence .
Aim. The aim of adaptive AAL is to learn SULs with fewer inputs, using knowledge in the form of a reference model, known to the learner and preferrably similar to the SUL. The discovery of states is accelerated by extracting candidates for both (1) access sequences and (2) separating sequences from the reference model.
Rebuilding. The state-of-the-art in adaptive AAL uses access sequences and separating sequences from the reference model [6, 9] in an initial phase. Consider the Mealy machine in Fig. 1(b) as a reference model for the SUL in Fig. 1(a). The sequences , , can be used to access all orange states in both and . The separating sequences and for these states in also separate the orange states in . By asking OQs combining the access sequences and separating sequences, we discover all orange states for .
Limits of rebuilding. However, these rebuilding approaches have limitations. Consider in Fig. 1(c). The sequences , , and can be used to access all states in . Concatenating these with any separating sequences from will not be helpful to learn SUL , because in these sequences all access . However, the separating sequences from are useful if executed in the right state of . For instance, the sequence separates all states in , and the blue states in . Thus, rebuilding does not realise the potential of reusing the separating sequences from , since the access sequences for the relevant states are different.
State Matching. We extend adaptive AAL with state matching. State matching overcomes the strong dependency on the access sequences and allows the efficient usage of reference models where the residual languages of the individual states are similar. Suppose that while learning, we have not yet separated and in , but we do know the output of the -transition from . We may use that output to match with in : these two states agree on input sequences where both are defined. Subsequently, we can use the separating sequence between and to separate and , through OQs and .
Approximate State Matching. It rarely happens that states in the SUL exactly match states in the reference model: Consider the scenario where we want to learn with reference model from Fig. 1(d). States and do not match because they have different outputs for input but are still similar. This motivates an approximate version of matching, where a state is matched to the reference state which maximises the number of inputs with the same output.
Outline. After the preliminaries (Sec. 3), we recall the algorithm and extend it with rebuilding (Sec. 4). We then introduce adaptive AAL with state matching and its approximate variant (Sec. 5). Together with rebuilding, this results in the algorithm (Sec. 6). We proceed to define a variant that allows the use of multiple reference models (Sec. 7). This is helpful already in the example discussed in this section: given both and , with multiple reference models allows to discover all states in without any EQs, see App. 0.F.
3 Preliminaries
For a partial map , we write if is defined and otherwise.
Definition 3.1
A partial Mealy machine is a tuple , where , and are finite sets of states, inputs and outputs respectively; an initial state, a transition function, and an output function such that and have the same domain. A (complete) Mealy machine is a partial Mealy machine where and are total. If not specified otherwise, a Mealy machine is assumed to be complete.
We write to denote restricted to alphabet . We use the superscript to indicate to which Mealy machine we refer, e.g. and . The transition and output functions are naturally extended to input sequences of length as functions and . We abbreviate by .
Definition 3.2
Let , be partial Mealy machines. States and match, written , if for all with and . If and do not match, they are apart, written .
If , then there is a separating sequence, i.e., a sequence such that ; this situation is denoted by . The definition of matching allows the input (and output) alphabets of the underlying Mealy machines to differ; it requires that they agree on all commonly defined input sequences. If and are complete and have the same alphabet, then the matching of states is referred to as language equivalence. Two complete Mealy machines are equivalent if their initial states are language equivalent.
Let be a partial Mealy machine. A state is reachable if there exists such that . The reachable part of contains all reachable states in . A sequence is an access sequence for if . A set is a state cover for if contains an access sequence for every reachable state in . In this paper, a tree is a partial Mealy machine where every state has a unique access sequence, denoted by .
Definition 3.3
Let be a complete Mealy machine. A set is a state identifier for if for all with there exists such that . A separating family is a collection of state identifiers such that for all with there exists with .
We use and to refer to a minimal state cover and a separating family for respectively. State covers and separating families can be constructed for every Mealy machine, but are not necessarily unique.
4 with Rebuilding
We first recall the algorithm for (standard) AAL [23]. Then, we consider adaptive learning by presenting an -compatible variant of rebuilding.
4.1 Observation Trees
uses an observation tree as data structure to store the observed traces of .
Definition 4.1
A tree is an observation tree if there exists a map** such that and implies .
In an observation tree, a basis is a subtree that describes unique behaviour present in the SUL. Initially, a basis contains the root state. All states in the basis are pairwise apart, i.e., for all it holds that . For a fixed basis, its frontier is the set of states which are immediate successors of basis states but which are not in the basis themselves.
![Refer to caption](x6.png)
![Refer to caption](x7.png)
![Refer to caption](x8.png)
![Refer to caption](x9.png)
Example 4.1
We say that a frontier state is isolated if it is apart from all basis states. A frontier state is identified with a basis state if it is apart from all basis states except . We say the observation tree is adequate if all frontier states are identified, no frontier states are isolated and each basis state has a transition with every input. If every frontier state is identified and each basis state has a transition for every input, the observation tree can be folded to create a complete Mealy machine (formalized in Def. 0.A.1). The Mealy machine has the same states as the basis. The transitions between basis states are the same as in the observation tree. Transitions from basis states to frontier states are folded back to the basis state the frontier state is identified with. We call the resulting complete Mealy machine a hypothesis whenever this canonical transformation is used.
4.2 The Algorithm
The algorithm maintains an observation tree and a basis . Initially, consists of just a root node and . We denote the frontier of by . The algorithm then repeatedly applies the following four rules.
-
•
The promotion rule (P) extends by when is isolated.
-
•
The extension rule (Ex) poses OQ for with .
-
•
The separation rule (S) takes a state that is not apart from and poses OQ with that shows is apart from or .
-
•
The equivalence rule (Eq) folds into hypothesis , checks whether and agree on all sequences in and poses an EQ. If and the SUL are not equivalent, counterexample processing isolates a frontier state.
The pre- and postconditions of the rules are summarized in (the top rows of) Table 1. A detailed account is given in the paper introducing [23].
Example 4.3
Suppose we learn from Fig. 1. applies the extension rule twice, resulting in as in Fig. 2. States and are identified with because there is only one basis state. Next, applies the equivalence rule using hypothesis (Fig. 2). Counterexample distinguishes from . This sequence is added to and processed further by posing OQ in the equivalence rule. Observations and show that the states accessed with , and are pairwise apart. States and are added to the basis using the promotion rule. Next, poses OQ during the extension rule. To identify all frontier states, may use , and . Fig. 2 shows one possible observation tree after applying the separation rule multiple times. Next, the equivalence rule constructs hypothesis (Fig. 2) from and terminates because and are equivalent.
4.3 Rebuilding in
In this subsection, we combine rebuilding from [6, 9] with and implement this using two rules: rebuilding and prioritized promotion, see also Table 1. Both rules depend on a reference model , which is a complete Mealy machine, with a possibly different alphabet than the SUL . More precisely, these rules depend on a prefix-closed and minimal state cover and a separating family computed on for maximal overlap with . The separating family can be computed with partition refinement [20]. We fix with to be a unique sequence from such that . Below, we use for states in , for states in and for states in . In App. 0.A, we depict the scenarios in the observation tree and reference model required for the new rules to be applicable.
Rule (R): Rebuilding. Let , and suppose . The aim of the rebuilding rule is to show apartness between and a basis state , using the state cover and separating family from . The rebuilding rule is applicable when and are in . If then there exists a sequence such that . We pose OQs and .
Lemma 1
Suppose for all . Consider , such that and . If for all it holds that , then after applying the rebuilding rule for , and all with , state is isolated.
If a state is isolated, it can be added to the basis using the promotion rule.
Rule (PP): Prioritized promotion. Like (regular) promotion, prioritized promotion extends the basis. However, prioritized promotion only applies to states with . This enforces that the access sequences for basis states are in as often as possible, enabling the use of the rebuilding rule.
Example 4.4
Consider reference and SUL from Fig. 1. We learn the orange states similarly as described in Sec. 2: We apply the rebuilding rule with which results in OQs and . Next, we promote with the prioritized promotion rule. We apply the rebuilding rule with and which results in OQs (already present in ) and . Lastly, we promote with prioritized promotion.
The overlap between and and determines how many states of can be discovered via rebuilding. The statement follows from Lemma 1 above.
Theorem 4.1
If matches and only contains a root , then after applying only the rebuilding and prioritized promotion rules until they are no longer applicable, the basis consists of states where is the number of equivalence classes (w.r.t. language equivalence) in the reachable part of .
Corollary 1
Suppose we learn SUL with reference . Using the rebuilding and prioritized promotion rules, we can add all reachable states in to the basis.
5 using State Matching
In this section, we describe another way to reuse information from references, called state matching, which is independent of the state cover. First, we present a version of state matching using the matching relation ( ) from Def. 3.2 and then we weaken this notion to approximate state matching.
5.1 State Matching
![Refer to caption](x10.png)
![Refer to caption](x11.png)
![Refer to caption](x12.png)
With state matching, the learner maintains the matching relation between basis states and reference model states during learning. In the implementation, before applying a matching rule, the matching is updated based on the OQs asked since the previous match computation. We present two key rules here and an optimisation in the next subsection.
Rule (MS): Match separation. This rule aims to show apartness between the frontier and a basis state using separating sequences from the reference separating family. Let , , with for some , and . Suppose that , , and does not match any basis state. In particular, there exists some separating sequence for . The match separation rule poses OQ to either show or and .
Example 5.1
Suppose we learn using from Fig. 1. After applying the extension rule three times, we get (Fig. 3). State matches as their outputs coincide on sequences from alphabet . State transitions to the unmatched state with input . The match separation rule conjectures may match which implies . We use OQ to test this conjecture and indeed find that can be added to the basis using promotion.
Lemma 2
We fix , , and . Suppose . If for all , then after applying the match separation rule with for all with , state is isolated.
Rule (MR): Match refinement. Let and . Suppose matches both and and let . The match refinement rule poses OQ resulting in no longer being matched to or .
Example 5.2
Suppose we continue learning using from observation tree (Fig. 3). State matches both and . After posing OQ where , no longer matches .
If the initial state of SUL is language equivalent to some state in the reference model, then we can discover all reachable states in via state matching and rules. The statement uses Lemma 2 above.
Theorem 5.1
Suppose we have reference and SUL equivalent to but with a possibly different initial state. Using only the match refinement, match separation, promotion and extension rules, we can add states to the basis where is the number of equivalence classes (w.r.t. language equivalence) in the reachable part of .
5.2 Optimised Separation using State Matching
In this subsection, we add an optimisation rule prioritized separation that uses the matching to guide the identification of frontier states. First, we highlight the differences between prioritized separation and the previous separation rules. Both match separation and prioritized separation require that for and . The aim of match separation is to isolate and requires that does not match any basis state. Instead, the aim of prioritized separation is to guide the identification of using the state identifier for a matched with a basis state. The prioritized separation rule is also different from the separation rule (Sec. 4.2) which randomly selects to separate from or .
Rule (PS): Prioritized separation. The prioritized separation rule uses the matching to find a separating sequence from the reference model that is expected to separate a frontier state from a basis state. Let and . Suppose is not apart from and and . If is in of a reference model state that matches , the prioritized separation rule poses OQ resulting in being apart from or 111The precise specification is more involved, as the learner only keeps track of the match relation on ..
Example 5.3
Suppose we learn using from Fig. 1. Assume we have discovered all states in and want to identify , which is currently not apart from any basis state. The prioritized separation rule can only be applied with basis states such that , as is the only sequence in the state identifier of which is the state that matches . From the sequences possibly used by , only immediately identifies .
5.3 Approximate State Matching
In this subsection, we introduce an approximate version of matching, by quantifying matching via a matching degree. Let be a tree and be a (partial) Mealy machine. Let . We define as prefix-suffix pairs that are defined from onwards. Then, we define the matching degree as
Example 5.4
A state in an observation tree approximately matches a state , written , if there does not exist a such that .
Lemma 3
For any : implies .
We define rules approximate match separation (AMS), approximate match refinement (AMR) and approximate prioritized separation (APS) that represent the approximate matching variations of match separation, match refinement and prioritized separation respectively. These rules have weaker preconditions and postconditions, see Table 3 in App 0.A.
6 Adaptive
The rebuilding, state matching and rules described in Table 1 are ordered and combined into one adaptive learning algorithm called adaptive (written ). A non-ordered listing of the rules can be found in Algorithm 1 in App. 0.A. We use the abbreviations for the rules defined in previous sections.
Definition 6.1
The algorithm repeatedly applies the rules from Table 1 (see Algorithm 1), with the following ordering: Ex, APS, (S if APS was not applicable), P, if the observation tree is adequate we try AMR, AMS, Eq. The algorithm starts by applying R and PP until they are no longer applicable; these rules are not applied anymore afterwards.
Similar to , the correctness of amounts to showing termination because the algorithm can only terminate when the teacher indicates that the SUL and hypothesis are equivalent. We prove termination of by proving that each rule application lowers a ranking function. The necessary ingredients for the ranking function are derived from the post-conditions of Table 1.
Theorem 6.1
learns the correct Mealy machine within output queries and at most equivalence queries where is the number of equivalence classes for , is the number of equivalence classes for , is the number of input symbols and the length of the longest counterexample.
7 Adaptive Learning with Multiple References
Let be a finite set of complete reference models with possibly different alphabets. Assume each reference model has a state cover and separating family . We adapt the arguments for the algorithm to represent the state cover and separating family for the set of reference models.
State cover. We initialize the algorithm with the union of the state cover of each reference model, i.e., . To reduce the size of , the state cover for each reference model is computed using a fixed ordering on inputs.
Separating family. We combine the separating families for multiple reference models using a stronger notion of apartness, called total apartness, which also separates states based on whether inputs are defined. When changing the alphabet of a reference model to the alphabet of the SUL, as is done when computing the separating family, the reference model may become partial. If states from different reference models behave the same on their common alphabet but their alphabets contain different inputs from the SUL, we still want to distinguish the reference models based on which inputs they enable.
Definition 7.1
Let be partial Mealy machines and . We say and are total apart, written , if or there exists such that either or but not both.
We use total apartness to define a total state identifier and a total separating family. This definition is similar to Def. 3.3 but is be replaced by . We combine the multiple reference models into a single one with an arbitrary initial state, compute the total separating family and use this to initialize .
Example 7.1
A total separating family for and alphabet is .
We add an optimisation to that only chooses and from the same reference model during rebuilding. Theorem 6.1 can be generalized to this setting where represents the number of equivalence classes across the reference models.
8 Experimental Evaluation
In this section, we empirically investigate the performance of our implementation of . The source code and all benchmarks are available online222https://gitlab.science.ru.nl/lkruger/adaptive-lsharp-learnlib/[16]. We present four experiments to answer the following research questions:
- R1
-
What is the performance of adaptive AAL algorithms, when …
- Exp 1
-
…learning models from a similar reference model?
- Exp 2
-
…applied to benchmarks from the literature?
- R2
-
Can multiple references help , when learning …
- Exp 3
-
…a model from similar reference models?
- Exp 4
-
…a protocol implementation from reference implementations?
Setup. We implement on top of the LearnLib implementation333Obtained from https://github.com/UCL-PPLV/learnlib.git [9]. We invoke conformance testing for the EQs, using the random Wp method from LearnLib with minimal size and random length 444These hyperparameters are discussed in the LearnLib documentation, learnlib.de.. We run all experiments with 30 seeds. We measure the performance of the algorithms based on the number of inputs sent to the SUL during both OQs and EQs: Fewer is better.
Algorithm | ||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
115.2 | 24.2 | 49.4 | 69.7 | 78.7 | 60.5 | 50.7 | 132.9 | 294.2 | 36.8 | 52.5 | 38.0 | 18.3 | 301.9 | |
KV | 123.5 | 17.8 | 49.6 | 60.1 | 68.9 | 58.7 | 44.9 | 103.7 | 244.3 | 25.5 | 28.7 | 28.0 | 7.5 | 253.6 |
101.7 | 14.3 | 50.0 | 49.2 | 73.0 | 58.7 | 39.9 | 100.1 | 313.9 | 25.4 | 38.9 | 28.0 | 8.0 | 234.9 | |
[6] | 132.7 | 19.8 | 22.5 | 25.0 | 32.7 | 26.0 | - | 178.0 | 375.0 | 24.7 | 25.4 | 44.1 | 8.9 | 256.3 |
IKV [9] | 114.8 | 18.6 | 1.6 | 2.4 | 0.9 | 0.8 | - | 56.6 | 373.9 | 11.0 | 2.1 | 1.1 | 5.8 | 7.0 |
(new!) | 1.2 | 0.5 | 1.5 | 0.8 | 0.8 | 0.8 | 0.6 | 68.1 | 141.1 | 1.4 | 1.3 | 0.8 | 1.9 | 7.2 |
(new!) | 101.7 | 12.3 | 1.7 | 9.4 | 1.1 | 7.9 | 0.7 | 68.2 | 306.1 | 12.6 | 2.8 | 1.7 | 6.4 | 7.9 |
(new!) | 1.2 | 0.5 | 3.5 | 5.2 | 9.1 | 7.2 | 0.7 | 63.0 | 36.8 | 8.7 | 9.8 | 10.8 | 5.7 | 7.1 |
(new!) | 1.2 | 0.5 | 1.7 | 2.7 | 2.0 | 2.1 | 0.7 | 70.6 | 186.5 | 6.0 | 6.1 | 1.7 | 4.8 | 7.4 |
(new!) | 1.2 | 0.5 | 1.5 | 0.8 | 1.0 | 0.8 | 0.6 | 69.3 | 38.7 | 3.1 | 2.0 | 1.0 | 4.5 | 7.3 |
Experiment 1.
We evaluate the performance of against non-adaptive and adaptive algorithms from the literature, in particular [2], KV [15], and [23] as well as [6] and (a Mealy machine adaptation of) IKV [9].
As part of an ablation study, we compare with simpler variations which we refer to as
,
,
,
. The subscripts indicate which rules are added:
: R + PP,
: MS + MR + PS,
: AMS + AMR + APS.
We learn six models from the AutomataWiki benchmarks [18] also used in [23]. We limit ourselves to six models because we mutate every model in 14 different ways (and for 30 seeds). The chosen models represent different types of protocols with varying number of states. We learn the mutated models using the original models, referred to as , as a reference. The mutations may add states, divert transitions, remove inputs, perform multiple mutations, or compose the model with a mutated version of the model. We provide details on the used models and mutations in App. 0.E.
Results. Table 2 shows for an algorithm (rows) and a mutation (columns) the total number of inputs () necessary to learn all models, summed over all seeds555 and IKV do not support removing input inputs, relevant for mutation M7.. The highlighted values indicate the best performing algorithm. We provide detailed pairwise comparisons between algorithms in App. 0.E.
Discussion. First, we observe that always outperforms non-adaptive learning algorithms, as is expected. By combining state matching and rebuilding, mostly outperforms algorithms from the literature, with IKV being competitive on some types of mutations. In we append to , outperforms because incorrectly matches states with states in , making it harder to learn the fragment. In the pairwise comparisons in App. 0.E, we see that performs much better on models GnuTLS, OpenSSH compared to other adaptive approaches. We conjecture that this effect occurs, as these models are hard to learn in general (high number of total inputs) and thus the potential benefit of is higher.
Experiment 2. We evaluate , , IKV and on benchmarks that contain reference models. Adaptive-OpenSSL [7], used in [6], contains models learned from different git development branches for the OpenSSL server side. Adaptive-Philips [19] contains models representing some legacy code which evolved over time due to bug fixes and allowing more inputs.
![Refer to caption](x13.png)
[][Summed inputs in millions for learning some .]
{,
{,
,
,
SUL
{}
{}
}
}
33.1
52.7
17.4
22.9
[][Summed inputs in millions for learning some mutated .]
SUL
{}
{}
{,}
68.1
96.3
25.7
141.1
263.0
35.3
7.2
212.1
3.2
Results. Fig. 4(a) shows the mean total number of inputs required for learning a model from the associated reference model, depicting the percentile (line) and average (mark) over the seeds.
Discussion. We observe that and perform worse than . often outperforms IKV by a factor 2-4, despite that these models are relatively small and thus easy to learn.
Experiment 3. We evaluate with one or multiple references on the models used in Experiment 1. We either (1) learn using several mutations of or (2) learn a mutation that represents a combination of the and .
Results. Tables 4, 4 show for every type of SUL (rows) and every set of references (columns) the total number of inputs () necessary to learn all models, summed over all seeds. Highlighted values indicate the best performing set of references. Column in Table 4 corresponds to values in row of Table 2; they are added in Table 4 for clarity.
Discussion. We observe that using multiple references outperforms using one reference, as is expected. We hypothesize that learning with reference instead of often leads to an increase in total inputs because is less complex due to the random transitions. Therefore, discovering states belonging to the fragment in , and becomes more difficult.
Experiment 4. We evaluate the performance of with one or multiple references on learning DTLS and TCP models from AutomataWiki666References represent related models instead of previous models as in Experiment 2.. We consider seven DTLS implementations selected to have the same key exchange algorithm and certification requirement. We consider three TCP client implementations.
![Refer to caption](x14.png)
![Refer to caption](x15.png)
Results. Fig. 5 shows the required inputs for learning (x-axis) with only the reference model indicated by the colored data point, averaged over the seeds. For each DTLS model, we included learning with the as a reference model. The mark indicates using all models except the as references, the mark indicates using no references, e.g., non-adaptive .
Discussion. We observe that using all references except usually performs as well as the best performing reference model that is distinct from . In scand-lat, using a set of references outperforms single reference models, almost matching the performance of learning with as a reference.
9 Conclusion
We introduced the adaptive algorithm (), a new algorithm for adaptive active automata learning that allows to flexibly use domain knowledge in the form of (preferably similar) reference models and thereby aims to reduce the sample complexity for learning new models. Experiments show that the algorithm can lead to significant improvements over the state-of-the-art (Sec. 8).
9.0.1 Future work.
Approximate state matching is sometimes too eager and may mislead the learner, as happens for in Experiment 1 (Sec. 8). This may be addressed by only applying matching rules when the matching degree is above some threshold. It is currently unclear how to determine an appropriate threshold.
Further, adaptive methods typically perform well when the reference model and SUL are similar [12]. We would like to dynamically determine which (parts of) reference models are similar, and incorporate this in the rebuilding rule.
Adaptive AAL allows the re-use of information in the form of a Mealy machine. Other sources of information that can be re-used in AAL are, for instance, system logs, realised by combining active and passive learning [25, 1]. An interesting direction of research is the development of a more general methodology that allows the re-use of various forms of previous knowledge.
References
- [1] Bernhard K. Aichernig, Edi Muskardin, and Andrea Pferscher. Active vs. passive: A comparison of automata learning paradigms for network protocols. In FMAS/ASYDE@SEFM, volume 371 of EPTCS, pages 1–19, 2022.
- [2] Dana Angluin. Learning regular sets from queries and counterexamples. Inf. Comput., 75(2):87–106, 1987.
- [3] Kousar Aslam, Loek Cleophas, Ramon R. H. Schiffelers, and Mark van den Brand. Interface protocol inference to aid understanding legacy software components. Softw. Syst. Model., 19(6):1519–1540, 2020.
- [4] Alexander Bainczyk, Bernhard Steffen, and Falk Howar. Lifelong learning of reactive systems in practice. In The Logic of Software. A Tasting Menu of Formal Methods, volume 13360 of LNCS, pages 38–53. Springer, 2022.
- [5] Sagar Chaki, Edmund M. Clarke, Natasha Sharygina, and Nishant Sinha. Verification of evolving software via component substitutability analysis. Formal Methods Syst. Des., 32(3):235–266, 2008.
- [6] Carlos Diego Nascimento Damasceno, Mohammad Reza Mousavi, and Adenilso da Silva Simão. Learning to reuse: Adaptive model learning for evolving systems. In IFM, volume 11918 of LNCS, pages 138–156. Springer, 2019.
- [7] Joeri de Ruiter. A tale of the OpenSSL state machine: A large-scale black-box analysis. In NordSec, volume 10014 of LNCS, pages 169–184, 2016.
- [8] Joeri de Ruiter and Erik Poll. Protocol state fuzzing of TLS implementations. In USENIX Security Symposium, pages 193–206. USENIX Association, 2015.
- [9] Tiago Ferreira, Gerco van Heerdt, and Alexandra Silva. Tree-based adaptive model learning. In A Journey from Process Algebra via Timed Automata to Model Learning, volume 13560 of LNCS, pages 164–179. Springer, 2022.
- [10] Alex Groce, Doron A. Peled, and Mihalis Yannakakis. Adaptive model checking. Log. J. IGPL, 14(5):729–744, 2006.
- [11] Falk Howar and Bernhard Steffen. Active automata learning in practice - an annotated bibliography of the years 2011 to 2016. In Machine Learning for Dynamic Software Analysis, volume 11026 of LNCS, pages 123–148. Springer, 2018.
- [12] David Huistra, Jeroen Meijer, and Jaco van de Pol. Adaptive learning for learn-based regression testing. In FMICS, volume 11119 of LNCS, pages 162–177. Springer, 2018.
- [13] Malte Isberner, Falk Howar, and Bernhard Steffen. The TTT algorithm: A redundancy-free approach to active automata learning. In RV, volume 8734 of LNCS, pages 307–322. Springer, 2014.
- [14] Malte Isberner, Falk Howar, and Bernhard Steffen. The open-source learnlib - A framework for active automata learning. In CAV (1), volume 9206 of LNCS, pages 487–495. Springer, 2015.
- [15] Michael J. Kearns and Umesh V. Vazirani. An Introduction to Computational Learning Theory. MIT Press, 1994. URL: https://mitpress.mit.edu/books/introduction-computational-learning-theory.
- [16] Loes Kruger, Sebastian Junges, and Jurriaan Rot. State Matching and Multiple References in Adaptive Active Automata Learning: Supplementary Material, June 2024. doi:10.5281/zenodo.12517574.
- [17] Edi Muskardin, Bernhard K. Aichernig, Ingo Pill, Andrea Pferscher, and Martin Tappler. AALpy: An active automata learning library. In ATVA, volume 12971 of LNCS, pages 67–73. Springer, 2021.
- [18] Daniel Neider, Rick Smetsers, Frits W. Vaandrager, and Harco Kuppens. Benchmarks for automata learning and conformance testing. In Models, Mindsets, Meta, volume 11200 of LNCS, pages 390–416. Springer, 2018.
- [19] Mathijs Schuts, Jozef Hooman, and Frits W. Vaandrager. Refactoring of legacy software using model learning and equivalence checking: An industrial experience report. In IFM, volume 9681 of LNCS, pages 311–325. Springer, 2016.
- [20] Rick Smetsers, Joshua Moerman, and David N. Jansen. Minimal separating sequences for all pairs of states. In LATA, volume 9618 of LNCS, pages 181–193. Springer, 2016.
- [21] Martin Tappler, Bernhard K. Aichernig, and Roderick Bloem. Model-based testing IoT communication via active automata learning. CoRR, abs/1904.07075, 2019.
- [22] Frits W. Vaandrager. Model learning. Commun. ACM, 60(2):86–95, 2017.
- [23] Frits W. Vaandrager, Bharat Garhewal, Jurriaan Rot, and Thorsten Wißmann. A new approach for active automata learning based on apartness. In TACAS (1), volume 13243 of LNCS, pages 223–243. Springer, 2022.
- [24] Stephan Windmüller, Johannes Neubauer, Bernhard Steffen, Falk Howar, and Oliver Bauer. Active continuous quality control. In CBSE, pages 111–120. ACM, 2013.
- [25] Nan Yang, Kousar Aslam, Ramon R. H. Schiffelers, Leonard Lensink, Dennis Hendriks, Loek Cleophas, and Alexander Serebrenik. Improving model inference in industry by combining active and passive learning. In SANER, pages 253–263. IEEE, 2019.
Appendix 0.A Additional Definition, Figure, Table and Algorithm
We define how to fold back an observation tree to a complete Mealy machine.
Definition 0.A.1
Let be an observation tree for SUL . If each basis state has a transition for every input and each frontier state is identified with a basis state, then is folded back to complete Mealy machine where for all and :
In Fig. 6, we show the scenarios in the observation tree and the reference model necessary to apply the rebuilding, match refinement, match separation and prioritized separation rules.
![Refer to caption](x16.png)
![Refer to caption](x17.png)
![Refer to caption](x18.png)
![Refer to caption](x19.png)
In Algorithm 1, we list the rules used for in a non-deterministic ordering.
Table 3 shows the pre- and postconditions of the approximate matching variations of the state matching rules.
Rule | Parameters | Precondition | Postcondition | |
---|---|---|---|---|
Sec 5.3 | approximate match separation | , | ||
, | ||||
s.t. | ||||
approximate match refinement | ||||
approximate prioritized separation | , | |||
s.t. | ||||
Appendix 0.B Proofs of Section 4
Proof of Lemma 1
Proof
Let , and . Suppose
-
(1)
,
-
(2)
,
-
(3)
For all , ,
-
(4)
For all , , where we write for conciseness.
We prove for all , holds from either assumptions (1)-(4) or because the assumptions validate preconditions for the rebuilding rule and after applying the rule we find the required result. Suppose we have a specific . If holds, we are done. From now, assume (5) .
From (4) we derive that (6) or . Otherwise, and which implies under assumption . However, contradicts (5).
From assumptions (1)-(3),(5),(6), we know rebuilding can be applied which leads to OQ and . After the OQs, we know and , combining this with proves that . Thus, for every , , which is exactly the definition of isolated.
Proof of Theorem 4.1
Proof
Let be the number of equivalence classes (w.r.t. language equivalence) in the reachable part of . We prove that whenever the basis does not contain elements, then there always exists an access sequence in that leads to a state that can be isolated using the rebuilding rule. Using recursive reasoning, this proves that the basis contains states whenever prioritized promotion and rebuilding are not applicable anymore. Let denote the current basis, frontier and observation tree. From the Theorem statement we know:
-
(1)
matches ,
-
(2)
States can only be promoted using prioritized promotion.
We also use the following general assumptions from the paper:
-
(3)
is minimal,
-
(4)
is prefix-closed,
-
(5)
and are complete.
First, we note that the state cover and separating family are computed on , which means that both only contain sequences in the alphabet . Because of (3), we know there are equivalence classes in the reachable part of . From (1) and (5), we derive that for all , . This implies that .
If , we are done. Otherwise, . From (1), (3) and , we know that some state in , reachable with a sequence from , has not been discovered yet. Because of (4), this state must be reachable from the basis with one input symbol. In other words, there must exist a basis state and such that and .
From (1), we know that for it must hold that because and are computed on . From (2), we know that for each , .
Therefore, we can apply Lemma 1 and this will lead to being isolated. Using the prioritized promotion rule, we can add to the basis, leading to and we can apply the recursive reasoning to find a new state to promote or to terminate with the required result.
Note that the precise ordering of prioritized promotion and rebuilding is irrelevant. We can never promote states that we do not want to promote. Moreover, when a state is isolated, it can never be un-isolated. Therefore, applying the rebuilding rule while prioritized promotion can be applied never leads to problems. Finally, rebuilding cannot be applied for ever (see termination proof 6.1), therefore we have to use prioritized promotion at some point.
Appendix 0.C Proofs of Section 5
Proof of Lemma 2
Proof
Let and . Suppose
-
(1)
,
-
(2)
,
-
(3)
For all , .
We prove that for all , holds. Suppose we have a specific . If already holds, we are done. From now, assume (4) . Normally, is not a transitive relation, however, because is an observation tree for , for all . Therefore, we can derive from (2). From (1)-(4), we know all preconditions required for match separation hold. We apply the rule and execute OQ with . Note here that with must exist due to (3). After the OQ, we have (5) and from (3) we derive (6) . Because and , it must be that (7) . Combining (5), (6) and (7) proves .
Proof of Theorem 5.1
Proof
Let be the SUL and the reference with and both complete Mealy machines. Moreover, let be equivalent to but possibly has a different initial state. From this, we derive that (1) there exists a state such that is language equivalent to . Let be the number of states in the reachable part of . Let denote the current basis, frontier and observation tree.
We prove that if , then we can add some state to the basis after applying match refinement, match separation, promotion, extension until none of them are applicable anymore. This trivially terminates when we reach .
Suppose . There must be some and such that (2) represents an equivalence class that is different from the equivalence classes for all . We perform a case distinction on the location of in the current observation tree.
-
•
Suppose , this immediately contradicts (2).
-
•
Suppose , then we can apply the extension rule resulting in .
-
•
Suppose and is isolated, then we can apply promotion.
-
•
Suppose (3) and is not isolated. Moreover, from (1) we derive that there exists a state such that (4) and (5) is language equivalent to . We perform a case distinction based on whether for some and show that for each case we can derive a contradiction or apply a rule to make progress.
-
–
Suppose (6) there exists a such that . From (1) we derive that (7) there must exist some state that is language equivalent to .
We derive that state (8) is language equivalent to an equivalence class that is different from the equivalence classes for all because is language equivalent to (derived from (5)) and (2). Moreover, (9) is language equivalent to a state already in the basis (7). By combining (8) and (9), we find that . Moreover, because and are in and they represent different equivalence classes, sequence exists. This means we can apply match refinement with and , resulting in because otherwise (7) leads to a contradiction.
This reasoning can be applied for any such that , resulting in for all after multiple applications of match refinement. In this case, we can continue with the case below. -
–
Suppose there does not exist a such that . In this case, we can apply Lemma 2 with . This results in being isolated. We can apply promotion which increases the size of the basis.
-
–
-
•
Suppose and , this contradicts the assumption that and .
Note that the precise ordering of the promotion, extension, match refinement and match separation is irrelevant. We discuss the reasoning for each rule.
- Promotion
-
States that are isolated can never become un-isolated, therefore, applying other rules before promotion can never lead to problems.
- Extension
-
If we apply rules before applying extension then either extension is not necessary anymore or we can still apply it but both lead to .
- Match refinement
-
The only goal of match refinement is to refine the matching. If two reference states match a basis state, we can perform an OQ that leads to one of the reference states no longer being a match. If we apply one of the other rules before match refinement which already leads to this result, then we do not have to perform match refinement but obtain the same result.
- Match separation
-
In this Theorem, the match separation rule always leads to a new apartness pair between the frontier state and the basis. If some other rule already shows the required apartness pair, we do not have to apply match separation but obtain the same result.
Proof of Lemma 3
Proof
Let be an observation tree, a reference model, and . Suppose . In particular, for all and such that ,
This is equivalent to for all such that
Because we assume reference models are complete w.r.t. their own alphabet , the reference is complete w.r.t. , this implies for all such that and
which is precisely .
Appendix 0.D Proofs of Section 6
Before we prove the complexity for , we define and prove an additional termination Theorem. We prove termination of by proving that each rule lowers the ranking function. To keep consistent with the complexity proof [23], we actually prove that each rule increases some norm and the norm is bounded by the SUL. Specifically, we use norm :
(1) |
where indicates the slightly adapted norm from [23]. The abbreviations for the summands are defined as follows.
The summand keeps track of which separating sequences of the reference model have been applied to basis states in the new observation tree. The summand keeps track of unmatched states between states in the basis or frontier and the reference model. The summand keeps track of separating sequences from the reference model applied to pairs of basis and frontier states. These summands are motivated by the postconditions in Table 1.
Theorem 0.D.1
Every rule application in increases norm .
Proof
Let denote the values before and denote the values after the respective rule application. Let denote the reference model. We reuse abbreviations from [23] and the norm definition above
The proof that the rules promotion, extension, separation and equivalence increase is similar to the proof in [23]. However, we slightly adapted which has influence on the proof for promotion and separation and we assume stronger guarantees for equivalence. It remains to show that combined with the new summands the total norm still increases. Therefore, we include the proofs for the rules here.
- Rebuilding
-
Let , and . We assume
-
1.
-
2.
,
-
3.
, ,
-
4.
or
-
5.
.
The algorithm performs two queries OQs and . After these OQs, the traces and are defined. Particularly, because of assumption (1) and , we know after the OQs. Combining this with (4), we find
Note that we implicitly use (3) and (5) to ensure that exists. In some cases we might find that which, together with (2), indicates
Otherwise . Additionally,
Thus, .
-
1.
- Prioritized promotion
-
Let . Suppose (1) is isolated and suppose (2) . State is moved from to , i.e. , then we have
Because we move something from the frontier to the basis, we find
and thus
Finally,
The total norm increases because
- Promotion
-
Analogous to the proof for prioritized promotion.
- Extension
- Separation
- Match separation
-
Let , , , and . Suppose
-
1.
,
-
2.
,
-
3.
There is no such that ,
-
4.
There exists such that and .
After OQ we find either
If , then
If , then
Additionally,
Thus, .
-
1.
- Match refinement
-
Let and . Suppose and . Note that when using approximate matching this does not imply that and . After OQ with , we find
Additionally,
Because remains unchanged, we have .
- Prioritized separation
-
Analogous to the proof for separation, the additional condition on does not change the postcondition.
- Equivalence
-
Suppose all are identified and for all and , . These conditions are stronger than the conditions from [23]. Therefore, we know at least holds. Additionally,
Thus, .
Proof of Theorem 6.1
Proof
First, we prove that if is an observation tree for , then
The first part follows from Theorem 3.9 in [23] with some minor adjustments. The set contains at most elements and contains at most elements. Each state in can be apart from at most other states in . Therefore,
Since the set contains at most elements and each state in can be apart from at most states from , we have
The set contains at most elements and each pair has at most one , thus we have
Combining everything and simplifying it leads to
The ordering on the rules never block the algorithm and when the norm cannot be increased further, the only applicable rule is the equivalence rule which is guaranteed to lead to the teacher accepting the hypothesis. Therefore, the correct Mealy machine is learned within rule applications.
In , every (non-terminating) application of the equivalence rule leads to a new basis state. Since the basis is bounded by the number of states in the SUL, which is , there can be at most applications of the equivalence rule. Each call to ProcCounterEx requires at most output queries (see Theorem 3.11 of [23]).
All rules, except for the equivalence rule, require at most two OQs per rule application. Therefore, the application of these rules requires OQs. Combining everything, we find that requires and at most EQs.
Appendix 0.E Additional Experiment Information
0.E.1 Experiment Models
In Experiments 1 and 3, we use the following six models, available here under Mealy machine benchmarks.
-
•
learnresult_fix
-
•
DropBear
-
•
OpenSSH
-
•
model1
-
•
NSS_3.17.4_server_regular
-
•
GnuTLS_3.3.8_client_full
Due to the mutations, this means that the largest model that we can learn has 62 states (). In Experiment 2, we use the ordering for the Adaptive-OpenSSL models as implied by Fig. 5 in [6]. The ordering taken for the Adaptive-Philips is chronological. In Experiment 4, we use the client TCP models. Additionally, we use the following DTLS models.
-
•
ctinydtls_ecdhe_cert_req.dot
-
•
etinydtls_ecdhe_cert_req.dot
-
•
gnutls-3.6.7_all_cert_req.dot
-
•
mbedtls_all_cert_req.dot
-
•
scandium-2.0.0_ecdhe_cert_req.dot
-
•
scandium_latest_ecdhe_cert_req.dot
-
•
wolfssl-4.0.0_dhe_ecdhe_rsa_cert_req.dot
0.E.2 Mutation Explanations
In this section, we call the input Mealy machine . Every mutation is applied exactly once to generate the mutated model.
: New initial state. This mutation adds a new initial state called dummy as well as a fresh symbol to . From state dummy, all self loop with the output from (the previous initial state). The fresh symbol transitions from dummy to . The fresh symbol self loops in all (other) states with the output of where is the first input in the alphabet.
: Change the initial state. This mutation randomly selects one of the states in to pick as the new initial state. Because is not necessarily strongly connected, the number of states in the resulting Mealy machine might be lower.
: Add a state. This mutation adds a new state to . We randomly select a state from and and change the destination of this transition to , this ensures is reachable. For all , we randomly select a destination state and use the output of the time or a random output .
: Remove a state. This mutation removes a non-initial state from . All transitions that lead from to with input are shortcutted to with output . If , we self loop in .
: Divert a transition. This mutation randomly selects and . We set . While is equivalent to the resulting Mealy machine, we choose a new and set .
: Change transition output. This mutation randomly selects , and . We set such that is distinct from the original .
: Remove a symbol. This mutation removes a symbol from the input alphabet. Consequently, all the transitions with are not contained in the resulting Mealy machine.
: Appending a mutated model. This mutation takes a and a natural number . It first makes a second Mealy machine by applying to . Then it appends to at the state of which we call , i.e., for some random . The natural numbers are chosen based on visual inspection of the models, we consistently choose a state that represents the end of a model. This end state is either the sink state or a state at the end of a very long trace in the model which transitions to the sink state.
: Prepending a mutated model. This mutation takes a and a natural number . It first makes a second Mealy machine by appling to . Then it appends to at the state of which we call , i.e., for some random .
: Several mutations. This mutation applies mutations , , and to in this particular order.
: Several mutations with different initial state. This mutation applies , , , and to in this particular order.
: Changing many transitions. This mutation applies , , , , , to .
: Many mutations. This mutation applies three times to .
: Union. This mutation takes and makes a second Mealy machine by applying to . We combine and by creating one new dummy initial state with two fresh symbols for which one goes to and the other to . The fresh symbols and other transitions are handled in the same way as in .
0.E.3 Additional Figure Experiment 1
In Fig. 7, we show additional pairwise comparison plots from Experiment 1. Each plot compares a pair of algorithms per model and mutation, where a point represents that the algorithm on the x-axis required symbols over all seeds and the algorithm on the y-axis requires symbols. Points below the diagonal indicate that the y-algorithm outperforms the x-algorithm, points below the dashed (dotted) line indicate a factor two (ten) improvement, respectively.
![Refer to caption](x20.png)
![Refer to caption](x21.png)
![Refer to caption](x22.png)
![Refer to caption](x23.png)
![Refer to caption](x24.png)
![Refer to caption](x25.png)
0.E.4 Additional Tables Experiment 2
Tables 5 and 4 display the mean number of inputs per model and algorithm in the same style as Table 2. The reference row indicates which reference model was used for the adaptive algorithms. THe teal values indicate the lowest, and therefore, best score.
Algorithm | model2 | model3 | model4 | model5 | model6 |
---|---|---|---|---|---|
Reference | model1 | model2 | model3 | model4 | model5 |
657 | 2196 | 2196 | 5340 | 5340 | |
KV | 256 | 1671 | 1672 | 2128 | 2128 |
212 | 862 | 862 | 1730 | 1730 | |
657 | 2325 | 2650 | 2997 | 4520 | |
IKV | 160 | 814 | 458 | 770 | 841 |
146 | 918 | 580 | 1592 | 1043 | |
161 | 954 | 749 | 1765 | 1382 | |
167 | 956 | 590 | 1775 | 1178 | |
152 | 920 | 590 | 1602 | 1178 | |
161 | 954 | 580 | 1765 | 1043 |
Algorithm | 097c | 097e | 098f | 098l | 098m | 098s | 098u | 098za | 100 | 100f | 100h | 100m | 101 | 101h | 101k | 102 | 110pre1 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Reference | 097 | 097c | 097e | 098f | 098l | 098m | 098s | 098u | 098m | 100 | 100f | 100h | 100h | 100 | 101h | 101k | 102 |
21273 | 21273 | 31608 | 2408 | 2065 | 2065 | 2506 | 1820 | 2065 | 2065 | 2506 | 1820 | 3143 | 1820 | 1477 | 1281 | 1134 | |
KV | 22434 | 19376 | 24754 | 6103 | 4267 | 4504 | 5634 | 3686 | 4267 | 4659 | 6492 | 3545 | 8423 | 3545 | 2764 | 2239 | 1178 |
23512 | 23305 | 25412 | 6786 | 5145 | 5934 | 6562 | 3684 | 5145 | 5819 | 6283 | 3983 | 8938 | 3983 | 3075 | 2293 | 1452 | |
5155 | 5155 | 5155 | 3203 | 2065 | 2065 | 2317 | 2363 | 2065 | 2065 | 2317 | 2430 | 3820 | 2430 | 2115 | 1281 | 1134 | |
IKV | 3290 | 4872 | 1506 | 2945 | 2326 | 2875 | 3789 | 792 | 876 | 3153 | 3398 | 792 | 2033 | 831 | 636 | 977 | 376 |
3808 | 1391 | 1391 | 861 | 756 | 737 | 2100 | 638 | 751 | 737 | 1953 | 638 | 1778 | 638 | 514 | 461 | 665 | |
19632 | 1397 | 1397 | 843 | 756 | 737 | 2109 | 642 | 751 | 737 | 1963 | 642 | 1791 | 642 | 518 | 1545 | 1538 | |
23346 | 17503 | 1417 | 7358 | 5458 | 6437 | 6441 | 5081 | 768 | 6619 | 7054 | 4804 | 1800 | 4804 | 532 | 2606 | 601 | |
3558 | 17201 | 1417 | 2764 | 3856 | 2641 | 3365 | 657 | 768 | 3110 | 3974 | 657 | 1800 | 664 | 532 | 464 | 667 | |
19683 | 1397 | 1391 | 843 | 756 | 737 | 2109 | 638 | 751 | 737 | 1963 | 638 | 1782 | 638 | 520 | 937 | 663 |
Appendix 0.F Detailed Example Run of
In this section, we give a detailed explanation of how can be learned with references and using . From the references, we derive the following state cover and separating family:
-
1.
always start with an observation tree containing only the root node.
-
2.
The first rule we apply is the rebuilding rule. We apply this rule with (the root state) and because the conditions hold:
-
•
because ,
-
•
because ,
-
•
,
-
•
,
-
•
with .
We execute and .
-
•
-
3.
We can now apply prioritized promotion with because . The resulting observation tree looks as follows:
-
4.
Next, we try to promote the state reached by in . Note that we cannot apply rebuilding with , and because and and .
We do apply rebuilding with , and . All the conditions hold and . This leads to output queries and . -
5.
We can apply prioritized promotion with because and .
-
6.
We again use the rebuilding rule for and . All the conditions hold and we use . We execute the queries and . This leads to the following observation tree. Note that we cannot promote .
As stated in the Section 7, we only apply the rebuilding rule with and from the same reference model. We have not explored yet but because these access sequences do not reach frontier states, we cannot apply the rebuilding rule any further. The prioritized promotion rule can also not be applied. Therefore, we move on to the other rules.
-
7.
Next, we apply the extension rule for all the current basis states. This means the following output queries are performed: , , . This results in the following observation tree.
-
8.
We compute the matching table and then apply prioritized separation:
state match 1.0 7/7 5/7 4/7 0/4 0/4 0/4 1/4 1.0 3/4 4/4 2/4 0/4 0/4 0/4 1/4 1.0 1/2 1/2 2/2 0/2 0/2 0/2 1/2 -
•
To separate from the basis, we can use the separating sequence because and is the expected matching reference state of because , and . Therefore, we execute .
-
•
To separate from the basis, we can use the separating sequence because and is the expected matching reference state of because , and . Therefore, we execute .
-
•
To separate from the basis, we can use the separating sequence because and is the expected matching reference state of because , and . Therefore, we execute .
-
•
To separate from the basis, we can use the separating sequence because and is the expected matching reference state of because , and . Therefore, we execute .
-
•
-
9.
Next, we apply the promotion rule for because , and . The resulting observation tree looks as follows:
-
10.
Next, we apply extension with and input , resulting in .
-
11.
We perform another round of prioritized separation with the following matching table:
state match 0.857 12/14 8/14 7/14 2/6 2/6 1/6 3/6 1.0 5/9 9/9 5/9 0/5 0/5 0/5 1/5 1.0 3/5 2/5 5/5 0/3 0/3 0/3 1/3 1.0 2/3 1/3 1/3 2/2 2/2 1/2 0/2 -
•
To separate further, we execute .
-
•
To separate further, we execute .
-
•
To separate , we can use the separating sequence because and and are the expected matching reference states of because , and , and . Therefore, we execute .
-
•
To separate , we can use the separating sequence because and and are the expected matching reference states of because , and , and . Therefore, we execute .
The resulting observation tree looks as follows:
-
•
-
12.
We can no longer apply prioritized separation so we continue with standard separation. We use the separating sequences and to separate states. Specifically, we perform the following output queries:
-
•
and to separate from the basis.
-
•
to separate from the basis.
-
•
to separate from the basis.
-
•
and to separate from the basis.
The resulting observation tree looks as follows:
-
•
-
13.
All frontier states are identified and no frontier states are isolated. This means we can possibly perform match refinement or match separation. The matching table looks as follows:
state match 0.834 15/18 10/18 8/18 4/9 3/9 2/9 6/9 1.0 6/11 11/11 5/11 0/7 0/7 2/7 2/7 1.0 4/6 2/6 6/6 0/4 0/4 1/4 2/4 1.0 2/5 2/5 2/5 4/4 3/4 1/4 1/4 Since every basis state is matched with exactly one reference state, match refinement is not applicable. However, we can apply match separation with , , and because , , , , and for all , is not the match. We use because and execute . This leads to the following observation tree:
-
14.
The match separation led to isolation of which we can now add to the basis with the promotion rule.
-
15.
Additionally, we can apply promotion for state because it is the only state with output 2 for input . Notice that we have found all the states in after promoting .
-
16.
Next, we apply the extension rule several times
-
•
and for ,
-
•
and for .
-
•
-
17.
We apply the prioritized separation rule several times
-
•
and for ,
-
•
for ,
-
•
and for ,
-
•
and for ,
-
•
-
18.
Next, we apply separation a few more times.
-
•
and for ,
-
•
and for ,
-
•
for ,
-
•
for ,
This leads to the following final observation tree and matching table:
state match 0.834 15/18 10/18 8/18 12/18 7/18 5/18 14/18 1.0 6/11 11/11 5/11 0/7 0/7 2/7 2/7 1.0 4/6 2/6 6/6 0/4 0/4 1/4 2/4 0.924 2/5 2/5 2/5 12/13 7/13 4/13 5/13 0.889 2/5 2/5 2/5 4/9 8/9 4/9 3/9 0.8 2/5 2/5 2/5 1/5 1/5 4/5 2/5 -
•
-
19.
Next, the equivalence rule is used to construct a hypothesis from the observation tree. This hypothesis is correct so the algorithm terminates.