Search | arXiv e-print repository

Learning EFSM Models with Registers in Guards

Authors: Germán Vega, Roland Groz, Catherine Oriat, Michael Foster, Neil Walkinshaw, Adenilso Simão

Abstract: This paper presents an active inference method for Extended Finite State Machines, where inputs and outputs are parametrized, and transitions can be conditioned by guards involving input parameters and internal variables called registers. The method applies to (software) systems that cannot be reset, so it learns an EFSM model of the system on a single trace. This paper presents an active inference method for Extended Finite State Machines, where inputs and outputs are parametrized, and transitions can be conditioned by guards involving input parameters and internal variables called registers. The method applies to (software) systems that cannot be reset, so it learns an EFSM model of the system on a single trace. △ Less

Submitted 11 June, 2024; originally announced June 2024.

Comments: 14 pages (last page blank), 8 figures, 4 algorithms Submitted to LearnAut workshop 2024 (not published)

arXiv:2405.17019 [pdf, other]

doi 10.1145/3660819

Bounding Random Test Set Size with Computational Learning Theory

Authors: Neil Walkinshaw, Michael Foster, Jose Miguel Rojas, Robert M Hierons

Abstract: Random testing approaches work by generating inputs at random, or by selecting inputs randomly from some pre-defined operational profile. One long-standing question that arises in this and other testing contexts is as follows: When can we stop testing? At what point can we be certain that executing further tests in this manner will not explore previously untested (and potentially buggy) software b… ▽ More Random testing approaches work by generating inputs at random, or by selecting inputs randomly from some pre-defined operational profile. One long-standing question that arises in this and other testing contexts is as follows: When can we stop testing? At what point can we be certain that executing further tests in this manner will not explore previously untested (and potentially buggy) software behaviors? This is analogous to the question in Machine Learning, of how many training examples are required in order to infer an accurate model. In this paper we show how probabilistic approaches to answer this question in Machine Learning (arising from Computational Learning Theory) can be applied in our testing context. This enables us to produce an upper bound on the number of tests that are required to achieve a given level of adequacy. We are the first to enable this from only knowing the number of coverage targets (e.g. lines of code) in the source code, without needing to observe a sample test executions. We validate this bound on a large set of Java units, and an autonomous driving system. △ Less

Submitted 24 June, 2024; v1 submitted 27 May, 2024; originally announced May 2024.

Comments: Preprint, accepted to FSE 2024

arXiv:2209.00357 [pdf, other]

Testing Causality in Scientific Modelling Software

Authors: Andrew G. Clark, Michael Foster, Benedikt Prifling, Neil Walkinshaw, Robert M. Hierons, Volker Schmidt, Robert D. Turner

Abstract: From simulating galaxy formation to viral transmission in a pandemic, scientific models play a pivotal role in develo** scientific theories and supporting government policy decisions that affect us all. Given these critical applications, a poor modelling assumption or bug could have far-reaching consequences. However, scientific models possess several properties that make them notoriously diffic… ▽ More From simulating galaxy formation to viral transmission in a pandemic, scientific models play a pivotal role in develo** scientific theories and supporting government policy decisions that affect us all. Given these critical applications, a poor modelling assumption or bug could have far-reaching consequences. However, scientific models possess several properties that make them notoriously difficult to test, including a complex input space, long execution times, and non-determinism, rendering existing testing techniques impractical. In fields such as epidemiology, where researchers seek answers to challenging causal questions, a statistical methodology known as Causal Inference has addressed similar problems, enabling the inference of causal conclusions from noisy, biased, and sparse data instead of costly experiments. This paper introduces the Causal Testing Framework: a framework that uses Causal Inference techniques to establish causal effects from existing data, enabling users to conduct software testing activities concerning the effect of a change, such as Metamorphic Testing, a posteriori. We present three case studies covering real-world scientific models, demonstrating how the Causal Testing Framework can infer metamorphic test outcomes from reused, confounded test data to provide an efficient solution for testing scientific modelling software. △ Less

Submitted 30 June, 2023; v1 submitted 1 September, 2022; originally announced September 2022.

ACM Class: D.2.5; I.6.4

arXiv:2103.07370 [pdf, other]

doi 10.1016/j.infsof.2021.106567

Test case generation for agent-based models: A systematic literature review

Authors: Andrew G. Clark, Neil Walkinshaw, Robert M. Hierons

Abstract: Agent-based models play an important role in simulating complex emergent phenomena and supporting critical decisions. In this context, a software fault may result in poorly informed decisions that lead to disastrous consequences. The ability to rigorously test these models is therefore essential. In this systematic literature review, we answer five research questions related to the key aspects of… ▽ More Agent-based models play an important role in simulating complex emergent phenomena and supporting critical decisions. In this context, a software fault may result in poorly informed decisions that lead to disastrous consequences. The ability to rigorously test these models is therefore essential. In this systematic literature review, we answer five research questions related to the key aspects of test case generation in agent-based models: What are the information artifacts used to generate tests? How are these tests generated? How is a verdict assigned to a generated test? How is the adequacy of a generated test suite measured? What level of abstraction of an agent-based model is targeted by a generated test? Our results show that whilst the majority of techniques are effective for testing functional requirements at the agent and integration levels of abstraction, there are comparatively few techniques capable of testing society-level behaviour. Additionally, we identify a need for more thorough evaluation using realistic case studies that feature challenging properties associated with a typical agent-based model. △ Less

Submitted 18 March, 2021; v1 submitted 12 March, 2021; originally announced March 2021.

arXiv:2101.04948 [pdf, other]

Deep State Inference: Toward Behavioral Model Inference of Black-box Software Systems

Authors: Foozhan Ataiefard, Mohammad Jafar Mashhadi, Hadi Hemmati, Niel Walkinshaw

Abstract: Many software engineering tasks, such as testing, and anomaly detection can benefit from the ability to infer a behavioral model of the software.Most existing inference approaches assume access to code to collect execution sequences. In this paper, we investigate a black-box scenario, where the system under analysis cannot be instrumented, in this granular fashion.This scenario is particularly pre… ▽ More Many software engineering tasks, such as testing, and anomaly detection can benefit from the ability to infer a behavioral model of the software.Most existing inference approaches assume access to code to collect execution sequences. In this paper, we investigate a black-box scenario, where the system under analysis cannot be instrumented, in this granular fashion.This scenario is particularly prevalent with control systems' log analysis in the form of continuous signals. In this situation, an execution trace amounts to a multivariate time-series of input and output signals, where different states of the system correspond to different `phases` in the time-series. The main challenge is to detect when these phase changes take place. Unfortunately, most existing solutions are either univariate, make assumptions on the data distribution, or have limited learning power.Therefore, we propose a hybrid deep neural network that accepts as input a multivariate time series and applies a set of convolutional and recurrent layers to learn the non-linear correlations between signals and the patterns over time.We show how this approach can be used to accurately detect state changes, and how the inferred models can be successfully applied to transfer-learning scenarios, to accurately process traces from different products with similar execution characteristics. Our experimental results on two UAV autopilot case studies indicate that our approach is highly accurate (over 90% F1 score for state classification) and significantly improves baselines (by up to 102% for change point detection).Using transfer learning we also show that up to 90% of the maximum achievable F1 scores in the open-source case study can be achieved by reusing the trained models from the industrial case and only fine tuning them using as low as 5 labeled samples, which reduces the manual labeling effort by 98%. △ Less

Submitted 12 October, 2021; v1 submitted 13 January, 2021; originally announced January 2021.

Comments: 17 pages,9 figures. arXiv admin note: text overlap with arXiv:2008.11856

arXiv:1610.06089 [pdf, other]

Finding Clustering Configurations to Accurately Infer Packet Structures from Network Data

Authors: Othman Esoul, Neil Walkinshaw

Abstract: Clustering is often used for reverse engineering network protocols from captured network traces. The performance of clustering techniques is often contingent upon the selection of various parameters, which can have a severe impact on clustering quality. In this paper we experimentally investigate the effect of four different parameters with respect to network traces. We also determining the optima… ▽ More Clustering is often used for reverse engineering network protocols from captured network traces. The performance of clustering techniques is often contingent upon the selection of various parameters, which can have a severe impact on clustering quality. In this paper we experimentally investigate the effect of four different parameters with respect to network traces. We also determining the optimal parameter configuration with respect to traces from four different network protocols. Our results indicate that the choice of distance measure and the length of the message has the most substantial impact on cluster accuracy. Depending on the type of protocol, the $n$-gram length can also have a substantial impact. △ Less

Submitted 19 October, 2016; originally announced October 2016.

Comments: 14 pages, 3 figures

arXiv:1608.03181 [pdf, other]

Uncertainty-Driven Black-Box Test Data Generation

Authors: Neil Walkinshaw, Gordon Fraser

Abstract: We can never be certain that a software system is correct simply by testing it, but with every additional successful test we become less uncertain about its correctness. In absence of source code or elaborate specifications and models, tests are usually generated or chosen randomly. However, rather than randomly choosing tests, it would be preferable to choose those tests that decrease our uncerta… ▽ More We can never be certain that a software system is correct simply by testing it, but with every additional successful test we become less uncertain about its correctness. In absence of source code or elaborate specifications and models, tests are usually generated or chosen randomly. However, rather than randomly choosing tests, it would be preferable to choose those tests that decrease our uncertainty about correctness the most. In order to guide test generation, we apply what is referred to in Machine Learning as "Query Strategy Framework": We infer a behavioural model of the system under test and select those tests which the inferred model is "least certain" about. Running these tests on the system under test thus directly targets those parts about which tests so far have failed to inform the model. We provide an implementation that uses a genetic programming engine for model inference in order to enable an uncertainty sampling technique known as "query by committee", and evaluate it on eight subject systems from the Apache Commons Math framework and JodaTime. The results indicate that test generation using uncertainty sampling outperforms conventional and Adaptive Random Testing. △ Less

Submitted 10 August, 2016; originally announced August 2016.

arXiv:1505.07987 [pdf, other]

SEPIA: Search for Proofs Using Inferred Automata

Authors: Thomas Gransden, Neil Walkinshaw, Rajeev Raman

Abstract: This paper describes SEPIA, a tool for automated proof generation in Coq. SEPIA combines model inference with interactive theorem proving. Existing proof corpora are modelled using state-based models inferred from tactic sequences. These can then be traversed automatically to identify proofs. The SEPIA system is described and its performance evaluated on three Coq datasets. Our results show that S… ▽ More This paper describes SEPIA, a tool for automated proof generation in Coq. SEPIA combines model inference with interactive theorem proving. Existing proof corpora are modelled using state-based models inferred from tactic sequences. These can then be traversed automatically to identify proofs. The SEPIA system is described and its performance evaluated on three Coq datasets. Our results show that SEPIA provides a useful complement to existing automated tactics in Coq. △ Less

Submitted 29 May, 2015; originally announced May 2015.

Comments: To appear at 25th International Conference on Automated Deduction

arXiv:1405.3623 [pdf, other]

Mining State-Based Models from Proof Corpora

Authors: Thomas Gransden, Neil Walkinshaw, Rajeev Raman

Abstract: Interactive theorem provers have been used extensively to reason about various software/hardware systems and mathematical theorems. The key challenge when using an interactive prover is finding a suitable sequence of proof steps that will lead to a successful proof requires a significant amount of human intervention. This paper presents an automated technique that takes as input examples of succes… ▽ More Interactive theorem provers have been used extensively to reason about various software/hardware systems and mathematical theorems. The key challenge when using an interactive prover is finding a suitable sequence of proof steps that will lead to a successful proof requires a significant amount of human intervention. This paper presents an automated technique that takes as input examples of successful proofs and infers an Extended Finite State Machine as output. This can in turn be used to generate proofs of new conjectures. Our preliminary experiments show that the inferred models are generally accurate (contain few false-positive sequences) and that representing existing proofs in such a way can be very useful when guiding new ones. △ Less

Submitted 14 May, 2014; originally announced May 2014.

Comments: To Appear at Conferences on Intelligent Computer Mathematics 2014

Showing 1–9 of 9 results for author: Walkinshaw, N