-
Closed-loop Analysis of Vision-based Autonomous Systems: A Case Study
Authors:
Corina S. Pasareanu,
Ravi Mangal,
Divya Gopinath,
Sinem Getir Yaman,
Calum Imrie,
Radu Calinescu,
Huafeng Yu
Abstract:
Deep neural networks (DNNs) are increasingly used in safety-critical autonomous systems as perception components processing high-dimensional image data. Formal analysis of these systems is particularly challenging due to the complexity of the perception DNNs, the sensors (cameras), and the environment conditions. We present a case study applying formal probabilistic analysis techniques to an exper…
▽ More
Deep neural networks (DNNs) are increasingly used in safety-critical autonomous systems as perception components processing high-dimensional image data. Formal analysis of these systems is particularly challenging due to the complexity of the perception DNNs, the sensors (cameras), and the environment conditions. We present a case study applying formal probabilistic analysis techniques to an experimental autonomous system that guides airplanes on taxiways using a perception DNN. We address the above challenges by replacing the camera and the network with a compact probabilistic abstraction built from the confusion matrices computed for the DNN on a representative image data set. We also show how to leverage local, DNN-specific analyses as run-time guards to increase the safety of the overall system. Our findings are applicable to other autonomous systems that use complex DNNs for perception.
△ Less
Submitted 6 February, 2023;
originally announced February 2023.
-
An Overview of Structural Coverage Metrics for Testing Neural Networks
Authors:
Muhammad Usman,
Youcheng Sun,
Divya Gopinath,
Rishi Dange,
Luca Manolache,
Corina S. Pasareanu
Abstract:
Deep neural network (DNN) models, including those used in safety-critical domains, need to be thoroughly tested to ensure that they can reliably perform well in different scenarios. In this article, we provide an overview of structural coverage metrics for testing DNN models, including neuron coverage (NC), k-multisection neuron coverage (kMNC), top-k neuron coverage (TKNC), neuron boundary covera…
▽ More
Deep neural network (DNN) models, including those used in safety-critical domains, need to be thoroughly tested to ensure that they can reliably perform well in different scenarios. In this article, we provide an overview of structural coverage metrics for testing DNN models, including neuron coverage (NC), k-multisection neuron coverage (kMNC), top-k neuron coverage (TKNC), neuron boundary coverage (NBC), strong neuron activation coverage (SNAC) and modified condition/decision coverage (MC/DC). We evaluate the metrics on realistic DNN models used for perception tasks (including LeNet-1, LeNet-4, LeNet-5, and ResNet20) as well as on networks used in autonomy (TaxiNet). We also provide a tool, DNNCov, which can measure the testing coverage for all these metrics. DNNCov outputs an informative coverage report to enable researchers and practitioners to assess the adequacy of DNN testing, compare different coverage measures, and to more conveniently inspect the model's internals during testing.
△ Less
Submitted 5 August, 2022;
originally announced August 2022.
-
Assume, Guarantee or Repair -- A Regular Framework for Non Regular Properties (full version)
Authors:
Hadar Frenkel,
Orna Grumberg,
Corina S. Pasareanu,
Sarai Sheinvald
Abstract:
We present Assume-Guarantee-Repair (AGR) - a novel framework which verifies that a program satisfies a set of properties and also repairs the program in case the verification fails. We consider communicating programs - these are simple C-like programs, extended with synchronous actions over communication channels. Our method, which consists of a learning-based approach to assume-guarantee reasonin…
▽ More
We present Assume-Guarantee-Repair (AGR) - a novel framework which verifies that a program satisfies a set of properties and also repairs the program in case the verification fails. We consider communicating programs - these are simple C-like programs, extended with synchronous actions over communication channels. Our method, which consists of a learning-based approach to assume-guarantee reasoning, performs verification and repair simultaneously: in every iteration, AGR either makes another step towards proving that the (current) system satisfies the required properties, or alters the system in a way that brings it closer to satisfying the properties. To handle infinite-state systems we build finite abstractions, for which we check the satisfaction of complex properties that contain first-order constraints, using both syntactic and semantic-aware methods. We implemented AGR and evaluated it on various communication protocols. Our experiments present compact proofs of correctness and quick repairs.
△ Less
Submitted 21 July, 2022;
originally announced July 2022.
-
VPN: Verification of Poisoning in Neural Networks
Authors:
Youcheng Sun,
Muhammad Usman,
Divya Gopinath,
Corina S. Păsăreanu
Abstract:
Neural networks are successfully used in a variety of applications, many of them having safety and security concerns. As a result researchers have proposed formal verification techniques for verifying neural network properties. While previous efforts have mainly focused on checking local robustness in neural networks, we instead study another neural network security issue, namely data poisoning. I…
▽ More
Neural networks are successfully used in a variety of applications, many of them having safety and security concerns. As a result researchers have proposed formal verification techniques for verifying neural network properties. While previous efforts have mainly focused on checking local robustness in neural networks, we instead study another neural network security issue, namely data poisoning. In this case an attacker inserts a trigger into a subset of the training data, in such a way that at test time, this trigger in an input causes the trained model to misclassify to some target class. We show how to formulate the check for data poisoning as a property that can be checked with off-the-shelf verification tools, such as Marabou and nneum, where counterexamples of failed checks constitute the triggers. We further show that the discovered triggers are `transferable' from a small model to a larger, better-trained model, allowing us to analyze state-of-the art performant models trained for image classification tasks.
△ Less
Submitted 8 May, 2022;
originally announced May 2022.
-
AntidoteRT: Run-time Detection and Correction of Poison Attacks on Neural Networks
Authors:
Muhammad Usman,
Youcheng Sun,
Divya Gopinath,
Corina S. Pasareanu
Abstract:
We study backdoor poisoning attacks against image classification networks, whereby an attacker inserts a trigger into a subset of the training data, in such a way that at test time, this trigger causes the classifier to predict some target class. %There are several techniques proposed in the literature that aim to detect the attack but only a few also propose to defend against it, and they typical…
▽ More
We study backdoor poisoning attacks against image classification networks, whereby an attacker inserts a trigger into a subset of the training data, in such a way that at test time, this trigger causes the classifier to predict some target class. %There are several techniques proposed in the literature that aim to detect the attack but only a few also propose to defend against it, and they typically involve retraining the network which is not always possible in practice. We propose lightweight automated detection and correction techniques against poisoning attacks, which are based on neuron patterns mined from the network using a small set of clean and poisoned test samples with known labels. The patterns built based on the mis-classified samples are used for run-time detection of new poisoned inputs. For correction, we propose an input correction technique that uses a differential analysis to identify the trigger in the detected poisoned images, which is then reset to a neutral color. Our detection and correction are performed at run-time and input level, which is in contrast to most existing work that is focused on offline model-level defenses. We demonstrate that our technique outperforms existing defenses such as NeuralCleanse and STRIP on popular benchmarks such as MNIST, CIFAR-10, and GTSRB against the popular BadNets attack and the more complex DFST attack.
△ Less
Submitted 31 January, 2022;
originally announced February 2022.
-
QuantifyML: How Good is my Machine Learning Model?
Authors:
Muhammad Usman,
Divya Gopinath,
Corina S. Păsăreanu
Abstract:
The efficacy of machine learning models is typically determined by computing their accuracy on test data sets. However, this may often be misleading, since the test data may not be representative of the problem that is being studied. With QuantifyML we aim to precisely quantify the extent to which machine learning models have learned and generalized from the given data. Given a trained model, Quan…
▽ More
The efficacy of machine learning models is typically determined by computing their accuracy on test data sets. However, this may often be misleading, since the test data may not be representative of the problem that is being studied. With QuantifyML we aim to precisely quantify the extent to which machine learning models have learned and generalized from the given data. Given a trained model, QuantifyML translates it into a C program and feeds it to the CBMC model checker to produce a formula in Conjunctive Normal Form (CNF). The formula is analyzed with off-the-shelf model counters to obtain precise counts with respect to different model behavior. QuantifyML enables i) evaluating learnability by comparing the counts for the outputs to ground truth, expressed as logical predicates, ii) comparing the performance of models built with different machine learning algorithms (decision-trees vs. neural networks), and iii) quantifying the safety and robustness of models.
△ Less
Submitted 24 October, 2021;
originally announced October 2021.
-
DeepCert: Verification of Contextually Relevant Robustness for Neural Network Image Classifiers
Authors:
Colin Paterson,
Haoze Wu,
John Grese,
Radu Calinescu,
Corina S. Pasareanu,
Clark Barrett
Abstract:
We introduce DeepCert, a tool-supported method for verifying the robustness of deep neural network (DNN) image classifiers to contextually relevant perturbations such as blur, haze, and changes in image contrast. While the robustness of DNN classifiers has been the subject of intense research in recent years, the solutions delivered by this research focus on verifying DNN robustness to small pertu…
▽ More
We introduce DeepCert, a tool-supported method for verifying the robustness of deep neural network (DNN) image classifiers to contextually relevant perturbations such as blur, haze, and changes in image contrast. While the robustness of DNN classifiers has been the subject of intense research in recent years, the solutions delivered by this research focus on verifying DNN robustness to small perturbations in the images being classified, with perturbation magnitude measured using established Lp norms. This is useful for identifying potential adversarial attacks on DNN image classifiers, but cannot verify DNN robustness to contextually relevant image perturbations, which are typically not small when expressed with Lp norms. DeepCert addresses this underexplored verification problem by supporting:(1) the encoding of real-world image perturbations; (2) the systematic evaluation of contextually relevant DNN robustness, using both testing and formal verification; (3) the generation of contextually relevant counterexamples; and, through these, (4) the selection of DNN image classifiers suitable for the operational context (i)envisaged when a potentially safety-critical system is designed, or (ii)observed by a deployed system. We demonstrate the effectiveness of DeepCert by showing how it can be used to verify the robustness of DNN image classifiers build for two benchmark datasets (`German Traffic Sign' and `CIFAR-10') to multiple contextually relevant perturbations.
△ Less
Submitted 2 March, 2021;
originally announced March 2021.
-
Property Inference for Deep Neural Networks
Authors:
Divya Gopinath,
Hayes Converse,
Corina S. Pasareanu,
Ankur Taly
Abstract:
We present techniques for automatically inferring formal properties of feed-forward neural networks. We observe that a significant part (if not all) of the logic of feed forward networks is captured in the activation status ('on' or 'off') of its neurons. We propose to extract patterns based on neuron decisions as preconditions that imply certain desirable output property e.g., the prediction bein…
▽ More
We present techniques for automatically inferring formal properties of feed-forward neural networks. We observe that a significant part (if not all) of the logic of feed forward networks is captured in the activation status ('on' or 'off') of its neurons. We propose to extract patterns based on neuron decisions as preconditions that imply certain desirable output property e.g., the prediction being a certain class. We present techniques to extract input properties, encoding convex predicates on the input space that imply given output properties and layer properties, representing network properties captured in the hidden layers that imply the desired output behavior. We apply our techniques on networks for the MNIST and ACASXU applications. Our experiments highlight the use of the inferred properties in a variety of tasks, such as explaining predictions, providing robustness guarantees, simplifying proofs, and network distillation.
△ Less
Submitted 10 September, 2020; v1 submitted 29 April, 2019;
originally announced April 2019.
-
DifFuzz: Differential Fuzzing for Side-Channel Analysis
Authors:
Shirin Nilizadeh,
Yannic Noller,
Corina S. Pasareanu
Abstract:
Side-channel attacks allow an adversary to uncover secret program data by observing the behavior of a program with respect to a resource, such as execution time, consumed memory or response size. Side-channel vulnerabilities are difficult to reason about as they involve analyzing the correlations between resource usage over multiple program paths. We present DifFuzz, a fuzzing-based approach for d…
▽ More
Side-channel attacks allow an adversary to uncover secret program data by observing the behavior of a program with respect to a resource, such as execution time, consumed memory or response size. Side-channel vulnerabilities are difficult to reason about as they involve analyzing the correlations between resource usage over multiple program paths. We present DifFuzz, a fuzzing-based approach for detecting side-channel vulnerabilities related to time and space. DifFuzz automatically detects these vulnerabilities by analyzing two versions of the program and using resource-guided heuristics to find inputs that maximize the difference in resource consumption between secret-dependent paths. The methodology of DifFuzz is general and can be applied to programs written in any language. For this paper, we present an implementation that targets analysis of Java programs, and uses and extends the Kelinci and AFL fuzzers. We evaluate DifFuzz on a large number of Java programs and demonstrate that it can reveal unknown side-channel vulnerabilities in popular applications. We also show that DifFuzz compares favorably against Blazer and Themis, two state-of-the-art analysis tools for finding side-channels in Java programs.
△ Less
Submitted 26 February, 2019; v1 submitted 16 November, 2018;
originally announced November 2018.
-
Compositional Verification for Autonomous Systems with Deep Learning Components
Authors:
Corina S. Pasareanu,
Divya Gopinath,
Huafeng Yu
Abstract:
As autonomy becomes prevalent in many applications, ranging from recommendation systems to fully autonomous vehicles, there is an increased need to provide safety guarantees for such systems. The problem is difficult, as these are large, complex systems which operate in uncertain environments, requiring data-driven machine-learning components. However, learning techniques such as Deep Neural Netwo…
▽ More
As autonomy becomes prevalent in many applications, ranging from recommendation systems to fully autonomous vehicles, there is an increased need to provide safety guarantees for such systems. The problem is difficult, as these are large, complex systems which operate in uncertain environments, requiring data-driven machine-learning components. However, learning techniques such as Deep Neural Networks, widely used today, are inherently unpredictable and lack the theoretical foundations to provide strong assurance guarantees. We present a compositional approach for the scalable, formal verification of autonomous systems that contain Deep Neural Network components. The approach uses assume-guarantee reasoning whereby {\em contracts}, encoding the input-output behavior of individual components, allow the designer to model and incorporate the behavior of the learning-enabled components working side-by-side with the other components. We illustrate the approach on an example taken from the autonomous vehicles domain.
△ Less
Submitted 18 October, 2018;
originally announced October 2018.
-
Symbolic Execution for Deep Neural Networks
Authors:
Divya Gopinath,
Kaiyuan Wang,
Mengshi Zhang,
Corina S. Pasareanu,
Sarfraz Khurshid
Abstract:
Deep Neural Networks (DNN) are increasingly used in a variety of applications, many of them with substantial safety and security concerns. This paper introduces DeepCheck, a new approach for validating DNNs based on core ideas from program analysis, specifically from symbolic execution. The idea is to translate a DNN into an imperative program, thereby enabling program analysis to assist with DNN…
▽ More
Deep Neural Networks (DNN) are increasingly used in a variety of applications, many of them with substantial safety and security concerns. This paper introduces DeepCheck, a new approach for validating DNNs based on core ideas from program analysis, specifically from symbolic execution. The idea is to translate a DNN into an imperative program, thereby enabling program analysis to assist with DNN validation. A basic translation however creates programs that are very complex to analyze. DeepCheck introduces novel techniques for lightweight symbolic analysis of DNNs and applies them in the context of image classification to address two challenging problems in DNN analysis: 1) identification of important pixels (for attribution and adversarial generation); and 2) creation of 1-pixel and 2-pixel attacks. Experimental results using the MNIST data-set show that DeepCheck's lightweight symbolic analysis provides a valuable tool for DNN validation.
△ Less
Submitted 27 July, 2018;
originally announced July 2018.
-
Badger: Complexity Analysis with Fuzzing and Symbolic Execution
Authors:
Yannic Noller,
Rody Kersten,
Corina S. Păsăreanu
Abstract:
Hybrid testing approaches that involve fuzz testing and symbolic execution have shown promising results in achieving high code coverage, uncovering subtle errors and vulnerabilities in a variety of software applications. In this paper we describe Badger - a new hybrid approach for complexity analysis, with the goal of discovering vulnerabilities which occur when the worst-case time or space comple…
▽ More
Hybrid testing approaches that involve fuzz testing and symbolic execution have shown promising results in achieving high code coverage, uncovering subtle errors and vulnerabilities in a variety of software applications. In this paper we describe Badger - a new hybrid approach for complexity analysis, with the goal of discovering vulnerabilities which occur when the worst-case time or space complexity of an application is significantly higher than the average case. Badger uses fuzz testing to generate a diverse set of inputs that aim to increase not only coverage but also a resource-related cost associated with each path. Since fuzzing may fail to execute deep program paths due to its limited knowledge about the conditions that influence these paths, we complement the analysis with a symbolic execution, which is also customized to search for paths that increase the resource-related cost. Symbolic execution is particularly good at generating inputs that satisfy various program conditions but by itself suffers from path explosion. Therefore, Badger uses fuzzing and symbolic execution in tandem, to leverage their benefits and overcome their weaknesses. We implemented our approach for the analysis of Java programs, based on Kelinci and Symbolic PathFinder. We evaluated Badger on Java applications, showing that our approach is significantly faster in generating worst-case executions compared to fuzzing or symbolic execution on their own.
△ Less
Submitted 8 June, 2018;
originally announced June 2018.
-
DeepSafe: A Data-driven Approach for Checking Adversarial Robustness in Neural Networks
Authors:
Divya Gopinath,
Guy Katz,
Corina S. Pasareanu,
Clark Barrett
Abstract:
Deep neural networks have become widely used, obtaining remarkable results in domains such as computer vision, speech recognition, natural language processing, audio recognition, social network filtering, machine translation, and bio-informatics, where they have produced results comparable to human experts. However, these networks can be easily fooled by adversarial perturbations: minimal changes…
▽ More
Deep neural networks have become widely used, obtaining remarkable results in domains such as computer vision, speech recognition, natural language processing, audio recognition, social network filtering, machine translation, and bio-informatics, where they have produced results comparable to human experts. However, these networks can be easily fooled by adversarial perturbations: minimal changes to correctly-classified inputs, that cause the network to mis-classify them. This phenomenon represents a concern for both safety and security, but it is currently unclear how to measure a network's robustness against such perturbations. Existing techniques are limited to checking robustness around a few individual input points, providing only very limited guarantees. We propose a novel approach for automatically identifying safe regions of the input space, within which the network is robust against adversarial perturbations. The approach is data-guided, relying on clustering to identify well-defined geometric regions as candidate safe regions. We then utilize verification techniques to confirm that these regions are safe or to provide counter-examples showing that they are not safe. We also introduce the notion of targeted robustness which, for a given target label and region, ensures that a NN does not map any input in the region to the target label. We evaluated our technique on the MNIST dataset and on a neural network implementation of a controller for the next-generation Airborne Collision Avoidance System for unmanned aircraft (ACAS Xu). For these networks, our approach identified multiple regions which were completely safe as well as some which were only safe for specific labels. It also discovered several adversarial perturbations of interest.
△ Less
Submitted 30 January, 2020; v1 submitted 2 October, 2017;
originally announced October 2017.
-
Abstraction and Learning for Infinite-State Compositional Verification
Authors:
Dimitra Giannakopoulou,
Corina S. Păsăreanu
Abstract:
Despite many advances that enable the application of model checking techniques to the verification of large systems, the state-explosion problem remains the main challenge for scalability. Compositional verification addresses this challenge by decomposing the verification of a large system into the verification of its components. Recent techniques use learning-based approaches to automate composit…
▽ More
Despite many advances that enable the application of model checking techniques to the verification of large systems, the state-explosion problem remains the main challenge for scalability. Compositional verification addresses this challenge by decomposing the verification of a large system into the verification of its components. Recent techniques use learning-based approaches to automate compositional verification based on the assume-guarantee style reasoning. However, these techniques are only applicable to finite-state systems. In this work, we propose a new framework that interleaves abstraction and learning to perform automated compositional verification of infinite-state systems. We also discuss the role of learning and abstraction in the related context of interface generation for infinite-state components.
△ Less
Submitted 19 September, 2013;
originally announced September 2013.
-
Learning Probabilistic Systems from Tree Samples
Authors:
Anvesh Komuravelli,
Corina S. Pasareanu,
Edmund M. Clarke
Abstract:
We consider the problem of learning a non-deterministic probabilistic system consistent with a given finite set of positive and negative tree samples. Consistency is defined with respect to strong simulation conformance. We propose learning algorithms that use traditional and a new "stochastic" state-space partitioning, the latter resulting in the minimum number of states. We then use them to solv…
▽ More
We consider the problem of learning a non-deterministic probabilistic system consistent with a given finite set of positive and negative tree samples. Consistency is defined with respect to strong simulation conformance. We propose learning algorithms that use traditional and a new "stochastic" state-space partitioning, the latter resulting in the minimum number of states. We then use them to solve the problem of "active learning", that uses a knowledgeable teacher to generate samples as counterexamples to simulation equivalence queries. We show that the problem is undecidable in general, but that it becomes decidable under a suitable condition on the teacher which comes naturally from the way samples are generated from failed simulation checks. The latter problem is shown to be undecidable if we impose an additional condition on the learner to always conjecture a "minimum state" hypothesis. We therefore propose a semi-algorithm using stochastic partitions. Finally, we apply the proposed (semi-) algorithms to infer intermediate assumptions in an automated assume-guarantee verification framework for probabilistic systems.
△ Less
Submitted 20 July, 2012;
originally announced July 2012.
-
Assume-Guarantee Abstraction Refinement for Probabilistic Systems
Authors:
Anvesh Komuravelli,
Corina S. Pasareanu,
Edmund M. Clarke
Abstract:
We describe an automated technique for assume-guarantee style checking of strong simulation between a system and a specification, both expressed as non-deterministic Labeled Probabilistic Transition Systems (LPTSes). We first characterize counterexamples to strong simulation as "stochastic" trees and show that simpler structures are insufficient. Then, we use these trees in an abstraction refineme…
▽ More
We describe an automated technique for assume-guarantee style checking of strong simulation between a system and a specification, both expressed as non-deterministic Labeled Probabilistic Transition Systems (LPTSes). We first characterize counterexamples to strong simulation as "stochastic" trees and show that simpler structures are insufficient. Then, we use these trees in an abstraction refinement algorithm that computes the assumptions for assume-guarantee reasoning as conservative LPTS abstractions of some of the system components. The abstractions are automatically refined based on tree counterexamples obtained from failed simulation checks with the remaining components. We have implemented the algorithms for counterexample generation and assume-guarantee abstraction refinement and report encouraging results.
△ Less
Submitted 20 July, 2012;
originally announced July 2012.
-
Predicate Abstraction with Under-approximation Refinement
Authors:
Corina S. Pasareanu,
Radek Pelanek,
Willem Visser
Abstract:
We propose an abstraction-based model checking method which relies on refinement of an under-approximation of the feasible behaviors of the system under analysis. The method preserves errors to safety properties, since all analyzed behaviors are feasible by definition. The method does not require an abstract transition relation to be generated, but instead executes the concrete transitions while…
▽ More
We propose an abstraction-based model checking method which relies on refinement of an under-approximation of the feasible behaviors of the system under analysis. The method preserves errors to safety properties, since all analyzed behaviors are feasible by definition. The method does not require an abstract transition relation to be generated, but instead executes the concrete transitions while storing abstract versions of the concrete states, as specified by a set of abstraction predicates. For each explored transition the method checks, with the help of a theorem prover, whether there is any loss of precision introduced by abstraction. The results of these checks are used to decide termination or to refine the abstraction by generating new abstraction predicates. If the (possibly infinite) concrete system under analysis has a finite bisimulation quotient, then the method is guaranteed to eventually explore an equivalent finite bisimilar structure. We illustrate the application of the approach for checking concurrent programs.
△ Less
Submitted 26 February, 2007; v1 submitted 22 January, 2007;
originally announced January 2007.