Search | arXiv e-print repository

Artificial intelligence in digital pathology: a diagnostic test accuracy systematic review and meta-analysis

Authors: Clare McGenity, Emily L Clarke, Charlotte Jennings, Gillian Matthews, Caroline Cartlidge, Henschel Freduah-Agyemang, Deborah D Stocken, Darren Treanor

Abstract: Ensuring diagnostic performance of AI models before clinical use is key to the safe and successful adoption of these technologies. Studies reporting AI applied to digital pathology images for diagnostic purposes have rapidly increased in number in recent years. The aim of this work is to provide an overview of the diagnostic accuracy of AI in digital pathology images from all areas of pathology. T… ▽ More Ensuring diagnostic performance of AI models before clinical use is key to the safe and successful adoption of these technologies. Studies reporting AI applied to digital pathology images for diagnostic purposes have rapidly increased in number in recent years. The aim of this work is to provide an overview of the diagnostic accuracy of AI in digital pathology images from all areas of pathology. This systematic review and meta-analysis included diagnostic accuracy studies using any type of artificial intelligence applied to whole slide images (WSIs) in any disease type. The reference standard was diagnosis through histopathological assessment and / or immunohistochemistry. Searches were conducted in PubMed, EMBASE and CENTRAL in June 2022. We identified 2976 studies, of which 100 were included in the review and 48 in the full meta-analysis. Risk of bias and concerns of applicability were assessed using the QUADAS-2 tool. Data extraction was conducted by two investigators and meta-analysis was performed using a bivariate random effects model. 100 studies were identified for inclusion, equating to over 152,000 whole slide images (WSIs) and representing many disease types. Of these, 48 studies were included in the meta-analysis. These studies reported a mean sensitivity of 96.3% (CI 94.1-97.7) and mean specificity of 93.3% (CI 90.5-95.4) for AI. There was substantial heterogeneity in study design and all 100 studies identified for inclusion had at least one area at high or unclear risk of bias. This review provides a broad overview of AI performance across applications in whole slide imaging. However, there is huge variability in study design and available performance data, with details around the conduct of the study and make up of the datasets frequently missing. Overall, AI offers good accuracy when applied to WSIs but requires more rigorous evaluation of its performance. △ Less

Submitted 19 June, 2023; v1 submitted 12 June, 2023; originally announced June 2023.

Comments: 26 pages, 5 figures, 8 tables + Supplementary materials

ACM Class: I.2.1

arXiv:2204.12934 [pdf, other]

An Iterative Labeling Method for Annotating Fisheries Imagery

Authors: Zhiyong Zhang, Pushyami Kaveti, Hanumant Singh, Abigail Powell, Erica Fruh, M. Elizabeth Clarke

Abstract: In this paper, we present a methodology for fisheries-related data that allows us to converge on a labeled image dataset by iterating over the dataset with multiple training and production loops that can exploit crowdsourcing interfaces. We present our algorithm and its results on two separate sets of image data collected using the Seabed autonomous underwater vehicle. The first dataset comprises… ▽ More In this paper, we present a methodology for fisheries-related data that allows us to converge on a labeled image dataset by iterating over the dataset with multiple training and production loops that can exploit crowdsourcing interfaces. We present our algorithm and its results on two separate sets of image data collected using the Seabed autonomous underwater vehicle. The first dataset comprises of 2,026 completely unlabeled images, while the second consists of 21,968 images that were point annotated by experts. Our results indicate that training with a small subset and iterating on that to build a larger set of labeled data allows us to converge to a fully annotated dataset with a small number of iterations. Even in the case of a dataset labeled by experts, a single iteration of the methodology improves the labels by discovering additional complicated examples of labels associated with fish that overlap, are very small, or obscured by the contrast limitations associated with underwater imagery. △ Less

Submitted 8 June, 2022; v1 submitted 27 April, 2022; originally announced April 2022.

arXiv:2202.11524 [pdf, other]

Weakly-supervised learning for image-based classification of primary melanomas into genomic immune subgroups

Authors: Lucy Godson, Navid Alemi, Jeremie Nsengimana, Graham P. Cook, Emily L. Clarke, Darren Treanor, D. Timothy Bishop, Julia Newton-Bishop, Ali Gooya

Abstract: Determining early-stage prognostic markers and stratifying patients for effective treatment are two key challenges for improving outcomes for melanoma patients. Previous studies have used tumour transcriptome data to stratify patients into immune subgroups, which were associated with differential melanoma specific survival and potential treatment strategies. However, acquiring transcriptome data i… ▽ More Determining early-stage prognostic markers and stratifying patients for effective treatment are two key challenges for improving outcomes for melanoma patients. Previous studies have used tumour transcriptome data to stratify patients into immune subgroups, which were associated with differential melanoma specific survival and potential treatment strategies. However, acquiring transcriptome data is a time-consuming and costly process. Moreover, it is not routinely used in the current clinical workflow. Here we attempt to overcome this by develo** deep learning models to classify gigapixel H&E stained pathology slides, which are well established in clinical workflows, into these immune subgroups. Previous subty** approaches have employed supervised learning which requires fully annotated data, or have only examined single genetic mutations in melanoma patients. We leverage a multiple-instance learning approach, which only requires slide-level labels and uses an attention mechanism to highlight regions of high importance to the classification. Moreover, we show that pathology-specific self-supervised models generate better representations compared to pathology-agnostic models for improving our model performance, achieving a mean AUC of 0.76 for classifying histopathology images as high or low immune subgroups. We anticipate that this method may allow us to find new biomarkers of high importance and could act as a tool for clinicians to infer the immune landscape of tumours and stratify patients, without needing to carry out additional expensive genetic tests. △ Less

Submitted 23 February, 2022; originally announced February 2022.

Comments: 8 pages (without appendices), 2 figures, submitted to MIDL 2022 conference proceedings

arXiv:2107.07368 [pdf, other]

A note on hyperopic cops and robber

Authors: Nancy E. Clarke, Stephen Finbow, Margaret-Ellen Messinger, Amanda Porter

Abstract: We explore a variant of the game of Cops and Robber introduced by Bonato et al.~where the robber is invisible unless outside the common neighbourhood of the cops. The hyperopic cop number is analogous to the cop number and we investigate bounds on this quantity. We define a small common neighbourhood set and relate the minimum cardinality of this graph parameter to the hyperopic cop number. We con… ▽ More We explore a variant of the game of Cops and Robber introduced by Bonato et al.~where the robber is invisible unless outside the common neighbourhood of the cops. The hyperopic cop number is analogous to the cop number and we investigate bounds on this quantity. We define a small common neighbourhood set and relate the minimum cardinality of this graph parameter to the hyperopic cop number. We consider diameter 2 graphs, particularly the join of two graphs, as well as Cartesian products. △ Less

Submitted 15 July, 2021; originally announced July 2021.

MSC Class: 05C57

arXiv:1910.14200 [pdf, ps, other]

doi 10.1016/j.dam.2020.06.019

Cops that surround a robber

Authors: Andrea C. Burgess, Rosalind A. Cameron, Nancy E. Clarke, Peter Danziger, Stephen Finbow, Caleb W. Jones, David A. Pike

Abstract: We introduce the game of Surrounding Cops and Robbers on a graph, as a variant of the original game of Cops and Robbers. In contrast to the original game in which the cops win by occupying the same vertex as the robber, they now win by occupying each of the robber's neighbouring vertices. We denote by $σ(G)$ the {\em surrounding cop number} of $G$, namely the least number of cops required to surro… ▽ More We introduce the game of Surrounding Cops and Robbers on a graph, as a variant of the original game of Cops and Robbers. In contrast to the original game in which the cops win by occupying the same vertex as the robber, they now win by occupying each of the robber's neighbouring vertices. We denote by $σ(G)$ the {\em surrounding cop number} of $G$, namely the least number of cops required to surround a robber in the graph $G$. We present a number of results regarding this parameter, including general bounds as well as exact values for several classes of graphs. Particular classes of interest include product graphs, graphs arising from combinatorial designs, and generalised Petersen graphs. △ Less

Submitted 5 August, 2021; v1 submitted 30 October, 2019; originally announced October 2019.

arXiv:1710.10112 [pdf, ps, other]

Hyperopic Cops and Robbers

Authors: A. Bonato, N. E. Clarke, D. Cox, S. Finbow, F. Mc Inerney, M. E. Messinger

Abstract: We introduce a new variant of the game of Cops and Robbers played on graphs, where the robber is invisible unless outside the neighbor set of a cop. The hyperopic cop number is the corresponding analogue of the cop number, and we investigate bounds and other properties of this parameter. We characterize the cop-win graphs for this variant, along with graphs with the largest possible hyperopic cop… ▽ More We introduce a new variant of the game of Cops and Robbers played on graphs, where the robber is invisible unless outside the neighbor set of a cop. The hyperopic cop number is the corresponding analogue of the cop number, and we investigate bounds and other properties of this parameter. We characterize the cop-win graphs for this variant, along with graphs with the largest possible hyperopic cop number. We analyze the cases of graphs with diameter 2 or at least 3, focusing on when the hyperopic cop number is at most one greater than the cop number. We show that for planar graphs, as with the usual cop number, the hyperopic cop number is at most 3. The hyperopic cop number is considered for countable graphs, and it is shown that for connected chains of graphs, the hyperopic cop density can be any real number in $[0,1/2].$ △ Less

Submitted 27 October, 2017; originally announced October 2017.

arXiv:1708.07179 [pdf, other]

Limited Visibility Cops and Robbers

Authors: N. E. Clarke, D. Cox, C. Duffy, D. Dyer, S. Fitzpatrick, M. E. Messinger

Abstract: We consider a variation of the Cops and Robber game where the cops can only see the robber when the distance between them is at most a fixed parameter $\ell$. We consider the basic consequences of this definition for some simple graph families, and show that this model is not monotonic, unlike common models where the robber is invisible. We see that cops' strategy consists of a phase in which they… ▽ More We consider a variation of the Cops and Robber game where the cops can only see the robber when the distance between them is at most a fixed parameter $\ell$. We consider the basic consequences of this definition for some simple graph families, and show that this model is not monotonic, unlike common models where the robber is invisible. We see that cops' strategy consists of a phase in which they need to "see" the robber (move within distance $\ell$ of the robber), followed by a phase in which they capture the robber. In some graphs the first phase is the most resource intensive phase (in terms of number of cops needed), while in other graphs, it is the second phase. Finally, we characterize those trees for which $k$ cops are sufficient to guarantee capture of the robber for all $\ell \ge 1$. △ Less

Submitted 23 August, 2017; originally announced August 2017.

Comments: 22 pages

MSC Class: 49N75

arXiv:1511.07163 [pdf, other]

Optimizing Solution Quality in Synchronization Synthesis

Authors: Pavol Černý, Edmund M. Clarke, Thomas A. Henzinger, Arjun Radhakrishna, Leonid Ryzhyk, Roopsha Samanta, Thorsten Tarrach

Abstract: Given a multithreaded program written assuming a friendly, non-preemptive scheduler, the goal of synchronization synthesis is to automatically insert synchronization primitives to ensure that the modified program behaves correctly, even with a preemptive scheduler. In this work, we focus on the quality of the synthesized solution: we aim to infer synchronization placements that not only ensure cor… ▽ More Given a multithreaded program written assuming a friendly, non-preemptive scheduler, the goal of synchronization synthesis is to automatically insert synchronization primitives to ensure that the modified program behaves correctly, even with a preemptive scheduler. In this work, we focus on the quality of the synthesized solution: we aim to infer synchronization placements that not only ensure correctness, but also meet some quantitative objectives such as optimal program performance on a given computing platform. The key step that enables solution optimization is the construction of a set of global constraints over synchronization placements such that each model of the constraints set corresponds to a correctness-ensuring synchronization placement. We extract the global constraints from generalizations of counterexample traces and the control-flow graph of the program. The global constraints enable us to choose from among the encoded synchronization solutions using an objective function. We consider two types of objective functions: ones that are solely dependent on the program (e.g., minimizing the size of critical sections) and ones that are also dependent on the computing platform. For the latter, given a program and a computing platform, we construct a performance model based on measuring average contention for critical sections and the average time taken to acquire and release a lock under a given average contention. We empirically evaluated that our approach scales to typical module sizes of many real world concurrent programs such as device drivers and multithreaded servers, and that the performance predictions match reality. To the best of our knowledge, this is the first comprehensive approach for optimizing the placement of synthesized synchronization. △ Less

Submitted 23 November, 2015; originally announced November 2015.

arXiv:1509.07979 [pdf, other]

Anomaly Detection in Unstructured Environments using Bayesian Nonparametric Scene Modeling

Authors: Yogesh Girdhar, Walter Cho, Matthew Campbell, Jesus Pineda, Elizabeth Clarke, Hanumant Singh

Abstract: This paper explores the use of a Bayesian non-parametric topic modeling technique for the purpose of anomaly detection in video data. We present results from two experiments. The first experiment shows that the proposed technique is automatically able characterize the underlying terrain, and detect anomalous flora in image data collected by an underwater robot. The second experiment shows that the… ▽ More This paper explores the use of a Bayesian non-parametric topic modeling technique for the purpose of anomaly detection in video data. We present results from two experiments. The first experiment shows that the proposed technique is automatically able characterize the underlying terrain, and detect anomalous flora in image data collected by an underwater robot. The second experiment shows that the same technique can be used on images from a static camera in a dynamic unstructured environment. In the second dataset, consisting of video data from a static seafloor camera capturing images of a busy coral reef, the proposed technique was able to detect all three instances of an underwater vehicle passing in front of the camera, amongst many other observations of fishes, debris, lighting changes due to surface waves, and benthic flora. △ Less

Submitted 15 February, 2016; v1 submitted 26 September, 2015; originally announced September 2015.

Comments: 6 pages, ICRA 2016

arXiv:1505.04533 [pdf, ps, other]

From Non-preemptive to Preemptive Scheduling using Synchronization Synthesis

Authors: Pavol Černý, Edmund M. Clarke, Thomas A. Henzinger, Arjun Radhakrishna, Leonid Ryzhyk, Roopsha Samanta, Thorsten Tarrach

Abstract: We present a computer-aided programming approach to concurrency. The approach allows programmers to program assuming a friendly, non-preemptive scheduler, and our synthesis procedure inserts synchronization to ensure that the final program works even with a preemptive scheduler. The correctness specification is implicit, inferred from the non-preemptive behavior. Let us consider sequences of calls… ▽ More We present a computer-aided programming approach to concurrency. The approach allows programmers to program assuming a friendly, non-preemptive scheduler, and our synthesis procedure inserts synchronization to ensure that the final program works even with a preemptive scheduler. The correctness specification is implicit, inferred from the non-preemptive behavior. Let us consider sequences of calls that the program makes to an external interface. The specification requires that any such sequence produced under a preemptive scheduler should be included in the set of such sequences produced under a non-preemptive scheduler. The solution is based on a finitary abstraction, an algorithm for bounded language inclusion modulo an independence relation, and rules for inserting synchronization. We apply the approach to device-driver programming, where the driver threads call the software interface of the device and the API provided by the operating system. Our experiments demonstrate that our synthesis method is precise and efficient, and, since it does not require explicit specifications, is more practical than the conventional approach based on user-provided assertions. △ Less

Submitted 18 May, 2015; originally announced May 2015.

Comments: Liss is published as open-source at https://github.com/thorstent/Liss, Computer Aided Verification 2015

arXiv:1410.7346 [pdf, other]

doi 10.1145/2728606.2728634

Towards Personalized Prostate Cancer Therapy Using Delta-Reachability Analysis

Authors: Bing Liu, Soonho Kong, Sicun Gao, Paolo Zuliani, Edmund M. Clarke

Abstract: Recent clinical studies suggest that the efficacy of hormone therapy for prostate cancer depends on the characteristics of individual patients. In this paper, we develop a computational framework for identifying patient-specific androgen ablation therapy schedules for postponing the potential cancer relapse. We model the population dynamics of heterogeneous prostate cancer cells in response to and… ▽ More Recent clinical studies suggest that the efficacy of hormone therapy for prostate cancer depends on the characteristics of individual patients. In this paper, we develop a computational framework for identifying patient-specific androgen ablation therapy schedules for postponing the potential cancer relapse. We model the population dynamics of heterogeneous prostate cancer cells in response to androgen suppression as a nonlinear hybrid automaton. We estimate personalized kinetic parameters to characterize patients and employ $δ$-reachability analysis to predict patient-specific therapeutic strategies. The results show that our methods are promising and may lead to a prognostic tool for personalized cancer therapy. △ Less

Submitted 19 May, 2015; v1 submitted 27 October, 2014; originally announced October 2014.

Comments: HSCC 2015

arXiv:1409.6414 [pdf, ps, other]

Proof Generation from Delta-Decisions

Authors: Sicun Gao, Soonho Kong, Edmund Clarke

Abstract: We show how to generate and validate logical proofs of unsatisfiability from delta-complete decision procedures that rely on error-prone numerical algorithms. Solving this problem is important for ensuring correctness of the decision procedures. At the same time, it is a new approach for automated theorem proving over real numbers. We design a first-order calculus, and transform the computational… ▽ More We show how to generate and validate logical proofs of unsatisfiability from delta-complete decision procedures that rely on error-prone numerical algorithms. Solving this problem is important for ensuring correctness of the decision procedures. At the same time, it is a new approach for automated theorem proving over real numbers. We design a first-order calculus, and transform the computational steps of constraint solving into logic proofs, which are then validated using proof-checking algorithms. As an application, we demonstrate how proofs generated from our solver can establish many nonlinear lemmas in the the formal proof of the Kepler Conjecture. △ Less

Submitted 23 September, 2014; originally announced September 2014.

Comments: Appeared in SYNASC'14

arXiv:1407.1524 [pdf, other]

Parameter Synthesis for Cardiac Cell Hybrid Models Using Delta-Decisions

Authors: Bing Liu, Soonho Kong, Sicun Gao, Paolo Zuliani, Edmund M. Clarke

Abstract: A central problem in systems biology is to identify parameter values such that a biological model satisfies some behavioral constraints (\eg, time series). In this paper we focus on parameter synthesis for hybrid (continuous/discrete) models, as many biological systems can possess multiple operational modes with specific continuous dynamics in each mode. These biological systems are naturally mode… ▽ More A central problem in systems biology is to identify parameter values such that a biological model satisfies some behavioral constraints (\eg, time series). In this paper we focus on parameter synthesis for hybrid (continuous/discrete) models, as many biological systems can possess multiple operational modes with specific continuous dynamics in each mode. These biological systems are naturally modeled as hybrid automata, most often with nonlinear continuous dynamics. However, hybrid automata are notoriously hard to analyze --- even simple reachability for hybrid systems with linear differential dynamics is an undecidable problem. In this paper we present a parameter synthesis framework based on $δ$-complete decision procedures that sidesteps undecidability. We demonstrate our method on two highly nonlinear hybrid models of the cardiac cell action potential. The results show that our parameter synthesis framework is convenient and efficient, and it enabled us to select a suitable model to study and identify crucial parameter ranges related to cardiac disorders. △ Less

Submitted 10 September, 2014; v1 submitted 6 July, 2014; originally announced July 2014.

Comments: 12th Conference on Computational Methods in Systems Biology (CMSB 2014)

arXiv:1404.7206 [pdf, other]

SReach: A Bounded Model Checker for Stochastic Hybrid Systems

Authors: Qinsi Wang, Paolo Zuliani, Soonho Kong, Sicun Gao, Edmund M. Clarke

Abstract: In this paper we describe a new tool, SReach, which solves probabilistic bounded reachability problems for two classes of stochastic hybrid systems. The first one is (nonlinear) hybrid automata with parametric uncertainty. The second one is probabilistic hybrid automata with additional randomness for both transition probabilities and variable resets. Standard approaches to reachability problems fo… ▽ More In this paper we describe a new tool, SReach, which solves probabilistic bounded reachability problems for two classes of stochastic hybrid systems. The first one is (nonlinear) hybrid automata with parametric uncertainty. The second one is probabilistic hybrid automata with additional randomness for both transition probabilities and variable resets. Standard approaches to reachability problems for linear hybrid systems require numerical solutions for large optimization problems, and become infeasible for systems involving both nonlinear dynamics over the reals and stochasticity. Our approach encodes stochastic information by using random variables, and combines the randomized sampling, a $δ$-complete decision procedure, and statistical tests. SReach utilizes the $δ$-complete decision procedure to solve reachability problems in a sound manner, i.e., it always decides correctly if, for a given assignment to all random variables, the system actually reaches the unsafe region. The statistical tests adapted guarantee arbitrary small error bounds between probabilities estimated by SReach and real ones. Compared to standard simulation-based methods, our approach supports non-deterministic branching, increases the coverage of simulation, and avoids the zero-crossing problem. We demonstrate our method's feasibility by applying SReach to three representative biological models and to additional benchmarks for nonlinear hybrid systems with multiple probabilistic system parameters. △ Less

Submitted 27 October, 2014; v1 submitted 28 April, 2014; originally announced April 2014.

arXiv:1404.7171 [pdf, other]

Delta-Complete Analysis for Bounded Reachability of Hybrid Systems

Authors: Sicun Gao, Soonho Kong, Wei Chen, Edmund Clarke

Abstract: We present the framework of delta-complete analysis for bounded reachability problems of general hybrid systems. We perform bounded reachability checking through solving delta-decision problems over the reals. The techniques take into account of robustness properties of the systems under numerical perturbations. We prove that the verification problems become much more mathematically tractable in t… ▽ More We present the framework of delta-complete analysis for bounded reachability problems of general hybrid systems. We perform bounded reachability checking through solving delta-decision problems over the reals. The techniques take into account of robustness properties of the systems under numerical perturbations. We prove that the verification problems become much more mathematically tractable in this new framework. Our implementation of the techniques, an open-source tool dReach, scales well on several highly nonlinear hybrid system models that arise in biomedical and robotics applications. △ Less

Submitted 28 April, 2014; originally announced April 2014.

arXiv:1404.7169 [pdf, ps, other]

Revisiting the Complexity of Stability of Continuous and Hybrid Systems

Authors: Sicun Gao, Soonho Kong, Edmund Clarke

Abstract: We develop a framework to give upper bounds on the "practical" computational complexity of stability problems for a wide range of nonlinear continuous and hybrid systems. To do so, we describe stability properties of dynamical systems using first-order formulas over the real numbers, and reduce stability problems to the delta-decision problems of these formulas. The framework allows us to obtain a… ▽ More We develop a framework to give upper bounds on the "practical" computational complexity of stability problems for a wide range of nonlinear continuous and hybrid systems. To do so, we describe stability properties of dynamical systems using first-order formulas over the real numbers, and reduce stability problems to the delta-decision problems of these formulas. The framework allows us to obtain a precise characterization of the complexity of different notions of stability for nonlinear continuous and hybrid systems. We prove that bounded versions of the stability problems are generally decidable, and give upper bounds on their complexity. The unbounded versions are generally undecidable, for which we give upper bounds on their degrees of unsolvability. △ Less

Submitted 4 June, 2014; v1 submitted 28 April, 2014; originally announced April 2014.

arXiv:1310.8278 [pdf, other]

Satisfiability Modulo ODEs

Authors: Sicun Gao, Soonho Kong, Edmund Clarke

Abstract: We study SMT problems over the reals containing ordinary differential equations. They are important for formal verification of realistic hybrid systems and embedded software. We develop delta-complete algorithms for SMT formulas that are purely existentially quantified, as well as exists-forall formulas whose universal quantification is restricted to the time variables. We demonstrate scalability… ▽ More We study SMT problems over the reals containing ordinary differential equations. They are important for formal verification of realistic hybrid systems and embedded software. We develop delta-complete algorithms for SMT formulas that are purely existentially quantified, as well as exists-forall formulas whose universal quantification is restricted to the time variables. We demonstrate scalability of the algorithms, as implemented in our open-source solver dReal, on SMT benchmarks with several hundred nonlinear ODEs and variables. △ Less

Submitted 30 October, 2013; originally announced October 2013.

Comments: Published in FMCAD 2013

arXiv:1308.2839 [pdf, ps, other]

A note on bounds for the cop number using tree decompositions

Authors: Anthony Bonato, N. E. Clarke, S. Finbow, S. Fitzpatrick, M. E. Messinger

Abstract: In this short note, we supply a new upper bound on the cop number in terms of tree decompositions. Our results in some cases extend a previously derived bound on the cop number using treewidth. In this short note, we supply a new upper bound on the cop number in terms of tree decompositions. Our results in some cases extend a previously derived bound on the cop number using treewidth. △ Less

Submitted 13 August, 2013; originally announced August 2013.

arXiv:1306.1945 [pdf, ps, other]

Automatic Abstraction in SMT-Based Unbounded Software Model Checking

Authors: Anvesh Komuravelli, Arie Gurfinkel, Sagar Chaki, Edmund M. Clarke

Abstract: Software model checkers based on under-approximations and SMT solvers are very successful at verifying safety (i.e. reachability) properties. They combine two key ideas -- (a) "concreteness": a counterexample in an under-approximation is a counterexample in the original program as well, and (b) "generalization": a proof of safety of an under-approximation, produced by an SMT solver, are generaliza… ▽ More Software model checkers based on under-approximations and SMT solvers are very successful at verifying safety (i.e. reachability) properties. They combine two key ideas -- (a) "concreteness": a counterexample in an under-approximation is a counterexample in the original program as well, and (b) "generalization": a proof of safety of an under-approximation, produced by an SMT solver, are generalizable to proofs of safety of the original program. In this paper, we present a combination of "automatic abstraction" with the under-approximation-driven framework. We explore two iterative approaches for obtaining and refining abstractions -- "proof based" and "counterexample based" -- and show how they can be combined into a unified algorithm. To the best of our knowledge, this is the first application of Proof-Based Abstraction, primarily used to verify hardware, to Software Verification. We have implemented a prototype of the framework using Z3, and evaluate it on many benchmarks from the Software Verification Competition. We show experimentally that our combination is quite effective on hard instances. △ Less

Submitted 8 June, 2013; originally announced June 2013.

Comments: Extended version of a paper in the proceedings of CAV 2013

arXiv:1207.5091 [pdf, ps, other]

doi 10.1109/LICS.2012.54

Learning Probabilistic Systems from Tree Samples

Authors: Anvesh Komuravelli, Corina S. Pasareanu, Edmund M. Clarke

Abstract: We consider the problem of learning a non-deterministic probabilistic system consistent with a given finite set of positive and negative tree samples. Consistency is defined with respect to strong simulation conformance. We propose learning algorithms that use traditional and a new "stochastic" state-space partitioning, the latter resulting in the minimum number of states. We then use them to solv… ▽ More We consider the problem of learning a non-deterministic probabilistic system consistent with a given finite set of positive and negative tree samples. Consistency is defined with respect to strong simulation conformance. We propose learning algorithms that use traditional and a new "stochastic" state-space partitioning, the latter resulting in the minimum number of states. We then use them to solve the problem of "active learning", that uses a knowledgeable teacher to generate samples as counterexamples to simulation equivalence queries. We show that the problem is undecidable in general, but that it becomes decidable under a suitable condition on the teacher which comes naturally from the way samples are generated from failed simulation checks. The latter problem is shown to be undecidable if we impose an additional condition on the learner to always conjecture a "minimum state" hypothesis. We therefore propose a semi-algorithm using stochastic partitions. Finally, we apply the proposed (semi-) algorithms to infer intermediate assumptions in an automated assume-guarantee verification framework for probabilistic systems. △ Less

Submitted 20 July, 2012; originally announced July 2012.

Comments: 14 pages, conference paper with full proofs

Journal ref: LICS, pp. 441-450, IEEE, 2012

arXiv:1207.5086 [pdf, ps, other]

doi 10.1007/978-3-642-31424-7_25

Assume-Guarantee Abstraction Refinement for Probabilistic Systems

Authors: Anvesh Komuravelli, Corina S. Pasareanu, Edmund M. Clarke

Abstract: We describe an automated technique for assume-guarantee style checking of strong simulation between a system and a specification, both expressed as non-deterministic Labeled Probabilistic Transition Systems (LPTSes). We first characterize counterexamples to strong simulation as "stochastic" trees and show that simpler structures are insufficient. Then, we use these trees in an abstraction refineme… ▽ More We describe an automated technique for assume-guarantee style checking of strong simulation between a system and a specification, both expressed as non-deterministic Labeled Probabilistic Transition Systems (LPTSes). We first characterize counterexamples to strong simulation as "stochastic" trees and show that simpler structures are insufficient. Then, we use these trees in an abstraction refinement algorithm that computes the assumptions for assume-guarantee reasoning as conservative LPTS abstractions of some of the system components. The abstractions are automatically refined based on tree counterexamples obtained from failed simulation checks with the remaining components. We have implemented the algorithms for counterexample generation and assume-guarantee abstraction refinement and report encouraging results. △ Less

Submitted 20 July, 2012; originally announced July 2012.

Comments: 23 pages, conference paper with full proofs

Journal ref: CAV, vol. 7358 of LNCS, pp. 310-326. Springer-Verlag. 2012

arXiv:1204.6671 [pdf, ps, other]

Delta-Decidability over the Reals

Authors: Sicun Gao, Jeremy Avigad, Edmund Clarke

Abstract: Given any collection F of computable functions over the reals, we show that there exists an algorithm that, given any L_F-sentence \varphi containing only bounded quantifiers, and any positive rational number δ, decides either "\varphi is true", or "a δ-strengthening of \varphi is false". Under mild assumptions, for a C-computable signature F, the δ-decision problem for bounded Σ_k-sentences in L_… ▽ More Given any collection F of computable functions over the reals, we show that there exists an algorithm that, given any L_F-sentence \varphi containing only bounded quantifiers, and any positive rational number δ, decides either "\varphi is true", or "a δ-strengthening of \varphi is false". Under mild assumptions, for a C-computable signature F, the δ-decision problem for bounded Σ_k-sentences in L_F resides in (Σ_k^P)^C. The results stand in sharp contrast to the well-known undecidability results, and serve as a theoretical basis for the use of numerical methods in decision procedures for nonlinear first-order theories over the reals. △ Less

Submitted 30 April, 2012; originally announced April 2012.

Comments: A short version appears in LICS 2012

arXiv:1204.3513 [pdf, ps, other]

Delta-Complete Decision Procedures for Satisfiability over the Reals

Authors: Sicun Gao, Jeremy Avigad, Edmund Clarke

Abstract: We introduce the notion of "δ-complete decision procedures" for solving SMT problems over the real numbers, with the aim of handling a wide range of nonlinear functions including transcendental functions and solutions of Lipschitz-continuous ODEs. Given an SMT problem \varphi and a positive rational number δ, a δ-complete decision procedure determines either that \varphi is unsatisfiable, or that… ▽ More We introduce the notion of "δ-complete decision procedures" for solving SMT problems over the real numbers, with the aim of handling a wide range of nonlinear functions including transcendental functions and solutions of Lipschitz-continuous ODEs. Given an SMT problem \varphi and a positive rational number δ, a δ-complete decision procedure determines either that \varphi is unsatisfiable, or that the "δ-weakening" of \varphi is satisfiable. Here, the δ-weakening of \varphi is a variant of \varphi that allows δ-bounded numerical perturbations on \varphi. We prove the existence of δ-complete decision procedures for bounded SMT over reals with functions mentioned above. For functions in Type 2 complexity class C, under mild assumptions, the bounded δ-SMT problem is in NP^C. δ-Complete decision procedures can exploit scalable numerical methods for handling nonlinearity, and we propose to use this notion as an ideal requirement for numerically-driven decision procedures. As a concrete example, we formally analyze the DPLL<ICP> framework, which integrates Interval Constraint Propagation (ICP) in DPLL(T), and establish necessary and sufficient conditions for its δ-completeness. We discuss practical applications of δ-complete decision procedures for correctness-critical applications including formal verification and theorem proving. △ Less

Submitted 17 September, 2012; v1 submitted 16 April, 2012; originally announced April 2012.

Comments: A shorter version appears in IJCAR 2012

arXiv:1104.0746 [pdf, other]

doi 10.1007/978-3-642-21493-6_9

Quantifier Elimination over Finite Fields Using Gröbner Bases

Authors: Sicun Gao, André Platzer, Edmund M. Clarke

Abstract: We give an algebraic quantifier elimination algorithm for the first-order theory over any given finite field using Gröbner basis methods. The algorithm relies on the strong Nullstellensatz and properties of elimination ideals over finite fields. We analyze the theoretical complexity of the algorithm and show its application in the formal analysis of a biological controller model. We give an algebraic quantifier elimination algorithm for the first-order theory over any given finite field using Gröbner basis methods. The algorithm relies on the strong Nullstellensatz and properties of elimination ideals over finite fields. We analyze the theoretical complexity of the algorithm and show its application in the formal analysis of a biological controller model. △ Less

Submitted 5 April, 2011; originally announced April 2011.

Comments: A shorter version is to appear in International Conference on Algebraic Informatics 2011

Showing 1–24 of 24 results for author: Clarke, E