
3D Environment Modeling for Falsification and Beyond with Scenic 3.0

Authors: Eric Vin, Shun Kashiwa, Matthew Rhea, Daniel J. Fremont, Edward Kim, Tommaso Dreossi, Shromona Ghosh, Xiangyu Yue, Alberto L. Sangiovanni-Vincentelli, Sanjit A. Seshia

Abstract: We present a major new version of Scenic, a probabilistic programming language for writing formal models of the environments of cyber-physical systems. Scenic has been successfully used for the design and analysis of CPS in a variety of domains, but earlier versions are limited to environments which are essentially two-dimensional. In this paper, we extend Scenic with native support for 3D geometry, introducing new syntax which provides expressive ways to describe 3D configurations while preserving the simplicity and readability of the language. We replace Scenic's simplistic representation of objects as boxes with precise modeling of complex shapes, including a ray tracing-based visibility system that accounts for object occlusion. We also extend the language to support arbitrary temporal requirements expressed in LTL, and build an extensible Scenic parser generated from a formal grammar of the language. Finally, we illustrate the new application domains these features enable with case studies that would have been impossible to accurately model in Scenic 2.

Submitted 6 July, 2023; originally announced July 2023.

Comments: 13 pages, 6 figures. Full version of a CAV 2023 tool paper, to appear in the Springer Lecture Notes in Computer Science series

arXiv:2206.02775 [pdf, other]

Randomized Synthesis for Diversity and Cost Constraints with Control Improvisation

Authors: Andreas Gittis, Eric Vin, Daniel J. Fremont

Abstract: In many synthesis problems, it can be essential to generate implementations which not only satisfy functional constraints but are also randomized to improve variety, robustness, or unpredictability. The recently-proposed framework of control improvisation (CI) provides techniques for the correct-by-construction synthesis of randomized systems subject to hard and soft constraints. However, prior work on CI has focused on qualitative specifications, whereas in robotic planning and other areas we often have quantitative quality metrics which can be traded against each other. For example, a designer of a patrolling security robot might want to know by how much the average patrol time needs to be increased in order to ensure that a particular aspect of the robot's route is sufficiently diverse and hence unpredictable. In this paper, we enable this type of application by generalizing the CI problem to support quantitative soft constraints which bound the expected value of a given cost function, and randomness constraints which enforce diversity of the generated traces with respect to a given label function. We establish the basic theory of labelled quantitative CI problems, and develop efficient algorithms for solving them when the specifications are encoded by finite automata. We also provide an approximate improvisation algorithm based on constraint solving for any specifications encodable as Boolean formulas. We demonstrate the utility of our problem formulation and algorithms with experiments applying them to generate diverse near-optimal plans for robotic planning problems.

Submitted 6 June, 2022; originally announced June 2022.

Comments: 35 pages, 3 figures. Full version (including appendices) of a CAV 2022 paper, to appear in the Springer Lecture Notes in Computer Science series

ACM Class: F.4.3; F.4.1; G.3; F.2.2

arXiv:2112.00206 [pdf, other]

Querying Labelled Data with Scenario Programs for Sim-to-Real Validation

Authors: Edward Kim, Jay Shenoy, Sebastian Junges, Daniel Fremont, Alberto Sangiovanni-Vincentelli, Sanjit Seshia

Abstract: Simulation-based testing of autonomous vehicles (AVs) has become an essential complement to road testing to ensure safety. Consequently, substantial research has focused on searching for failure scenarios in simulation. However, a fundamental question remains: are AV failure scenarios identified in simulation meaningful in reality, i.e., are they reproducible on the real system? Due to the sim-to-real gap arising from discrepancies between simulated and real sensor data, a failure scenario identified in simulation can be either a spurious artifact of the synthetic sensor data or an actual failure that persists with real sensor data. An approach to validate simulated failure scenarios is to identify instances of the scenario in a corpus of real data, and check if the failure persists on the real data. To this end, we propose a formal definition of what it means for a labelled data item to match an abstract scenario, encoded as a scenario program using the SCENIC probabilistic programming language. Using this definition, we develop a querying algorithm which, given a scenario program and a labelled dataset, finds the subset of data matching the scenario. Experiments demonstrate that our algorithm is accurate and efficient on a variety of realistic traffic scenarios, and scales to a reasonable number of agents.

Submitted 30 November, 2021; originally announced December 2021.

Comments: pre-print

arXiv:2110.14870 [pdf, other]

A Scenario-Based Platform for Testing Autonomous Vehicle Behavior Prediction Models in Simulation

Authors: Francis Indaheng, Edward Kim, Kesav Viswanadha, Jay Shenoy, Jinkyu Kim, Daniel J. Fremont, Sanjit A. Seshia

Abstract: Behavior prediction remains one of the most challenging tasks in the autonomous vehicle (AV) software stack. Forecasting the future trajectories of nearby agents plays a critical role in ensuring road safety, as it equips AVs with the necessary information to plan safe routes of travel. However, these prediction models are data-driven and trained on data collected in real life that may not represent the full range of scenarios an AV can encounter. Hence, it is important that these prediction models are extensively tested in various test scenarios involving interactive behaviors prior to deployment. To support this need, we present a simulation-based testing platform which supports (1) intuitive scenario modeling with a probabilistic programming language called Scenic, (2) specifying a multi-objective evaluation metric with a partial priority ordering, (3) falsification of the provided metric, and (4) parallelization of simulations for scalable testing. As a part of the platform, we provide a library of 25 Scenic programs that model challenging test scenarios involving interactive traffic participant behaviors. We demonstrate the effectiveness and the scalability of our platform by testing a trained behavior prediction model and searching for failure scenarios.

Submitted 13 November, 2021; v1 submitted 27 October, 2021; originally announced October 2021.

Comments: Accepted to the NeurIPS 2021 Workshop on Machine Learning for Autonomous Driving

arXiv:2108.13796 [pdf, other]

Addressing the IEEE AV Test Challenge with Scenic and VerifAI

Authors: Kesav Viswanadha, Francis Indaheng, Justin Wong, Edward Kim, Ellen Kalvan, Yash Pant, Daniel J. Fremont, Sanjit A. Seshia

Abstract: This paper summarizes our formal approach to testing autonomous vehicles (AVs) in simulation for the IEEE AV Test Challenge. We demonstrate a systematic testing framework leveraging our previous work on formally-driven simulation for intelligent cyber-physical systems. First, to model and generate interactive scenarios involving multiple agents, we used Scenic, a probabilistic programming language for specifying scenarios. A Scenic program defines an abstract scenario as a distribution over configurations of physical objects and their behaviors over time. Sampling from an abstract scenario yields many different concrete scenarios which can be run as test cases for the AV. Starting from a Scenic program encoding an abstract driving scenario, we can use the VerifAI toolkit to search within the scenario for failure cases with respect to multiple AV evaluation metrics. We demonstrate the effectiveness of our testing framework by identifying concrete failure scenarios for an open-source autopilot, Apollo, starting from a variety of realistic traffic scenarios.

Submitted 20 August, 2021; originally announced August 2021.

Comments: Accepted to the IEEE AITest Conference 2021

arXiv:2107.04164 [pdf, other]

Parallel and Multi-Objective Falsification with Scenic and VerifAI

Authors: Kesav Viswanadha, Edward Kim, Francis Indaheng, Daniel J. Fremont, Sanjit A. Seshia

Abstract: Falsification has emerged as an important tool for simulation-based verification of autonomous systems. In this paper, we present extensions to the Scenic scenario specification language and VerifAI toolkit that improve the scalability of sampling-based falsification methods by using parallelism and extend falsification to multi-objective specifications. We first present a parallelized framework that is interfaced with both the simulation and sampling capabilities of Scenic and the falsification capabilities of VerifAI, reducing the execution time bottleneck inherently present in simulation-based testing. We then present an extension of VerifAI's falsification algorithms to support multi-objective optimization during sampling, using the concept of rulebooks to specify a preference ordering over multiple metrics that can be used to guide the counterexample search process. Lastly, we evaluate the benefits of these extensions with a comprehensive set of benchmarks written in the Scenic language.

Submitted 8 July, 2021; originally announced July 2021.

arXiv:2103.05672 [pdf, other]

Entropy-Guided Control Improvisation

Authors: Marcell Vazquez-Chanlatte, Sebastian Junges, Daniel J. Fremont, Sanjit Seshia

Abstract: High-level declarative constraints provide a powerful (and popular) way to define and construct control policies; however, most synthesis algorithms do not support specifying the degree of randomness (unpredictability) of the resulting controller. In many contexts, e.g., patrolling, testing, behavior prediction, and planning on idealized models, predictable or biased controllers are undesirable. To address these concerns, we introduce the Entropic Reactive Control Improvisation (ERCI) framework and algorithm, which supports synthesizing control policies for stochastic games that are declaratively specified by (i) a hard constraint specifying what must occur, (ii) a soft constraint specifying what typically occurs, and (iii) a randomization constraint specifying the unpredictability and variety of the controller, as quantified using causal entropy. This framework extends the state of the art by supporting arbitrary combinations of adversarial and probabilistic uncertainty in the environment. ERCI enables a flexible modeling formalism which we argue, theoretically and empirically, remains tractable.

Submitted 28 June, 2021; v1 submitted 9 March, 2021; originally announced March 2021.

Comments: RSS 21

arXiv:2011.14551 [pdf, other]

A Customizable Dynamic Scenario Modeling and Data Generation Platform for Autonomous Driving

Authors: Jay Shenoy, Edward Kim, Xiangyu Yue, Taesung Park, Daniel Fremont, Alberto Sangiovanni-Vincentelli, Sanjit Seshia

Abstract: Safely interacting with humans is a significant challenge for autonomous driving. The performance of this interaction depends on machine learning-based modules of an autopilot, such as perception, behavior prediction, and planning. These modules require training datasets with high-quality labels and a diverse range of realistic dynamic behaviors. Consequently, training such modules to handle rare scenarios is difficult because they are, by definition, rarely represented in real-world datasets. Hence, there is a practical need to augment datasets with synthetic data covering these rare scenarios. In this paper, we present a platform to model dynamic and interactive scenarios, generate the scenarios in simulation with different modalities of labeled sensor data, and collect this information for data augmentation. To our knowledge, this is the first integrated platform for these tasks specialized to the autonomous driving domain.

Submitted 30 November, 2020; originally announced November 2020.

arXiv:2010.06580 [pdf, other]

Scenic: A Language for Scenario Specification and Data Generation

Authors: Daniel J. Fremont, Edward Kim, Tommaso Dreossi, Shromona Ghosh, Xiangyu Yue, Alberto L. Sangiovanni-Vincentelli, Sanjit A. Seshia

Abstract: We propose a new probabilistic programming language for the design and analysis of cyber-physical systems, especially those based on machine learning. Specifically, we consider the problems of training a system to be robust to rare events, testing its performance under different conditions, and debugging failures. We show how a probabilistic programming language can help address these problems by specifying distributions encoding interesting types of inputs, then sampling these to generate specialized training and test data. More generally, such languages can be used to write environment models, an essential prerequisite to any formal analysis. In this paper, we focus on systems like autonomous cars and robots, whose environment at any point in time is a 'scene', a configuration of physical objects and agents. We design a domain-specific language, Scenic, for describing scenarios that are distributions over scenes and the behaviors of their agents over time. As a probabilistic programming language, Scenic allows assigning distributions to features of the scene, as well as declaratively imposing hard and soft constraints over the scene. We develop specialized techniques for sampling from the resulting distribution, taking advantage of the structure provided by Scenic's domain-specific syntax. Finally, we apply Scenic in a case study on a convolutional neural network designed to detect cars in road images, improving its performance beyond that achieved by state-of-the-art synthetic data generation methods.

Submitted 13 October, 2020; originally announced October 2020.

Comments: Supersedes arXiv:1809.09310

arXiv:2005.07173 [pdf, other]

Formal Analysis and Redesign of a Neural Network-Based Aircraft Taxiing System with VerifAI

Authors: Daniel J. Fremont, Johnathan Chiu, Dragos D. Margineantu, Denis Osipychev, Sanjit A. Seshia

Abstract: We demonstrate a unified approach to rigorous design of safety-critical autonomous systems using the VerifAI toolkit for formal analysis of AI-based systems. VerifAI provides an integrated toolchain for tasks spanning the design process, including modeling, falsification, debugging, and ML component retraining. We evaluate all of these applications in an industrial case study on an experimental autonomous aircraft taxiing system developed by Boeing, which uses a neural network to track the centerline of a runway. We define runway scenarios using the Scenic probabilistic programming language, and use them to drive tests in the X-Plane flight simulator. We first perform falsification, automatically finding environment conditions causing the system to violate its specification by deviating significantly from the centerline (or even leaving the runway entirely). Next, we use counterexample analysis to identify distinct failure cases, and confirm their root causes with specialized testing. Finally, we use the results of falsification and debugging to retrain the network, eliminating several failure cases and improving the overall performance of the closed-loop system.

Submitted 14 May, 2020; originally announced May 2020.

Comments: Full version of a CAV 2020 paper

arXiv:2003.07739 [pdf, other]

Formal Scenario-Based Testing of Autonomous Vehicles: From Simulation to the Real World

Authors: Daniel J. Fremont, Edward Kim, Yash Vardhan Pant, Sanjit A. Seshia, Atul Acharya, Xantha Bruso, Paul Wells, Steve Lemke, Qiang Lu, Shalin Mehta

Abstract: We present a new approach to automated scenario-based testing of the safety of autonomous vehicles, especially those using advanced artificial intelligence-based components, spanning both simulation-based evaluation as well as testing in the real world. Our approach is based on formal methods, combining formal specification of scenarios and safety properties, algorithmic test case generation using formal simulation, test case selection for track testing, executing test cases on the track, and analyzing the resulting data. Experiments with a real autonomous vehicle at an industrial testing facility support our hypotheses that (i) formal simulation can be effective at identifying test cases to run on the track, and (ii) the gap between simulated and real worlds can be systematically evaluated and bridged.

Submitted 12 July, 2020; v1 submitted 17 March, 2020; originally announced March 2020.

Comments: 9 pages, 6 figures. Full version of an ITSC 2020 paper

ACM Class: I.2.9; D.2.4; D.2.5

arXiv:1902.04245 [pdf, other]

VERIFAI: A Toolkit for the Design and Analysis of Artificial Intelligence-Based Systems

Authors: Tommaso Dreossi, Daniel J. Fremont, Shromona Ghosh, Edward Kim, Hadi Ravanbakhsh, Marcell Vazquez-Chanlatte, Sanjit A. Seshia

Abstract: We present VERIFAI, a software toolkit for the formal design and analysis of systems that include artificial intelligence (AI) and machine learning (ML) components. VERIFAI particularly seeks to address challenges with applying formal methods to perception and ML components, including those based on neural networks, and to model and analyze system behavior in the presence of environment uncertainty. We describe the initial version of VERIFAI which centers on simulation guided by formal models and specifications. Several use cases are illustrated with examples, including temporal-logic falsification, model-based systematic fuzz testing, parameter synthesis, counterexample analysis, and data set augmentation.

Submitted 14 February, 2019; v1 submitted 12 February, 2019; originally announced February 2019.

arXiv:1809.09310 [pdf, other]

doi 10.1145/3314221.3314633

Scenic: A Language for Scenario Specification and Scene Generation

Authors: Daniel J. Fremont, Tommaso Dreossi, Shromona Ghosh, Xiangyu Yue, Alberto L. Sangiovanni-Vincentelli, Sanjit A. Seshia

Abstract: We propose a new probabilistic programming language for the design and analysis of perception systems, especially those based on machine learning. Specifically, we consider the problems of training a perception system to handle rare events, testing its performance under different conditions, and debugging failures. We show how a probabilistic programming language can help address these problems by specifying distributions encoding interesting types of inputs and sampling these to generate specialized training and test sets. More generally, such languages can be used for cyber-physical systems and robotics to write environment models, an essential prerequisite to any formal analysis. In this paper, we focus on systems like autonomous cars and robots, whose environment is a "scene", a configuration of physical objects and agents. We design a domain-specific language, Scenic, for describing "scenarios" that are distributions over scenes. As a probabilistic programming language, Scenic allows assigning distributions to features of the scene, as well as declaratively imposing hard and soft constraints over the scene. We develop specialized techniques for sampling from the resulting distribution, taking advantage of the structure provided by Scenic's domain-specific syntax. Finally, we apply Scenic in a case study on a convolutional neural network designed to detect cars in road images, improving its performance beyond that achieved by state-of-the-art synthetic data generation methods.

Submitted 20 June, 2019; v1 submitted 24 September, 2018; originally announced September 2018.

Comments: 41 pages, 36 figures. Full version of a PLDI 2019 paper (extending UC Berkeley EECS Department Tech Report No. UCB/EECS-2018-8)

arXiv:1804.05037 [pdf, other]

Reactive Control Improvisation

Authors: Daniel J. Fremont, Sanjit A. Seshia

Abstract: Reactive synthesis is a paradigm for automatically building correct-by-construction systems that interact with an unknown or adversarial environment. We study how to do reactive synthesis when part of the specification of the system is that its behavior should be random. Randomness can be useful, for example, in a network protocol fuzz tester whose output should be varied, or a planner for a surveillance robot whose route should be unpredictable. However, existing reactive synthesis techniques do not provide a way to ensure random behavior while maintaining functional correctness. Towards this end, we generalize the recently-proposed framework of control improvisation (CI) to add reactivity. The resulting framework of reactive control improvisation provides a natural way to integrate a randomness requirement with the usual functional specifications of reactive synthesis over a finite window. We theoretically characterize when such problems are realizable, and give a general method for solving them. For specifications given by reachability or safety games or by deterministic finite automata, our method yields a polynomial-time synthesis algorithm. For various other types of specifications including temporal logic formulas, we obtain a polynomial-space algorithm and prove matching PSPACE-hardness results. We show that all of these randomized variants of reactive synthesis are no harder in a complexity-theoretic sense than their non-randomized counterparts.

Submitted 19 April, 2018; v1 submitted 13 April, 2018; originally announced April 2018.

Comments: 25 pages. Full version of a CAV 2018 paper

ACM Class: F.4.3; F.4.1; G.3; F.2.2

arXiv:1704.06319 [pdf, ps, other]

Control Improvisation

Authors: Daniel J. Fremont, Alexandre Donzé, Sanjit A. Seshia

Abstract: We formalize and analyze a new problem in formal language theory termed control improvisation. Given a specification language, the problem is to produce an improviser, a probabilistic algorithm that randomly generates words in the language, subject to two additional constraints: the satisfaction of a quantitative soft constraint, and the exhibition of a specified amount of randomness. Control improvisation has many applications, including for example systematically generating random test vectors satisfying format constraints or preconditions while being similar to a library of seed inputs. Other applications include robotic surveillance, machine improvisation of music, and randomized variants of the supervisory control problem. We describe a general framework for solving the control improvisation problem, and use it to give efficient algorithms for several practical classes of instances with finite automaton and context-free grammar specifications. We also provide a detailed complexity analysis, establishing #P-hardness of the problem in many other cases. For these intractable cases, we show how symbolic techniques based on Boolean satisfiability (SAT) solvers can be used to find approximate solutions. Finally, we discuss an extension of control improvisation to multiple soft constraints that is useful in some applications.

Submitted 20 April, 2017; originally announced April 2017.

Comments: 25 pages. Submitted to JACM. This article supersedes arXiv:1411.0698

ACM Class: F.4.3; G.3; F.2.2

arXiv:1602.08620 [pdf, other]

On the Hardness of SAT with Community Structure

Authors: Nathan Mull, Daniel J. Fremont, Sanjit A. Seshia

Abstract: Recent attempts to explain the effectiveness of Boolean satisfiability (SAT) solvers based on conflict-driven clause learning (CDCL) on large industrial benchmarks have focused on the concept of community structure. Specifically, industrial benchmarks have been empirically found to have good community structure, and experiments seem to show a correlation between such structure and the efficiency of CDCL. However, in this paper we establish hardness results suggesting that community structure is not sufficient to explain the success of CDCL in practice. First, we formally characterize a property shared by a wide class of metrics capturing community structure, including "modularity". Next, we show that the SAT instances with good community structure according to any metric with this property are still NP-hard. Finally, we study a class of random instances generated from the "pseudo-industrial" community attachment model of Giráldez-Cru and Levy. We prove that, with high probability, instances from this model that have relatively few communities but are still highly modular require exponentially long resolution proofs and so are hard for CDCL. We also present experimental evidence that our result continues to hold for instances with many more communities. This indicates that actual industrial instances easily solved by CDCL may have some other relevant structure not captured by the community attachment model.

Submitted 13 August, 2016; v1 submitted 27 February, 2016; originally announced February 2016.

Comments: 23 pages. Full version of a SAT 2016 paper

ACM Class: F.4.1

arXiv:1512.06633 [pdf, other]

Constrained Sampling and Counting: Universal Hashing Meets SAT Solving

Authors: Kuldeep S. Meel, Moshe Vardi, Supratik Chakraborty, Daniel J. Fremont, Sanjit A. Seshia, Dror Fried, Alexander Ivrii, Sharad Malik

Abstract: Constrained sampling and counting are two fundamental problems in artificial intelligence with a diverse range of applications, spanning probabilistic reasoning and planning to constrained-random verification. While the theory of these problems was thoroughly investigated in the 1980s, prior work either did not scale to industrial size instances or gave up correctness guarantees to achieve scalability. Recently, we proposed a novel approach that combines universal hashing and SAT solving and scales to formulas with hundreds of thousands of variables without giving up correctness guarantees. This paper provides an overview of the key ingredients of the approach and discusses challenges that need to be overcome to handle larger real-world instances.

Submitted 21 December, 2015; originally announced December 2015.

Comments: Appears in proceedings of AAAI-16 Workshop on Beyond NP

arXiv:1411.0698 [pdf, other]

doi 10.4230/LIPIcs.FSTTCS.2015.463

Control Improvisation

Authors: Daniel J. Fremont, Alexandre Donzé, Sanjit A. Seshia, David Wessel

Abstract: We formalize and analyze a new automata-theoretic problem termed control improvisation. Given an automaton, the problem is to produce an improviser, a probabilistic algorithm that randomly generates words in its language, subject to two additional constraints: the satisfaction of an admissibility predicate, and the exhibition of a specified amount of randomness. Control improvisation has multiple applications, including, for example, generating musical improvisations that satisfy rhythmic and melodic constraints, where admissibility is determined by some bounded divergence from a reference melody. We analyze the complexity of the control improvisation problem, giving cases where it is efficiently solvable and cases where it is #P-hard or undecidable. We also show how symbolic techniques based on Boolean satisfiability (SAT) solvers can be used to approximately solve some of the intractable cases.

Submitted 24 April, 2017; v1 submitted 3 November, 2014; originally announced November 2014.

Comments: 16 pages. Full version of an FSTTCS 2015 paper. This article is superseded by arXiv:1704.06319

ACM Class: F.4.3; G.3; F.2.2

Journal ref: 35th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2015), 463-474

arXiv:1405.7320 [pdf, other]

Speeding Up SMT-Based Quantitative Program Analysis

Authors: Daniel J. Fremont, Sanjit A. Seshia

Abstract: Quantitative program analysis involves computing numerical quantities about individual or collections of program executions. An example of such a computation is quantitative information flow analysis, where one estimates the amount of information leaked about secret data through a program's output channels. Such information can be quantified in several ways, including channel capacity and (Shannon) entropy. In this paper, we formalize a class of quantitative analysis problems defined over a weighted control flow graph of a loop-free program. These problems can be solved using a combination of path enumeration, SMT solving, and model counting. However, existing methods can only handle very small programs, primarily because the number of execution paths can be exponential in the program size. We show how path explosion can be mitigated in some practical cases by taking advantage of special branching structure and by novel algorithm design. We demonstrate our techniques by computing the channel capacities of the timing side-channels of two programs with extremely large numbers of paths.

Submitted 28 May, 2014; originally announced May 2014.

Comments: Full version of an SMT 2014 paper

arXiv:1404.2984 [pdf, ps, other]

Distribution-Aware Sampling and Weighted Model Counting for SAT

Authors: Supratik Chakraborty, Daniel J. Fremont, Kuldeep S. Meel, Sanjit A. Seshia, Moshe Y. Vardi

Abstract: Given a CNF formula and a weight for each assignment of values to variables, two natural problems are weighted model counting and distribution-aware sampling of satisfying assignments. Both problems have a wide variety of important applications. Due to the inherent complexity of the exact versions of the problems, interest has focused on solving them approximately. Prior work in this area scaled only to small problems in practice, or failed to provide strong theoretical guarantees, or employed a computationally-expensive maximum a posteriori probability (MAP) oracle that assumes prior knowledge of a factored representation of the weight distribution. We present a novel approach that works with a black-box oracle for weights of assignments and requires only an NP-oracle (in practice, a SAT-solver) to solve both the counting and sampling problems. Our approach works under mild assumptions on the distribution of weights of satisfying assignments, provides strong theoretical guarantees, and scales to problems involving several thousand variables. We also show that the assumptions can be significantly relaxed while improving computational efficiency if a factored representation of the weights is known.

Submitted 10 April, 2014; originally announced April 2014.

Comments: This is a full version of an AAAI 2014 paper

arXiv:1304.2639 [pdf, ps, other]

The Reachability Problem for Affine Functions on the Integers

Authors: Daniel Fremont

Abstract: We consider the problem of determining, given x, y in Z^k and a finite set F of affine functions on Z^k, whether y is reachable from x by applying the functions F. We also consider the analogous problem over N^k. These problems are known to be undecidable for k >= 2. We give 2-EXPTIME algorithms for both problems in the remaining case k = 1. The exact complexities remain open, although we show a simple NP lower bound.

Submitted 1 April, 2013; originally announced April 2013.

Comments: 13 pages

ACM Class: F.4.3
