PretVM: Predictable, Efficient Virtual Machine for Real-Time Concurrency

Authors: Shaokai Lin, Erling Jellum, Mirco Theile, Tassilo Tanneberger, Binqi Sun, Chadlia Jerad, Ruomu Xu, Guangyu Feng, Christian Menard, Marten Lohstroh, Jeronimo Castrillon, Sanjit Seshia, Edward Lee

Abstract: This paper introduces the Precision-Timed Virtual Machine (PretVM), an intermediate platform facilitating the execution of quasi-static schedules compiled from a subset of programs written in the Lingua Franca (LF) coordination language. The subset consists of those programs that in principle should have statically verifiable and predictable timing behavior. The PretVM provides a schedule with well-defined worst-case timing bounds. The PretVM provides a clean separation between application logic and coordination logic, yielding more analyzable program executions. Experiments compare the PretVM against the default (more dynamic) LF scheduler and show that it delivers time-accurate deterministic execution.

Submitted 25 June, 2024; v1 submitted 10 June, 2024; originally announced June 2024.

arXiv:2406.05403 [pdf, other]

SemPat: Using Hyperproperty-based Semantic Analysis to Generate Microarchitectural Attack Patterns

Authors: Adwait Godbole, Yatin A. Manerkar, Sanjit A. Seshia

Abstract: Microarchitectural security verification of software has seen the emergence of two broad classes of approaches. The first is based on semantic security properties (e.g., non-interference) which are verified for a given program and a specified abstract model of the hardware microarchitecture. The second is based on attack patterns, which, if found in a program execution, indicates the presence of an exploit. While the former uses a formal specification that can capture several gadget variants targeting the same vulnerability, it is limited by the scalability of verification. Patterns, while more scalable, must be currently constructed manually, as they are narrower in scope and sensitive to gadget-specific structure. This work develops a technique that, given a non-interference-based semantic security hyperproperty, automatically generates attack patterns up to a certain complexity parameter (called the skeleton size). Thus, we combine the advantages of both approaches: security can be specified by a hyperproperty that uniformly captures several gadget variants, while automatically generated patterns can be used for scalable verification. We implement our approach in a tool and demonstrate the ability to generate new patterns, (e.g., for SpectreV1, SpectreV4) and improved scalability using the generated patterns over hyperproperty-based verification.

Submitted 8 June, 2024; originally announced June 2024.

Comments: 16 pages, 14 figures, under review

arXiv:2406.03636 [pdf, other]

Synthetic Programming Elicitation and Repair for Text-to-Code in Very Low-Resource Programming Languages

Authors: Federico Mora, Justin Wong, Haley Lepe, Sahil Bhatia, Karim Elmaaroufi, George Varghese, Joseph E. Gonzalez, Elizabeth Polgreen, Sanjit A. Seshia

Abstract: Recent advances in large language models (LLMs) for code applications have demonstrated remarkable zero-shot fluency and instruction following on challenging code related tasks ranging from test case generation to self-repair. Unsurprisingly, however, models struggle to compose syntactically valid programs in programming languages unrepresented in pre-training, referred to as very low-resource Programming Languages (VLPLs). VLPLs appear in crucial settings, including domain-specific languages for internal tools and tool-chains for legacy languages. Inspired by an HCI technique called natural program elicitation, we propose designing an intermediate language that LLMs "naturally" know how to use and which can be automatically compiled to a target VLPL. When LLMs generate code that lies outside of this intermediate language, we use compiler techniques to repair the code into programs in the intermediate language. Overall, we introduce synthetic programming elicitation and compilation (SPEAC), an approach that enables LLMs to generate syntactically valid code even for VLPLs. We empirically evaluate the performance of SPEAC in a case study and find that, compared to existing retrieval and fine-tuning baselines, SPEAC produces syntactically correct programs significantly more frequently without sacrificing semantic correctness.

Submitted 29 June, 2024; v1 submitted 5 June, 2024; originally announced June 2024.

Comments: 15 pages, 6 figures, 1 table

arXiv:2406.03003 [pdf, other]

Verified Code Transpilation with LLMs

Authors: Sahil Bhatia, Jie Qiu, Niranjan Hasabnis, Sanjit A. Seshia, Alvin Cheung

Abstract: Domain-specific languages (DSLs) are integral to various software workflows. Such languages offer domain-specific optimizations and abstractions that improve code readability and maintainability. However, leveraging these languages requires developers to rewrite existing code using the specific DSL's API. While large language models (LLMs) have shown some success in automatic code transpilation, none of them provide any functional correctness guarantees on the transpiled code. Another approach for automating this task is verified lifting, which relies on program synthesis to find programs in the target language that are functionally equivalent to the source language program. While several verified lifting tools have been developed for various application domains, they are specialized for specific source-target languages or require significant expertise in domain knowledge to make the search efficient. In this paper, leveraging recent advances in LLMs, we propose an LLM-based approach (LLMLift) to building verified lifting tools. We use the LLM's capabilities to reason about programs to translate a given program into its corresponding equivalent in the target language. Additionally, we use LLMs to generate proofs for functional equivalence. We develop lifting-based compilers for four different DSLs targeting different application domains. Our approach not only outperforms previous symbolic-based tools in both the number of benchmarks transpiled and transpilation time, but also requires significantly less effort to build.

Submitted 5 June, 2024; originally announced June 2024.

arXiv:2405.06624 [pdf, other]

Towards Guaranteed Safe AI: A Framework for Ensuring Robust and Reliable AI Systems

Authors: David "davidad" Dalrymple, Joar Skalse, Yoshua Bengio, Stuart Russell, Max Tegmark, Sanjit Seshia, Steve Omohundro, Christian Szegedy, Ben Goldhaber, Nora Ammann, Alessandro Abate, Joe Halpern, Clark Barrett, Ding Zhao, Tan Zhi-Xuan, Jeannette Wing, Joshua Tenenbaum

Abstract: Ensuring that AI systems reliably and robustly avoid harmful or dangerous behaviours is a crucial challenge, especially for AI systems with a high degree of autonomy and general intelligence, or systems used in safety-critical contexts. In this paper, we will introduce and define a family of approaches to AI safety, which we will refer to as guaranteed safe (GS) AI. The core feature of these approaches is that they aim to produce AI systems which are equipped with high-assurance quantitative safety guarantees. This is achieved by the interplay of three core components: a world model (which provides a mathematical description of how the AI system affects the outside world), a safety specification (which is a mathematical description of what effects are acceptable), and a verifier (which provides an auditable proof certificate that the AI satisfies the safety specification relative to the world model). We outline a number of approaches for creating each of these three core components, describe the main technical challenges, and suggest a number of potential solutions to them. We also argue for the necessity of this approach to AI safety, and for the inadequacy of the main alternative approaches.

Submitted 8 July, 2024; v1 submitted 10 May, 2024; originally announced May 2024.

arXiv:2405.03709 [pdf, other]

Generating Probabilistic Scenario Programs from Natural Language

Authors: Karim Elmaaroufi, Devan Shanker, Ana Cismaru, Marcell Vazquez-Chanlatte, Alberto Sangiovanni-Vincentelli, Matei Zaharia, Sanjit A. Seshia

Abstract: For cyber-physical systems (CPS), including robotics and autonomous vehicles, mass deployment has been hindered by fatal errors that occur when operating in rare events. To replicate rare events such as vehicle crashes, many companies have created logging systems and employed crash reconstruction experts to meticulously recreate these valuable events in simulation. However, in these methods, "what if" questions are not easily formulated and answered. We present ScenarioNL, an AI System for creating scenario programs from natural language. Specifically, we generate these programs from police crash reports. Reports normally contain uncertainty about the exact details of the incidents which we represent through a Probabilistic Programming Language (PPL), Scenic. By using Scenic, we can clearly and concisely represent uncertainty and variation over CPS behaviors, properties, and interactions. We demonstrate how commonplace prompting techniques with the best Large Language Models (LLM) are incapable of reasoning about probabilistic scenario programs and generating code for low-resource languages such as Scenic. Our system is comprised of several LLMs chained together with several kinds of prompting strategies, a compiler, and a simulator. We evaluate our system on publicly available autonomous vehicle crash reports in California from the last five years and share insights into how we generate code that is both semantically meaningful and syntactically correct.

Submitted 14 May, 2024; v1 submitted 3 May, 2024; originally announced May 2024.

Comments: 17 pages, 2 figures

arXiv:2404.18249 [pdf, other]

Tenspiler: A Verified Lifting-Based Compiler for Tensor Operations

Authors: Jie Qiu, Colin Cai, Sahil Bhatia, Niranjan Hasabnis, Sanjit A. Seshia, Alvin Cheung

Abstract: Tensor processing infrastructures such as deep learning frameworks and specialized hardware accelerators have revolutionized how computationally intensive code from domains such as deep learning and image processing is executed and optimized. These infrastructures provide powerful and expressive abstractions while ensuring high performance. However, to utilize them, code must be written specifically using the APIs / ISAs of such software frameworks or hardware accelerators. Importantly, given the fast pace of innovation in these domains, code written today quickly becomes legacy as new frameworks and accelerators are developed, and migrating such legacy code manually is a considerable effort. To enable developers in leveraging such DSLs while preserving their current programming paradigm, we introduce Tenspiler, a verified lifting-based compiler that uses program synthesis to translate sequential programs written in general-purpose programming languages (e.g., C++ or Python code) into tensor operations. Central to Tenspiler is our carefully crafted yet simple intermediate language, named TensIR, that expresses tensor operations. TensIR enables efficient lifting, verification, and code generation. Currently, Tenspiler already supports six DSLs, spanning a broad spectrum of software and hardware environments. Furthermore, we show that new backends can be easily supported by Tenspiler by adding simple pattern-matching rules for TensIR. Using 10 real-world code benchmark suites, our experimental evaluation shows that by translating code to be executed on 6 different software frameworks and hardware devices, Tenspiler offers on average 105× kernel and 9.65× end-to-end execution time improvement over the fully-optimized sequential implementation of the same benchmarks.

Submitted 28 April, 2024; originally announced April 2024.

arXiv:2404.11578 [pdf, other]

LTL-Constrained Policy Optimization with Cycle Experience Replay

Authors: Ameesh Shah, Cameron Voloshin, Chenxi Yang, Abhinav Verma, Swarat Chaudhuri, Sanjit A. Seshia

Abstract: Linear Temporal Logic (LTL) offers a precise means for constraining the behavior of reinforcement learning agents. However, in many tasks, LTL is insufficient for task specification; LTL-constrained policy optimization, where the goal is to optimize a scalar reward under LTL constraints, is needed. Prior methods for this constrained problem are restricted to finite state spaces. In this work, we present Cycle Experience Replay (CyclER), a reward-shaping approach to this problem that allows continuous state and action spaces and the use of function approximations. CyclER guides a policy towards satisfaction by encouraging partial behaviors compliant with the LTL constraint, using the structure of the constraint. In doing so, it addresses the optimization challenges stemming from the sparse nature of LTL satisfaction. We evaluate CyclER in three continuous control domains. On these tasks, CyclER outperforms existing reward-shaping methods at finding performant and LTL-satisfying policies.

Submitted 24 May, 2024; v1 submitted 17 April, 2024; originally announced April 2024.

Comments: preprint, 9 pages in main text

arXiv:2403.11737 [pdf, other]

SMT-Based Dynamic Multi-Robot Task Allocation

Authors: Victoria Marie Tuck, Pei-Wei Chen, Georgios Fainekos, Bardh Hoxha, Hideki Okamoto, S. Shankar Sastry, Sanjit A. Seshia

Abstract: Multi-Robot Task Allocation (MRTA) is a problem that arises in many application domains including package delivery, warehouse robotics, and healthcare. In this work, we consider the problem of MRTA for a dynamic stream of tasks with task deadlines and capacitated agents (capacity for more than one simultaneous task). Previous work commonly focuses on the static case, uses specialized algorithms for restrictive task specifications, or lacks guarantees. We propose an approach to Dynamic MRTA for capacitated robots that is based on Satisfiability Modulo Theories (SMT) solving and addresses these concerns. We show our approach is both sound and complete, and that the SMT encoding is general, enabling extension to a broader class of task specifications. We show how to leverage the incremental solving capabilities of SMT solvers, keeping learned information when allocating new tasks arriving online, and to solve non-incrementally, which we provide runtime comparisons of. Additionally, we provide an algorithm to start with a smaller but potentially incomplete encoding that can iteratively be adjusted to the complete encoding. We evaluate our method on a parameterized set of benchmarks encoding multi-robot delivery created from a graph abstraction of a hospital-like environment. The effectiveness of our approach is demonstrated using a range of encodings, including quantifier-free theories of uninterpreted functions and linear or bitvector arithmetic across multiple solvers.

Submitted 18 March, 2024; originally announced March 2024.

Comments: 26 pages, 6 figures, to be published in NASA Formal Methods Symposium 2024

arXiv:2402.07051 [pdf, other]

$L^*LM$: Learning Automata from Examples using Natural Language Oracles

Authors: Marcell Vazquez-Chanlatte, Karim Elmaaroufi, Stefan J. Witwicki, Sanjit A. Seshia

Abstract: Expert demonstrations have proven an easy way to indirectly specify complex tasks. Recent algorithms even support extracting unambiguous formal specifications, e.g. deterministic finite automata (DFA), from demonstrations. Unfortunately, these techniques are generally not sample efficient. In this work, we introduce $L^*LM$, an algorithm for learning DFAs from both demonstrations and natural language. Due to the expressivity of natural language, we observe a significant improvement in the data efficiency of learning DFAs from expert demonstrations. Technically, $L^*LM$ leverages large language models to answer membership queries about the underlying task. This is then combined with recent techniques for transforming learning from demonstrations into a sequence of labeled example learning problems. In our experiments, we observe the two modalities complement each other, yielding a powerful few-shot learner.

Submitted 10 February, 2024; originally announced February 2024.

arXiv:2310.12234 [pdf, other]

An Eager Satisfiability Modulo Theories Solver for Algebraic Datatypes

Authors: Amar Shah, Federico Mora, Sanjit A. Seshia

Abstract: Algebraic data types (ADTs) are a construct classically found in functional programming languages that capture data structures like enumerated types, lists, and trees. In recent years, interest in ADTs has increased. For example, popular programming languages, like Python, have added support for ADTs. Automated reasoning about ADTs can be done using satisfiability modulo theories (SMT) solving, an extension of the Boolean satisfiability problem with constraints over first-order structures. Unfortunately, SMT solvers that support ADTs do not scale as state-of-the-art approaches all use variations of the same lazy approach. In this paper, we present an SMT solver that takes a fundamentally different approach, an eager approach. Specifically, our solver reduces ADT queries to a simpler logical theory, uninterpreted functions (UF), and then uses an existing solver on the reduced query. We prove the soundness and completeness of our approach and demonstrate that it outperforms the state-of-the-art on existing benchmarks, as well as a new, more challenging benchmark set from the planning domain.

Submitted 18 October, 2023; originally announced October 2023.

arXiv:2307.10434 [pdf, other]

Learning Formal Specifications from Membership and Preference Queries

Authors: Ameesh Shah, Marcell Vazquez-Chanlatte, Sebastian Junges, Sanjit A. Seshia

Abstract: Active learning is a well-studied approach to learning formal specifications, such as automata. In this work, we extend active specification learning by proposing a novel framework that strategically requests a combination of membership labels and pair-wise preferences, a popular alternative to membership labels. The combination of pair-wise preferences and membership labels allows for a more flexible approach to active specification learning, which previously relied on membership labels only. We instantiate our framework in two different domains, demonstrating the generality of our approach. Our results suggest that learning from both modalities allows us to robustly and conveniently identify specifications via membership and preferences.

Submitted 19 July, 2023; originally announced July 2023.

Comments: 6 pages, Presented at ICML 2023 Workshop on The Many Facets of Preference-Based Learning

arXiv:2307.03325 [pdf, other]

3D Environment Modeling for Falsification and Beyond with Scenic 3.0

Authors: Eric Vin, Shun Kashiwa, Matthew Rhea, Daniel J. Fremont, Edward Kim, Tommaso Dreossi, Shromona Ghosh, Xiangyu Yue, Alberto L. Sangiovanni-Vincentelli, Sanjit A. Seshia

Abstract: We present a major new version of Scenic, a probabilistic programming language for writing formal models of the environments of cyber-physical systems. Scenic has been successfully used for the design and analysis of CPS in a variety of domains, but earlier versions are limited to environments which are essentially two-dimensional. In this paper, we extend Scenic with native support for 3D geometry, introducing new syntax which provides expressive ways to describe 3D configurations while preserving the simplicity and readability of the language. We replace Scenic's simplistic representation of objects as boxes with precise modeling of complex shapes, including a ray tracing-based visibility system that accounts for object occlusion. We also extend the language to support arbitrary temporal requirements expressed in LTL, and build an extensible Scenic parser generated from a formal grammar of the language. Finally, we illustrate the new application domains these features enable with case studies that would have been impossible to accurately model in Scenic 2.

Submitted 6 July, 2023; originally announced July 2023.

Comments: 13 pages, 6 figures. Full version of a CAV 2023 tool paper, to appear in the Springer Lecture Notes in Computer Science series

arXiv:2306.00521 [pdf]

Genetic Algorithms for Searching a Matrix of Metagrammars for Synthesis

Authors: Yixuan Li, Federico Mora, Elizabeth Polgreen, Sanjit A. Seshia

Abstract: Syntax-guided synthesis is a paradigm in program synthesis in which the search space of candidate solutions is constrained by a syntactic template in the form of a grammar. These syntactic constraints serve two purposes: constraining the language to the space the user desires, but also rendering the search space tractable for the synthesizer. Given a well-written syntactic template, this is an extremely effective technique. However, this is highly dependent on the user providing such a template: a syntactic template that is too large results in a larger search space and slower synthesis, and a syntactic template that is too small may not contain the solution needed. In this work, we frame the space of syntactic templates as a matrix of rules, and demonstrate how this matrix can be searched effectively with little training data using simple search techniques such as genetic algorithms, giving improvements in both the number of benchmarks solved and solving time for the state-of-the-art synthesis solver.

Submitted 4 June, 2023; v1 submitted 1 June, 2023; originally announced June 2023.

Comments: SYNT 2023 of CAV 2023, 4 Pages

arXiv:2305.17596 [pdf, other]

Context-Aided Variable Elimination for Requirement Engineering

Authors: Inigo Incer, Albert Benveniste, Richard M. Murray, Alberto Sangiovanni-Vincentelli, Sanjit A. Seshia

Abstract: Deriving system-level specifications from component specifications usually involves the elimination of variables that are not part of the interface of the top-level system. This paper presents algorithms for eliminating variables from formulas by computing refinements or relaxations of these formulas in a context. We discuss a connection between this problem and optimization and give efficient algorithms to compute refinements and relaxations of linear inequality constraints.

Submitted 27 May, 2023; originally announced May 2023.

arXiv:2303.17751 [pdf, other]

Pacti: Scaling Assume-Guarantee Reasoning for System Analysis and Design

Authors: Inigo Incer, Apurva Badithela, Josefine Graebener, Piergiuseppe Mallozzi, Ayush Pandey, Sheng-Jung Yu, Albert Benveniste, Benoit Caillaud, Richard M. Murray, Alberto Sangiovanni-Vincentelli, Sanjit A. Seshia

Abstract: Contract-based design is a method to facilitate modular system design. While there has been substantial progress on the theory of contracts, there has been less progress on scalable algorithms for the algebraic operations in this theory. In this paper, we present: 1) principles to implement a contract-based design tool at scale and 2) Pacti, a tool that can efficiently compute these operations. We then illustrate the use of Pacti in a variety of case studies.

Submitted 30 March, 2023; originally announced March 2023.

arXiv:2303.17010 [pdf, other]

Specification-Guided Data Aggregation for Semantically Aware Imitation Learning

Authors: Ameesh Shah, Jonathan DeCastro, John Gideon, Beyazit Yalcinkaya, Guy Rosman, Sanjit A. Seshia

Abstract: Advancements in simulation and formal methods-guided environment sampling have enabled the rigorous evaluation of machine learning models in a number of safety-critical scenarios, such as autonomous driving. Application of these environment sampling techniques towards improving the learned models themselves has yet to be fully exploited. In this work, we introduce a novel method for improving imitation-learned models in a semantically aware fashion by leveraging specification-guided sampling techniques as a means of aggregating expert data in new environments. Specifically, we create a set of formal specifications as a means of partitioning the space of possible environments into semantically similar regions, and identify elements of this partition where our learned imitation behaves most differently from the expert. We then aggregate expert data on environments in these identified regions, leading to more accurate imitation of the expert's behavior semantics. We instantiate our approach in a series of experiments in the CARLA driving simulator, and demonstrate that our approach leads to models that are more accurate than those learned with other environment sampling methods.

Submitted 29 March, 2023; originally announced March 2023.

Comments: 8 pages, under review

arXiv:2302.13980 [pdf, other]

A Grammar for the Representation of Unmanned Aerial Vehicles with 3D Topologies

Authors: Piergiuseppe Mallozzi, Hussein Sibai, Inigo Incer, Sanjit A. Seshia, Alberto Sangiovanni-Vincentelli

Abstract: We propose a context-sensitive grammar for the systematic exploration of the design space of the topology of 3D robots, particularly unmanned aerial vehicles. It defines production rules for adding components to an incomplete design topology modeled over a 3D grid. The rules are local. The grammar is simple, yet capable of modeling most existing UAVs as well as novel ones. It can be easily generalized to other robotic platforms. It can be thought of as a building block for any design exploration and optimization algorithm.

Submitted 27 February, 2023; originally announced February 2023.

arXiv:2212.02795 [pdf]

Emerging Technology and Policy Co-Design Considerations for the Safe and Transparent Use of Small Unmanned Aerial Systems

Authors: Ritwik Gupta, Alexander Bayen, Sarah Rohrschneider, Adrienne Fulk, Andrew Reddie, Sanjit A. Seshia, Shankar Sastry, Janet Napolitano

Abstract: The rapid technological growth observed in the sUAS sector over the past decade has been unprecedented and has left gaps in policies and regulations to adequately provide for a safe and trusted environment in which to operate these devices. The Center for Security in Politics at UC Berkeley, via a two-day workshop, analyzed these gaps by addressing the entire sUAS vertical. From human factors to autonomy, we recommend a series of steps that can be taken by partners in the academic, commercial, and government sectors to reduce policy gaps introduced in the wake of the growth of the sUAS industry.

Submitted 6 December, 2022; originally announced December 2022.

arXiv:2211.02179 [pdf, other]

Verifying RISC-V Physical Memory Protection

Authors: Kevin Cheang, Cameron Rasmussen, Dayeol Lee, David W. Kohlbrenner, Krste Asanović, Sanjit A. Seshia

Abstract: We formally verify an open-source hardware implementation of physical memory protection (PMP) in RISC-V, which is a standard feature used for memory isolation in security critical systems such as the Keystone trusted execution environment. PMP provides per-hardware-thread machine-mode control registers that specify the access privileges for physical memory regions. We first formalize the functional property of the PMP rules based on the RISC-V ISA manual. Then, we use the LIME tool to translate an open-source implementation of the PMP hardware module written in Chisel to the UCLID5 formal verification language. We encode the formal specification in UCLID5 and verify the functional correctness of the hardware. This is an initial effort towards verifying the Keystone framework, where the trusted computing base (TCB) relies on PMP to provide security guarantees such as integrity and confidentiality.

Submitted 3 November, 2022; originally announced November 2022.

Comments: SECRISC-V 2019 Workshop

arXiv:2209.15253 [pdf, other]

doi 10.1145/3548606.3560595

Cerberus: A Formal Approach to Secure and Efficient Enclave Memory Sharing

Authors: Dayeol Lee, Kevin Cheang, Alexander Thomas, Catherine Lu, Pranav Gaddamadugu, Anjo Vahldiek-Oberwagner, Mona Vij, Dawn Song, Sanjit A. Seshia, Krste Asanović

Abstract: Hardware enclaves rely on a disjoint memory model, which maps each physical address to an enclave to achieve strong memory isolation. However, this severely limits the performance and programmability of enclave programs. While some prior work proposes enclave memory sharing, it does not provide a formal model or verification of their designs. This paper presents Cerberus, a formal approach to secure and efficient enclave memory sharing. To reduce the burden of formal verification, we compare different sharing models and choose a simple yet powerful sharing model. Based on the sharing model, Cerberus extends an enclave platform such that enclave memory can be made immutable and shareable across multiple enclaves via additional operations. We use incremental verification starting with an existing formal model called the Trusted Abstract Platform (TAP). Using our extended TAP model, we formally verify that Cerberus does not break or weaken the security guarantees of the enclaves despite allowing memory sharing. More specifically, we prove the Secure Remote Execution (SRE) property on our formal model. Finally, the paper shows the feasibility of Cerberus by implementing it in an existing enclave platform, RISC-V Keystone.

Submitted 14 November, 2022; v1 submitted 30 September, 2022; originally announced September 2022.

Comments: ACM CCS 2022

arXiv:2208.06733 [pdf, other]

Automated Conversion of Axiomatic to Operational Models: Theory and Practice

Authors: Adwait Godbole, Yatin A. Manerkar, Sanjit A. Seshia

Abstract: A system may be modelled as an operational model (which has explicit notions of state and transitions between states) or an axiomatic model (which is specified entirely as a set of invariants). Most formal methods techniques (e.g., IC3, invariant synthesis, etc.) are designed for operational models and are largely inaccessible to axiomatic models. Furthermore, no prior method exists to automatically convert axiomatic models to operational ones, so operational equivalents to axiomatic models had to be manually created and proven equivalent. In this paper, we advance the state-of-the-art in axiomatic to operational model conversion. We show that general axioms in the $μ$spec axiomatic modelling framework cannot be translated to equivalent finite-state operational models. We also derive restrictions on the space of $μ$spec axioms that enable the feasible generation of equivalent finite-state operational models for them. As for practical results, we develop a methodology for automatically translating $μ$spec axioms to equivalent finite-state automata-based operational models. We demonstrate the efficacy of our method by using the models generated by our procedure to prove the correctness of ordering properties on three RTL designs.

Submitted 13 August, 2022; originally announced August 2022.

Comments: 16 pages, 14 pages

ACM Class: F.1.1; C.1.2; F.3.1

arXiv:2208.03699 [pdf, other]

UCLID5: Multi-Modal Formal Modeling, Verification, and Synthesis

Authors: Elizabeth Polgreen, Kevin Cheang, Pranav Gaddamadugu, Adwait Godbole, Kevin Laeufer, Shaokai Lin, Yatin A. Manerkar, Federico Mora, Sanjit A. Seshia

Abstract: UCLID5 is a tool for the multi-modal formal modeling, verification, and synthesis of systems. It enables one to tackle verification problems for heterogeneous systems such as combinations of hardware and software, or those that have multiple, varied specifications, or systems that require hybrid modes of modeling. A novel aspect of UCLID5 is an emphasis on the use of syntax-guided and inductive synthesis to automate steps in modeling and verification. This tool paper presents new developments in the UCLID5 tool including new language features, integration with new techniques for syntax-guided synthesis and satisfiability solving, support for hyperproperties and combinations of axiomatic and operational modeling, demonstrations on new problem classes, and a robust implementation.

Submitted 7 August, 2022; originally announced August 2022.

Comments: 12 pages plus appendix. Published at CAV 2022

arXiv:2205.13013 [pdf, other]

Learning Deterministic Finite Automata Decompositions from Examples and Demonstrations

Authors: Niklas Lauffer, Beyazit Yalcinkaya, Marcell Vazquez-Chanlatte, Ameesh Shah, Sanjit A. Seshia

Abstract: The identification of a deterministic finite automaton (DFA) from labeled examples is a well-studied problem in the literature; however, prior work focuses on the identification of monolithic DFAs. Although monolithic DFAs provide accurate descriptions of systems' behavior, they lack simplicity and interpretability; moreover, they fail to capture sub-tasks realized by the system and introduce inductive biases away from the inherent decomposition of the overall task. In this paper, we present an algorithm for learning conjunctions of DFAs from labeled examples. Our approach extends an existing SAT-based method to systematically enumerate Pareto-optimal candidate solutions. We highlight the utility of our approach by integrating it with a state-of-the-art algorithm for learning DFAs from demonstrations. Our experiments show that the algorithm learns sub-tasks realized by the labeled examples, and it is scalable in the domains of interest.

Submitted 25 May, 2022; originally announced May 2022.

Comments: Preprint, under review

arXiv:2203.02609 [pdf, other]

doi 10.1109/CCTA48906.2021.9659247

DEC-LOS-RRT: Decentralized Path Planning for Multi-robot Systems with Line-of-sight Constrained Communication

Authors: Victoria Tuck, Yash Vardhan Pant, Sanjit A. Seshia, S. Shankar Sastry

Abstract: Decentralized planning for multi-agent systems, such as fleets of robots in a search-and-rescue operation, is often constrained by limitations on how agents can communicate with each other. One such limitation is the case when agents can communicate with each other only when they are in line-of-sight (LOS). Developing decentralized planning methods that guarantee safety is difficult in this case, as agents that are occluded from each other might not be able to communicate until it's too late to avoid a safety violation. In this paper, we develop a decentralized planning method that explicitly avoids situations where lack of visibility of other agents would lead to an unsafe situation. Building on top of an existing Rapidly-exploring Random Tree (RRT)-based approach, our method guarantees safety at each iteration. Simulation studies show the effectiveness of our method and compare the degradation in performance with respect to a clairvoyant decentralized planning algorithm where agents can communicate despite not being in LOS of each other.

Submitted 4 March, 2022; originally announced March 2022.

Comments: 8 pages, 8 figures, Presented at CCTA 2022

Journal ref: CCTA (2021) 103-110

arXiv:2112.10807 [pdf, other]

Demonstration Informed Specification Search

Authors: Marcell Vazquez-Chanlatte, Ameesh Shah, Gil Lederman, Sanjit A. Seshia

Abstract: This paper considers the problem of learning temporal task specifications, e.g. automata and temporal logic, from expert demonstrations. Task specifications are a class of sparse memory augmented rewards with explicit support for temporal and Boolean composition. Three features make learning temporal task specifications difficult: (1) the (countably) infinite number of tasks under consideration; (2) an a-priori ignorance of what memory is needed to encode the task; and (3) the discrete solution space - typically addressed by (brute force) enumeration. To overcome these hurdles, we propose Demonstration Informed Specification Search (DISS): a family of algorithms requiring only black box access to a maximum entropy planner and a task sampler from labeled examples. DISS then works by alternating between conjecturing labeled examples to make the provided demonstrations less surprising and sampling tasks consistent with the conjectured labeled examples. We provide a concrete implementation of DISS in the context of tasks described by Deterministic Finite Automata, and show that DISS is able to efficiently identify tasks from only one or two expert demonstrations.

Submitted 24 April, 2023; v1 submitted 20 December, 2021; originally announced December 2021.

arXiv:2112.00206 [pdf, other]

Querying Labelled Data with Scenario Programs for Sim-to-Real Validation

Authors: Edward Kim, Jay Shenoy, Sebastian Junges, Daniel Fremont, Alberto Sangiovanni-Vincentelli, Sanjit Seshia

Abstract: Simulation-based testing of autonomous vehicles (AVs) has become an essential complement to road testing to ensure safety. Consequently, substantial research has focused on searching for failure scenarios in simulation. However, a fundamental question remains: are AV failure scenarios identified in simulation meaningful in reality, i.e., are they reproducible on the real system? Due to the sim-to-real gap arising from discrepancies between simulated and real sensor data, a failure scenario identified in simulation can be either a spurious artifact of the synthetic sensor data or an actual failure that persists with real sensor data. An approach to validate simulated failure scenarios is to identify instances of the scenario in a corpus of real data, and check if the failure persists on the real data. To this end, we propose a formal definition of what it means for a labelled data item to match an abstract scenario, encoded as a scenario program using the SCENIC probabilistic programming language. Using this definition, we develop a querying algorithm which, given a scenario program and a labelled dataset, finds the subset of data matching the scenario. Experiments demonstrate that our algorithm is accurate and efficient on a variety of realistic traffic scenarios, and scales to a reasonable number of agents.

Submitted 30 November, 2021; originally announced December 2021.

Comments: pre-print

arXiv:2110.14870 [pdf, other]

A Scenario-Based Platform for Testing Autonomous Vehicle Behavior Prediction Models in Simulation

Authors: Francis Indaheng, Edward Kim, Kesav Viswanadha, Jay Shenoy, Jinkyu Kim, Daniel J. Fremont, Sanjit A. Seshia

Abstract: Behavior prediction remains one of the most challenging tasks in the autonomous vehicle (AV) software stack. Forecasting the future trajectories of nearby agents plays a critical role in ensuring road safety, as it equips AVs with the necessary information to plan safe routes of travel. However, these prediction models are data-driven and trained on data collected in real life that may not represent the full range of scenarios an AV can encounter. Hence, it is important that these prediction models are extensively tested in various test scenarios involving interactive behaviors prior to deployment. To support this need, we present a simulation-based testing platform which supports (1) intuitive scenario modeling with a probabilistic programming language called Scenic, (2) specifying a multi-objective evaluation metric with a partial priority ordering, (3) falsification of the provided metric, and (4) parallelization of simulations for scalable testing. As a part of the platform, we provide a library of 25 Scenic programs that model challenging test scenarios involving interactive traffic participant behaviors. We demonstrate the effectiveness and the scalability of our platform by testing a trained behavior prediction model and searching for failure scenarios.

Submitted 13 November, 2021; v1 submitted 27 October, 2021; originally announced October 2021.

Comments: Accepted to the NeurIPS 2021 Workshop on Machine Learning for Autonomous Driving

arXiv:2108.13796 [pdf, other]

Addressing the IEEE AV Test Challenge with Scenic and VerifAI

Authors: Kesav Viswanadha, Francis Indaheng, Justin Wong, Edward Kim, Ellen Kalvan, Yash Pant, Daniel J. Fremont, Sanjit A. Seshia

Abstract: This paper summarizes our formal approach to testing autonomous vehicles (AVs) in simulation for the IEEE AV Test Challenge. We demonstrate a systematic testing framework leveraging our previous work on formally-driven simulation for intelligent cyber-physical systems. First, to model and generate interactive scenarios involving multiple agents, we used Scenic, a probabilistic programming language for specifying scenarios. A Scenic program defines an abstract scenario as a distribution over configurations of physical objects and their behaviors over time. Sampling from an abstract scenario yields many different concrete scenarios which can be run as test cases for the AV. Starting from a Scenic program encoding an abstract driving scenario, we can use the VerifAI toolkit to search within the scenario for failure cases with respect to multiple AV evaluation metrics. We demonstrate the effectiveness of our testing framework by identifying concrete failure scenarios for an open-source autopilot, Apollo, starting from a variety of realistic traffic scenarios.

Submitted 20 August, 2021; originally announced August 2021.

Comments: Accepted to the IEEE AITest Conference 2021

arXiv:2108.07307 [pdf, ps, other]

Synthesizing Pareto-Optimal Interpretations for Black-Box Models

Authors: Hazem Torfah, Shetal Shah, Supratik Chakraborty, S. Akshay, Sanjit A. Seshia

Abstract: We present a new multi-objective optimization approach for synthesizing interpretations that "explain" the behavior of black-box machine learning models. Constructing human-understandable interpretations for black-box models often requires balancing conflicting objectives. A simple interpretation may be easier to understand for humans while being less precise in its predictions vis-a-vis a complex interpretation. Existing methods for synthesizing interpretations use a single objective function and are often optimized for a single class of interpretations. In contrast, we provide a more general and multi-objective synthesis framework that allows users to choose (1) the class of syntactic templates from which an interpretation should be synthesized, and (2) quantitative measures on both the correctness and explainability of an interpretation. For a given black-box, our approach yields a set of Pareto-optimal interpretations with respect to the correctness and explainability measures. We show that the underlying multi-objective optimization problem can be solved via a reduction to quantitative constraint solving, such as weighted maximum satisfiability. To demonstrate the benefits of our approach, we have applied it to synthesize interpretations for black-box neural-network classifiers. Our experiments show that there often exists a rich and varied set of choices for interpretations that are missed by existing approaches.

Submitted 16 August, 2021; originally announced August 2021.

Comments: Long version of conference paper accepted at FMCAD'21
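
To make the notion of a Pareto-optimal set in the abstract above concrete, here is a minimal sketch that filters a small list of candidate interpretations scored on two measures. The candidate scores and the function name `pareto_front` are hypothetical, and the plain dominance filter stands in for illustration only; the paper itself describes a reduction to quantitative constraint solving such as weighted maximum satisfiability, which this sketch does not attempt.

```python
def pareto_front(points):
    """Return the points that no other point dominates (higher is better on every axis)."""
    def dominates(p, q):
        # p dominates q if p is at least as good everywhere and strictly better somewhere.
        return all(a >= b for a, b in zip(p, q)) and any(a > b for a, b in zip(p, q))

    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (correctness, explainability) scores for five candidate interpretations.
candidates = [(0.9, 0.2), (0.8, 0.5), (0.7, 0.4), (0.6, 0.9), (0.9, 0.1)]
front = pareto_front(candidates)
print(front)
```

Each surviving pair is a trade-off that cannot be improved on one measure without losing on the other, which is why a single-objective method would report at most one of them.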

arXiv:2107.13477 [pdf, other]

Satisfiability and Synthesis Modulo Oracles

Authors: Elizabeth Polgreen, Andrew Reynolds, Sanjit A. Seshia

Abstract: In classic program synthesis algorithms, such as counterexample-guided inductive synthesis (CEGIS), the algorithms alternate between a synthesis phase and an oracle (verification) phase. Many synthesis algorithms use a white-box oracle based on satisfiability modulo theory (SMT) solvers to provide counterexamples. But what if a white-box oracle is either not available or not easy to work with? We present a framework for solving a general class of oracle-guided synthesis problems which we term synthesis modulo oracles. In this setting, oracles may be black boxes with a query-response interface defined by the synthesis problem. As a necessary component of this framework, we also formalize the problem of satisfiability modulo theories and oracles, and present an algorithm for solving this problem. We implement a prototype solver for satisfiability and synthesis modulo oracles and demonstrate that, by using oracles that execute functions not easily modeled in SMT-constraints, such as recursive functions or oracles that incorporate compilation and execution of code, SMTO and SyMO are able to solve problems beyond the abilities of standard SMT and synthesis solvers.

Submitted 28 July, 2021; originally announced July 2021.

Comments: 12 pages, 8 Figures
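
The alternation between a synthesis phase and an oracle phase that the abstract above attributes to CEGIS can be sketched on a toy problem. Everything here is an assumption made for illustration: the specification `spec`, the search space of linear functions `a * x + b`, and the brute-force `verify` oracle, which stands in for the SMT-based or black-box oracles the paper discusses. This is not the SMTO or SyMO algorithm.

```python
from itertools import product


def spec(x, y):
    # Hypothetical specification: the output must equal 2x + 3.
    return y == 2 * x + 3


def synthesize(examples):
    """Synthesis phase: return the first (a, b) consistent with all collected inputs."""
    for a, b in product(range(-5, 6), repeat=2):
        if all(spec(x, a * x + b) for x in examples):
            return a, b
    return None


def verify(candidate):
    """Oracle phase: return a counterexample input, or None if the candidate is correct."""
    a, b = candidate
    for x in range(-10, 11):
        if not spec(x, a * x + b):
            return x
    return None


def cegis():
    examples = []
    while True:
        candidate = synthesize(examples)
        if candidate is None:
            return None  # no program in the search space satisfies the examples
        counterexample = verify(candidate)
        if counterexample is None:
            return candidate
        examples.append(counterexample)  # refine and try again


result = cegis()
print(result)
```

The loop terminates because every counterexample rules out the current candidate and the search space is finite.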

arXiv:2107.04164 [pdf, other]

Parallel and Multi-Objective Falsification with Scenic and VerifAI

Authors: Kesav Viswanadha, Edward Kim, Francis Indaheng, Daniel J. Fremont, Sanjit A. Seshia

Abstract: Falsification has emerged as an important tool for simulation-based verification of autonomous systems. In this paper, we present extensions to the Scenic scenario specification language and VerifAI toolkit that improve the scalability of sampling-based falsification methods by using parallelism and extend falsification to multi-objective specifications. We first present a parallelized framework that is interfaced with both the simulation and sampling capabilities of Scenic and the falsification capabilities of VerifAI, reducing the execution time bottleneck inherently present in simulation-based testing. We then present an extension of VerifAI's falsification algorithms to support multi-objective optimization during sampling, using the concept of rulebooks to specify a preference ordering over multiple metrics that can be used to guide the counterexample search process. Lastly, we evaluate the benefits of these extensions with a comprehensive set of benchmarks written in the Scenic language.

Submitted 8 July, 2021; originally announced July 2021.

arXiv:2106.10365 [pdf, other]

Scenic4RL: Programmatic Modeling and Generation of Reinforcement Learning Environments

Authors: Abdus Salam Azad, Edward Kim, Qiancheng Wu, Kimin Lee, Ion Stoica, Pieter Abbeel, Sanjit A. Seshia

Abstract: The capability of a reinforcement learning (RL) agent heavily depends on the diversity of the learning scenarios generated by the environment. Generation of diverse realistic scenarios is challenging for real-time strategy (RTS) environments. The RTS environments are characterized by intelligent entities/non-RL agents cooperating and competing with the RL agents with large state and action spaces over a long period of time, resulting in an infinite space of feasible, but not necessarily realistic, scenarios involving complex interaction among different RL and non-RL agents. Yet, most of the existing simulators rely on randomly generating the environments based on predefined settings/layouts and offer limited flexibility and control over the environment dynamics for researchers to generate diverse, realistic scenarios as per their demand. To address this issue, for the first time, we formally introduce the benefits of adopting an existing formal scenario specification language, SCENIC, to assist researchers to model and generate diverse scenarios in an RTS environment in a flexible, systematic, and programmatic manner. To showcase the benefits, we interfaced SCENIC to an existing RTS environment, the Google Research Football (GRF) simulator, and introduced a benchmark consisting of 32 realistic scenarios, encoded in SCENIC, to train RL agents and test their generalization capabilities. We also show how researchers/RL practitioners can incorporate their domain knowledge to expedite the training process by intuitively modeling stochastic programmatic policies with SCENIC.

Submitted 28 March, 2023; v1 submitted 18 June, 2021; originally announced June 2021.

Comments: First two authors contributed equally. The final version of this paper is accepted at Proceedings of the AAAI Conference on Artificial Intelligence, 36(6), 6028-6036. https://doi.org/10.1609/aaai.v36i6.20549

arXiv:2106.02449 [pdf, other]

doi 10.1007/978-3-031-06773-0_36

Hypercontracts

Authors: Inigo Incer, Albert Benveniste, Alberto Sangiovanni-Vincentelli, Sanjit A. Seshia

Abstract: Contract theories have been proposed to formally support distributed and decentralized system design while ensuring safe system integration. In this paper we propose hypercontracts, a generic model with a richer structure for its underlying model of components, subsuming simulation preorders. While this new model remains generic, it provides a much more elegant and richer algebra for its key notions of refinement, parallel composition, and quotient, and it allows inclusion of new operations. On top of these foundations, we propose conic hypercontracts, which are still generic but come with a finite description.

Submitted 8 October, 2021; v1 submitted 26 May, 2021; originally announced June 2021.

Journal ref: NFM 2022. Lecture Notes in Computer Science, vol 13260

arXiv:2105.12326 [pdf, other]

Model Checking Finite-Horizon Markov Chains with Probabilistic Inference

Authors: Steven Holtzen, Sebastian Junges, Marcell Vazquez-Chanlatte, Todd Millstein, Sanjit A. Seshia, Guy Van Den Broeck

Abstract: We revisit the symbolic verification of Markov chains with respect to finite horizon reachability properties. The prevalent approach iteratively computes step-bounded state reachability probabilities. By contrast, recent advances in probabilistic inference suggest symbolically representing all horizon-length paths through the Markov chain. We ask whether this perspective advances the state-of-the-art in probabilistic model checking. First, we formally describe both approaches in order to highlight their key differences. Then, using these insights we develop Rubicon, a tool that transpiles Prism models to the probabilistic inference tool Dice. Finally, we demonstrate better scalability compared to probabilistic model checkers on selected benchmarks. All together, our results suggest that probabilistic inference is a valuable addition to the probabilistic model checking portfolio -- with Rubicon as a first step towards integrating both perspectives.

Submitted 30 June, 2021; v1 submitted 26 May, 2021; originally announced May 2021.

Comments: Technical Report. Accepted at CAV 2021

arXiv:2105.12322 [pdf, other]

Runtime Monitoring for Markov Decision Processes

Authors: Sebastian Junges, Hazem Torfah, Sanjit A. Seshia

Abstract: We investigate the problem of monitoring partially observable systems with nondeterministic and probabilistic dynamics. In such systems, every state may be associated with a risk, e.g., the probability of an imminent crash. During runtime, we obtain partial information about the system state in the form of observations. The monitor uses this information to estimate the risk of the (unobservable) current system state. Our results are threefold. First, we show that extensions of state estimation approaches do not scale due to the combination of nondeterminism and probabilities. While convex hull algorithms improve the practical runtime, they do not prevent an exponential memory blowup. Second, we present a tractable algorithm based on model checking conditional reachability probabilities. Third, we provide prototypical implementations and manifest the applicability of our algorithms to a range of benchmarks. The results highlight the possibilities and boundaries of our novel algorithms.

Submitted 26 May, 2021; originally announced May 2021.

Comments: Technical report with appendix. Accepted at CAV

arXiv:2103.05672 [pdf, other]

Entropy-Guided Control Improvisation

Authors: Marcell Vazquez-Chanlatte, Sebastian Junges, Daniel J. Fremont, Sanjit Seshia

Abstract: High-level declarative constraints provide a powerful (and popular) way to define and construct control policies; however, most synthesis algorithms do not support specifying the degree of randomness (unpredictability) of the resulting controller. In many contexts, e.g., patrolling, testing, behavior prediction, and planning on idealized models, predictable or biased controllers are undesirable. To address these concerns, we introduce the Entropic Reactive Control Improvisation (ERCI) framework and algorithm which supports synthesizing control policies for stochastic games that are declaratively specified by (i) a hard constraint specifying what must occur, (ii) a soft constraint specifying what typically occurs, and (iii) a randomization constraint specifying the unpredictability and variety of the controller, as quantified using causal entropy. This framework extends the state of the art by supporting arbitrary combinations of adversarial and probabilistic uncertainty in the environment. ERCI enables a flexible modeling formalism which we argue, theoretically and empirically, remains tractable.

Submitted 28 June, 2021; v1 submitted 9 March, 2021; originally announced March 2021.

Comments: RSS 21

arXiv:2011.14551 [pdf, other]

A Customizable Dynamic Scenario Modeling and Data Generation Platform for Autonomous Driving

Authors: Jay Shenoy, Edward Kim, Xiangyu Yue, Taesung Park, Daniel Fremont, Alberto Sangiovanni-Vincentelli, Sanjit Seshia

Abstract: Safely interacting with humans is a significant challenge for autonomous driving. The performance of this interaction depends on machine learning-based modules of an autopilot, such as perception, behavior prediction, and planning. These modules require training datasets with high-quality labels and a diverse range of realistic dynamic behaviors. Consequently, training such modules to handle rare scenarios is difficult because they are, by definition, rarely represented in real-world datasets. Hence, there is a practical need to augment datasets with synthetic data covering these rare scenarios. In this paper, we present a platform to model dynamic and interactive scenarios, generate the scenarios in simulation with different modalities of labeled sensor data, and collect this information for data augmentation. To our knowledge, this is the first integrated platform for these tasks specialized to the autonomous driving domain.

Submitted 30 November, 2020; originally announced November 2020.

arXiv:2010.06580 [pdf, other]

Scenic: A Language for Scenario Specification and Data Generation

Authors: Daniel J. Fremont, Edward Kim, Tommaso Dreossi, Shromona Ghosh, Xiangyu Yue, Alberto L. Sangiovanni-Vincentelli, Sanjit A. Seshia

Abstract: We propose a new probabilistic programming language for the design and analysis of cyber-physical systems, especially those based on machine learning. Specifically, we consider the problems of training a system to be robust to rare events, testing its performance under different conditions, and debugging failures. We show how a probabilistic programming language can help address these problems by specifying distributions encoding interesting types of inputs, then sampling these to generate specialized training and test data. More generally, such languages can be used to write environment models, an essential prerequisite to any formal analysis. In this paper, we focus on systems like autonomous cars and robots, whose environment at any point in time is a 'scene', a configuration of physical objects and agents. We design a domain-specific language, Scenic, for describing scenarios that are distributions over scenes and the behaviors of their agents over time. As a probabilistic programming language, Scenic allows assigning distributions to features of the scene, as well as declaratively imposing hard and soft constraints over the scene. We develop specialized techniques for sampling from the resulting distribution, taking advantage of the structure provided by Scenic's domain-specific syntax. Finally, we apply Scenic in a case study on a convolutional neural network designed to detect cars in road images, improving its performance beyond that achieved by state-of-the-art synthetic data generation methods.

Submitted 13 October, 2020; originally announced October 2020.

Comments: Supersedes arXiv:1809.09310

arXiv:2009.14363 [pdf, other]

Co-design of Control and Planning for Multi-rotor UAVs with Signal Temporal Logic Specifications

Authors: Yash Vardhan Pant, He Yin, Murat Arcak, Sanjit A. Seshia

Abstract: Urban Air Mobility (UAM), or the scenario where multiple manned and Unmanned Aerial Vehicles (UAVs) carry out various tasks over urban airspaces, is a transportation concept of the future that is gaining prominence. UAM missions with complex spatial, temporal and reactive requirements can be succinctly represented using Signal Temporal Logic (STL), a behavioral specification language. However, planning and control of systems with STL specifications is computationally intensive, usually resulting in planning approaches that do not guarantee dynamical feasibility, or control approaches that cannot handle complex STL specifications. Here, we present an approach to co-design the planner and control such that a given STL specification (possibly over multiple UAVs) is satisfied with trajectories that are dynamically feasible and our controller can track them with a bounded tracking-error that the planner accounts for. The tracking controller is formulated for the non-linear dynamics of the individual UAVs, and the tracking error bound is computed for this controller when the trajectories satisfy some kinematic constraints. We also augment an existing multi-UAV STL-based trajectory generator in order to generate trajectories that satisfy such constraints. We show that this co-design allows for trajectories that satisfy a given STL specification, and are also dynamically feasible in the sense that they can be tracked with bounded error. The applicability of this approach is demonstrated through simulations of multi-UAV missions.

Submitted 29 September, 2020; originally announced September 2020.

arXiv:2009.00155 [pdf, other]

A Review of Single-Source Deep Unsupervised Visual Domain Adaptation

Authors: Sicheng Zhao, Xiangyu Yue, Shanghang Zhang, Bo Li, Han Zhao, Bichen Wu, Ravi Krishna, Joseph E. Gonzalez, Alberto L. Sangiovanni-Vincentelli, Sanjit A. Seshia, Kurt Keutzer

Abstract: Large-scale labeled training datasets have enabled deep neural networks to excel across a wide range of benchmark vision tasks. However, in many applications, it is prohibitively expensive and time-consuming to obtain large quantities of labeled data. To cope with limited labeled training data, many have attempted to directly apply models trained on a large-scale labeled source domain to another sparsely labeled or unlabeled target domain. Unfortunately, direct transfer across domains often performs poorly due to the presence of domain shift or dataset bias. Domain adaptation is a machine learning paradigm that aims to learn a model from a source domain that can perform well on a different (but related) target domain. In this paper, we review the latest single-source deep unsupervised domain adaptation methods focused on visual tasks and discuss new perspectives for future research. We begin with the definitions of different domain adaptation strategies and the descriptions of existing benchmark datasets. We then summarize and compare different categories of single-source unsupervised domain adaptation methods, including discrepancy-based methods, adversarial discriminative methods, adversarial generative methods, and self-supervision-based methods. Finally, we discuss future research directions with challenges and possible solutions.

Submitted 18 September, 2020; v1 submitted 31 August, 2020; originally announced September 2020.

arXiv:2008.09707 [pdf, other]

SOTER on ROS: A Run-Time Assurance Framework on the Robot Operating System

Authors: Sumukh Shivakumar, Hazem Torfah, Ankush Desai, Sanjit A. Seshia

Abstract: We present an implementation of SOTER, a run-time assurance framework for building safe distributed mobile robotic (DMR) systems, on top of the Robot Operating System (ROS). The safety of DMR systems cannot always be guaranteed at design time, especially when complex, off-the-shelf components are used that cannot be verified easily. SOTER addresses this by providing a language-based approach for run-time assurance for DMR systems. SOTER implements the reactive robotic software using the language P, a domain-specific language designed for implementing asynchronous event-driven systems, along with an integrated run-time assurance system that allows programmers to use unfortified components but still provide safety guarantees. We describe an implementation of SOTER for ROS and demonstrate its efficacy using a multi-robot surveillance case study, with multiple run-time assurance modules. Through rigorous simulation, we show that SOTER-enabled systems ensure safety, even when using unknown and untrusted components.

Submitted 21 August, 2020; originally announced August 2020.

Comments: 20th International Conference on Runtime Verification

arXiv:2007.10519 [pdf, ps, other]

SynRG: Syntax Guided Synthesis of Expressions with Alternating Quantifiers

Authors: Elizabeth Polgreen, Sanjit A. Seshia

Abstract: Program synthesis is the task of automatically generating expressions that satisfy a given specification. Program synthesis techniques have been used to automate the generation of loop invariants in code, synthesize function summaries, and to assist programmers via program sketching. Syntax-guided synthesis has been a successful paradigm in this area; however, one area where the state-of-the-art solvers fall down is reasoning about potentially unbounded data structures such as arrays where both specifications and solutions may require quantifiers and quantifier alternations. We present SynRG, a synthesis algorithm based on restricting the synthesis problem to generate candidate solutions with quantification over a finite domain, and then generalizing these candidate solutions to the unrestricted domain of the original specification. We report experiments on invariant synthesis benchmarks and on program sketching benchmarks taken from the Java StringUtils class and show that our technique can synthesize expressions out of reach of all existing solvers.

Submitted 12 October, 2020; v1 submitted 20 July, 2020; originally announced July 2020.

Comments: 20 pages

arXiv:2007.06760 [pdf, ps, other]

Synthesis in Uclid5

Authors: Federico Mora, Kevin Cheang, Elizabeth Polgreen, Sanjit A. Seshia

Abstract: We describe an integration of program synthesis into Uclid5, a formal modelling and verification tool. To the best of our knowledge, the new version of Uclid5 is the only tool that supports program synthesis with bounded model checking, k-induction, sequential program verification, and hyperproperty verification. We use the integration to generate 25 program synthesis benchmarks with simple, known solutions that are out of reach of current synthesis engines, and we release the benchmarks to the community.

Submitted 16 July, 2020; v1 submitted 13 July, 2020; originally announced July 2020.

arXiv:2007.06677 [pdf, ps, other]

Gradient Descent over Metagrammars for Syntax-Guided Synthesis

Authors: Nicolas Chan, Elizabeth Polgreen, Sanjit A. Seshia

Abstract: The performance of a syntax-guided synthesis algorithm is highly dependent on the provision of a good syntactic template, or grammar. Provision of such a template is often left to the user to do manually, though in the absence of such a grammar, state-of-the-art solvers will provide their own default grammar, which is dependent on the signature of the target program to be synthesized. In this work, we speculate this default grammar could be improved upon substantially. We build sets of rules, or metagrammars, for constructing grammars, and perform a gradient descent over these metagrammars aiming to find a metagrammar which solves more benchmarks and on average faster. We show the resulting metagrammar enables CVC4 to solve 26% more benchmarks than the default grammar within a 300s time-out, and that metagrammars learnt from tens of benchmarks generalize to performance on 100s of benchmarks.

Submitted 16 July, 2020; v1 submitted 13 July, 2020; originally announced July 2020.

Comments: 5 pages, SYNT 2020

arXiv:2007.03204 [pdf, other]

doi 10.1609/aaai.v35i14.17474

Learning Branching Heuristics for Propositional Model Counting

Authors: Pashootan Vaezipoor, Gil Lederman, Yuhuai Wu, Chris J. Maddison, Roger Grosse, Sanjit A. Seshia, Fahiem Bacchus

Abstract: Propositional model counting, or #SAT, is the problem of computing the number of satisfying assignments of a Boolean formula. Many problems from different application areas, including many discrete probabilistic inference problems, can be translated into model counting problems to be solved by #SAT solvers. Exact #SAT solvers, however, are often not scalable to industrial size instances. In this paper, we present Neuro#, an approach for learning branching heuristics to improve the performance of exact #SAT solvers on instances from a given family of problems. We experimentally show that our method reduces the step count on similarly distributed held-out instances and generalizes to much larger instances from the same problem family. It is able to achieve these results on a number of different problem families having very different structures. In addition to step count improvements, Neuro# can also achieve orders of magnitude wall-clock speedups over the vanilla solver on larger instances in some problem families, despite the runtime overhead of querying the model.

Submitted 8 September, 2022; v1 submitted 7 July, 2020; originally announced July 2020.

Journal ref: 35(14), 2021, 12427-12435

arXiv:2007.00085 [pdf, other]

Enforcing Almost-Sure Reachability in POMDPs

Authors: Sebastian Junges, Nils Jansen, Sanjit A. Seshia

Abstract: Partially-Observable Markov Decision Processes (POMDPs) are a well-known stochastic model for sequential decision making under limited information. We consider the EXPTIME-hard problem of synthesising policies that almost-surely reach some goal state without ever visiting a bad state. In particular, we are interested in computing the winning region, that is, the set of system configurations from which a policy exists that satisfies the reachability specification. A direct application of such a winning region is the safe exploration of POMDPs by, for instance, restricting the behavior of a reinforcement learning agent to the region. We present two algorithms: a novel SAT-based iterative approach and a decision-diagram-based alternative. The empirical evaluation demonstrates the feasibility and efficacy of the approaches.

Submitted 18 March, 2021; v1 submitted 30 June, 2020; originally announced July 2020.

arXiv:2005.07173 [pdf, other]

Formal Analysis and Redesign of a Neural Network-Based Aircraft Taxiing System with VerifAI

Authors: Daniel J. Fremont, Johnathan Chiu, Dragos D. Margineantu, Denis Osipychev, Sanjit A. Seshia

Abstract: We demonstrate a unified approach to rigorous design of safety-critical autonomous systems using the VerifAI toolkit for formal analysis of AI-based systems. VerifAI provides an integrated toolchain for tasks spanning the design process, including modeling, falsification, debugging, and ML component retraining. We evaluate all of these applications in an industrial case study on an experimental autonomous aircraft taxiing system developed by Boeing, which uses a neural network to track the centerline of a runway. We define runway scenarios using the Scenic probabilistic programming language, and use them to drive tests in the X-Plane flight simulator. We first perform falsification, automatically finding environment conditions causing the system to violate its specification by deviating significantly from the centerline (or even leaving the runway entirely). Next, we use counterexample analysis to identify distinct failure cases, and confirm their root causes with specialized testing. Finally, we use the results of falsification and debugging to retrain the network, eliminating several failure cases and improving the overall performance of the closed-loop system.

Submitted 14 May, 2020; originally announced May 2020.

Comments: Full version of a CAV 2020 paper

arXiv:2003.07739 [pdf, other]

Formal Scenario-Based Testing of Autonomous Vehicles: From Simulation to the Real World

Authors: Daniel J. Fremont, Edward Kim, Yash Vardhan Pant, Sanjit A. Seshia, Atul Acharya, Xantha Bruso, Paul Wells, Steve Lemke, Qiang Lu, Shalin Mehta

Abstract: We present a new approach to automated scenario-based testing of the safety of autonomous vehicles, especially those using advanced artificial intelligence-based components, spanning both simulation-based evaluation as well as testing in the real world. Our approach is based on formal methods, combining formal specification of scenarios and safety properties, algorithmic test case generation using formal simulation, test case selection for track testing, executing test cases on the track, and analyzing the resulting data. Experiments with a real autonomous vehicle at an industrial testing facility support our hypotheses that (i) formal simulation can be effective at identifying test cases to run on the track, and (ii) the gap between simulated and real worlds can be systematically evaluated and bridged.

Submitted 12 July, 2020; v1 submitted 17 March, 2020; originally announced March 2020.

Comments: 9 pages, 6 figures. Full version of an ITSC 2020 paper

ACM Class: I.2.9; D.2.4; D.2.5

arXiv:1912.00289 [pdf, other]

A Programmatic and Semantic Approach to Explaining and Debugging Neural Network Based Object Detectors

Authors: Edward Kim, Divya Gopinath, Corina Pasareanu, Sanjit Seshia

Abstract: Even as deep neural networks have become very effective for tasks in vision and perception, it remains difficult to explain and debug their behavior. In this paper, we present a programmatic and semantic approach to explaining, understanding, and debugging the correct and incorrect behaviors of a neural network-based perception system. Our approach is semantic in that it employs a high-level representation of the distribution of environment scenarios that the detector is intended to work on. It is programmatic in that the scenario representation is a program in a domain-specific probabilistic programming language which can be used to generate synthetic data to test a given perception module. Our framework assesses the performance of a perception module to identify correct and incorrect detections, extracts rules from those results that semantically characterize the correct and incorrect scenarios, and then specializes the probabilistic program with those rules in order to more precisely characterize the scenarios in which the perception module operates correctly or not. We demonstrate our results using the SCENIC probabilistic programming language and a neural network-based object detector. Our experiments show that it is possible to automatically generate compact rules that significantly increase the correct detection rate (or conversely the incorrect detection rate) of the network and can thus help with understanding and debugging its behavior.

Submitted 16 June, 2020; v1 submitted 30 November, 2019; originally announced December 2019.

Journal ref: CVPR (2020)

Showing 1–50 of 104 results for author: Seshia, S