Search | arXiv e-print repository

Logicbreaks: A Framework for Understanding Subversion of Rule-based Inference

Authors: Anton Xue, Avishree Khare, Rajeev Alur, Surbhi Goel, Eric Wong

Abstract: We study how to subvert language models from following the rules. We model rule-following as inference in propositional Horn logic, a mathematical system in which rules have the form "if $P$ and $Q$, then $R$" for some propositions $P$, $Q$, and $R$. We prove that although transformers can faithfully abide by such rules, maliciously crafted prompts can nevertheless mislead even theoretically const… ▽ More We study how to subvert language models from following the rules. We model rule-following as inference in propositional Horn logic, a mathematical system in which rules have the form "if $P$ and $Q$, then $R$" for some propositions $P$, $Q$, and $R$. We prove that although transformers can faithfully abide by such rules, maliciously crafted prompts can nevertheless mislead even theoretically constructed models. Empirically, we find that attacks on our theoretical models mirror popular attacks on large language models. Our work suggests that studying smaller theoretical models can help understand the behavior of large language models in rule-based settings like logical reasoning and jailbreak attacks. △ Less

Submitted 21 June, 2024; originally announced July 2024.

arXiv:2406.06246 [pdf, other]

Data-Efficient Learning with Neural Programs

Authors: Alaia Solko-Breslin, Seewon Choi, Ziyang Li, Neelay Velingker, Rajeev Alur, Mayur Naik, Eric Wong

Abstract: Many computational tasks can be naturally expressed as a composition of a DNN followed by a program written in a traditional programming language or an API call to an LLM. We call such composites "neural programs" and focus on the problem of learning the DNN parameters when the training data consist of end-to-end input-output labels for the composite. When the program is written in a differentiabl… ▽ More Many computational tasks can be naturally expressed as a composition of a DNN followed by a program written in a traditional programming language or an API call to an LLM. We call such composites "neural programs" and focus on the problem of learning the DNN parameters when the training data consist of end-to-end input-output labels for the composite. When the program is written in a differentiable logic programming language, techniques from neurosymbolic learning are applicable, but in general, the learning for neural programs requires estimating the gradients of black-box components. We present an algorithm for learning neural programs, called ISED, that only relies on input-output samples of black-box components. For evaluation, we introduce new benchmarks that involve calls to modern LLMs such as GPT-4 and also consider benchmarks from the neurosymolic learning literature. Our evaluation shows that for the latter benchmarks, ISED has comparable performance to state-of-the-art neurosymbolic frameworks. For the former, we use adaptations of prior work on gradient approximations of black-box components as a baseline, and show that ISED achieves comparable accuracy but in a more data- and sample-efficient manner. △ Less

Submitted 10 June, 2024; originally announced June 2024.

arXiv:2402.00793 [pdf, other]

Human Expertise in Algorithmic Prediction

Authors: Rohan Alur, Manish Raghavan, Devavrat Shah

Abstract: We introduce a novel framework for incorporating human expertise into algorithmic predictions. Our approach focuses on the use of human judgment to distinguish inputs which `look the same' to any feasible predictive algorithm. We argue that this framing clarifies the problem of human/AI collaboration in prediction tasks, as experts often have access to information -- particularly subjective inform… ▽ More We introduce a novel framework for incorporating human expertise into algorithmic predictions. Our approach focuses on the use of human judgment to distinguish inputs which `look the same' to any feasible predictive algorithm. We argue that this framing clarifies the problem of human/AI collaboration in prediction tasks, as experts often have access to information -- particularly subjective information -- which is not encoded in the algorithm's training data. We use this insight to develop a set of principled algorithms for selectively incorporating human feedback only when it improves the performance of any feasible predictor. We find empirically that although algorithms often outperform their human counterparts on average, human judgment can significantly improve algorithmic predictions on specific instances (which can be identified ex-ante). In an X-ray classification task, we find that this subset constitutes nearly 30% of the patient population. Our approach provides a natural way of uncovering this heterogeneity and thus enabling effective human-AI collaboration. △ Less

Submitted 22 May, 2024; v1 submitted 1 February, 2024; originally announced February 2024.

Comments: 35 pages, 13 figures

arXiv:2311.16169 [pdf, other]

Understanding the Effectiveness of Large Language Models in Detecting Security Vulnerabilities

Authors: Avishree Khare, Saikat Dutta, Ziyang Li, Alaia Solko-Breslin, Rajeev Alur, Mayur Naik

Abstract: Security vulnerabilities in modern software are prevalent and harmful. While automated vulnerability detection tools have made promising progress, their scalability and applicability remain challenging. Recently, Large Language Models (LLMs), such as GPT-4 and CodeLlama, have demonstrated remarkable performance on code-related tasks. However, it is unknown whether such LLMs can do complex reasonin… ▽ More Security vulnerabilities in modern software are prevalent and harmful. While automated vulnerability detection tools have made promising progress, their scalability and applicability remain challenging. Recently, Large Language Models (LLMs), such as GPT-4 and CodeLlama, have demonstrated remarkable performance on code-related tasks. However, it is unknown whether such LLMs can do complex reasoning over code. In this work, we explore whether pre-trained LLMs can detect security vulnerabilities and address the limitations of existing tools. We evaluate the effectiveness of pre-trained LLMs, in terms of performance, explainability, and robustness, on a set of five diverse security benchmarks spanning two languages, Java and C/C++, and covering both synthetic and real-world projects. Overall, all LLMs show modest effectiveness in end-to-end reasoning about vulnerabilities, obtaining an average of 60% accuracy across all datasets. However, we observe that LLMs show promising abilities at performing parts of the analysis correctly, such as identifying vulnerability-related specifications (e.g., sources and sinks) and leveraging natural language information to understand code behavior (e.g., to check if code is sanitized). Further, LLMs are relatively much better at detecting simpler vulnerabilities that typically only need local reasoning (e.g., Integer Overflows and NULL pointer dereference). We find that advanced prompting strategies that involve step-by-step analysis significantly improve performance of LLMs on real-world datasets (improving F1 score by up to 0.25 on average). Finally, we share our insights and recommendations for future work on leveraging LLMs for vulnerability detection. △ Less

Submitted 9 June, 2024; v1 submitted 16 November, 2023; originally announced November 2023.

arXiv:2307.05902 [pdf, other]

Stability Guarantees for Feature Attributions with Multiplicative Smoothing

Authors: Anton Xue, Rajeev Alur, Eric Wong

Abstract: Explanation methods for machine learning models tend not to provide any formal guarantees and may not reflect the underlying decision-making process. In this work, we analyze stability as a property for reliable feature attribution methods. We prove that relaxed variants of stability are guaranteed if the model is sufficiently Lipschitz with respect to the masking of features. We develop a smoothi… ▽ More Explanation methods for machine learning models tend not to provide any formal guarantees and may not reflect the underlying decision-making process. In this work, we analyze stability as a property for reliable feature attribution methods. We prove that relaxed variants of stability are guaranteed if the model is sufficiently Lipschitz with respect to the masking of features. We develop a smoothing method called Multiplicative Smoothing (MuS) to achieve such a model. We show that MuS overcomes the theoretical limitations of standard smoothing techniques and can be integrated with any classifier and feature attribution method. We evaluate MuS on vision and language models with various feature attribution methods, such as LIME and SHAP, and demonstrate that MuS endows feature attributions with non-trivial stability guarantees. △ Less

Submitted 26 October, 2023; v1 submitted 12 July, 2023; originally announced July 2023.

arXiv:2306.01646 [pdf, other]

Auditing for Human Expertise

Authors: Rohan Alur, Loren Laine, Darrick K. Li, Manish Raghavan, Devavrat Shah, Dennis Shung

Abstract: High-stakes prediction tasks (e.g., patient diagnosis) are often handled by trained human experts. A common source of concern about automation in these settings is that experts may exercise intuition that is difficult to model and/or have access to information (e.g., conversations with a patient) that is simply unavailable to a would-be algorithm. This raises a natural question whether human exper… ▽ More High-stakes prediction tasks (e.g., patient diagnosis) are often handled by trained human experts. A common source of concern about automation in these settings is that experts may exercise intuition that is difficult to model and/or have access to information (e.g., conversations with a patient) that is simply unavailable to a would-be algorithm. This raises a natural question whether human experts add value which could not be captured by an algorithmic predictor. We develop a statistical framework under which we can pose this question as a natural hypothesis test. Indeed, as our framework highlights, detecting human expertise is more subtle than simply comparing the accuracy of expert predictions to those made by a particular learning algorithm. Instead, we propose a simple procedure which tests whether expert predictions are statistically independent from the outcomes of interest after conditioning on the available inputs (`features'). A rejection of our test thus suggests that human experts may add value to any algorithm trained on the available data, and has direct implications for whether human-AI `complementarity' is achievable in a given prediction task. We highlight the utility of our procedure using admissions data collected from the emergency department of a large academic hospital system, where we show that physicians' admit/discharge decisions for patients with acute gastrointestinal bleeding (AGIB) appear to be incorporating information that is not available to a standard algorithmic screening tool. This is despite the fact that the screening tool is arguably more accurate than physicians' discretionary decisions, highlighting that -- even absent normative concerns about accountability or interpretability -- accuracy is insufficient to justify algorithmic automation. △ Less

Submitted 27 October, 2023; v1 submitted 2 June, 2023; originally announced June 2023.

Comments: 30 pages, 10 figures. To appear in the proceedings of the 37th Conference on Neural Information Processing Systems (NeurIPS 2023)

arXiv:2305.17115 [pdf, other]

Policy Synthesis and Reinforcement Learning for Discounted LTL

Authors: Rajeev Alur, Osbert Bastani, Kishor Jothimurugan, Mateo Perez, Fabio Somenzi, Ashutosh Trivedi

Abstract: The difficulty of manually specifying reward functions has led to an interest in using linear temporal logic (LTL) to express objectives for reinforcement learning (RL). However, LTL has the downside that it is sensitive to small perturbations in the transition probabilities, which prevents probably approximately correct (PAC) learning without additional assumptions. Time discounting provides a wa… ▽ More The difficulty of manually specifying reward functions has led to an interest in using linear temporal logic (LTL) to express objectives for reinforcement learning (RL). However, LTL has the downside that it is sensitive to small perturbations in the transition probabilities, which prevents probably approximately correct (PAC) learning without additional assumptions. Time discounting provides a way of removing this sensitivity, while retaining the high expressivity of the logic. We study the use of discounted LTL for policy synthesis in Markov decision processes with unknown transition probabilities, and show how to reduce discounted LTL to discounted-sum reward via a reward machine when all discount factors are identical. △ Less

Submitted 29 May, 2023; v1 submitted 26 May, 2023; originally announced May 2023.

arXiv:2302.02984 [pdf, other]

Robust Subtask Learning for Compositional Generalization

Authors: Kishor Jothimurugan, Steve Hsu, Osbert Bastani, Rajeev Alur

Abstract: Compositional reinforcement learning is a promising approach for training policies to perform complex long-horizon tasks. Typically, a high-level task is decomposed into a sequence of subtasks and a separate policy is trained to perform each subtask. In this paper, we focus on the problem of training subtask policies in a way that they can be used to perform any task; here, a task is given by a se… ▽ More Compositional reinforcement learning is a promising approach for training policies to perform complex long-horizon tasks. Typically, a high-level task is decomposed into a sequence of subtasks and a separate policy is trained to perform each subtask. In this paper, we focus on the problem of training subtask policies in a way that they can be used to perform any task; here, a task is given by a sequence of subtasks. We aim to maximize the worst-case performance over all tasks as opposed to the average-case performance. We formulate the problem as a two agent zero-sum game in which the adversary picks the sequence of subtasks. We propose two RL algorithms to solve this game: one is an adaptation of existing multi-agent RL algorithms to our setting and the other is an asynchronous version which enables parallel training of subtask policies. We evaluate our approach on two multi-task environments with continuous states and actions and demonstrate that our algorithms outperform state-of-the-art baselines. △ Less

Submitted 8 June, 2023; v1 submitted 6 February, 2023; originally announced February 2023.

Journal ref: Proceedings of the 40th International Conference on Machine Learning, Honolulu, Hawaii, USA. PMLR 202, 2023

arXiv:2209.05448 [pdf, other]

Composing Copyless Streaming String Transducers

Authors: Rajeev Alur, Taylor Dohmen, Ashutosh Trivedi

Abstract: Streaming string transducers (SSTs) implement string-to-string transformations by reading each input word in a single left-to-right pass while maintaining fragments of potential outputs in a finite set of string variables. These variables get updated on transitions of the transducer, where they can be assigned new values described by concatenations of variables and output symbols. An SST is called… ▽ More Streaming string transducers (SSTs) implement string-to-string transformations by reading each input word in a single left-to-right pass while maintaining fragments of potential outputs in a finite set of string variables. These variables get updated on transitions of the transducer, where they can be assigned new values described by concatenations of variables and output symbols. An SST is called copyless if every update is such that no variable occurs more than once amongst all of the assigned expressions. The transformations realized by copyless SSTs coincide with Courcelle's monadic second-order logic graph transducers (MSOTs) when restricted to string graphs. Copyless SSTs with nondeterminism are known to be equivalent to nondeterministic MSOTs as well. MSOTs, both deterministic and nondeterministic, are closed under composition. Given the equivalence of MSOTs and copyless SSTs, it is easy to see that copyless SSTs are also closed under composition. The original proof of this fact, however, was based on a direct construction to produce a composite copyless SST from two given copyless SSTs. A counterexample discovered by Joost Englefriet showed that this construction may produce copyful transducers. We revisit the original composition constructions for both deterministic and nondeterministic SSTs and show that, although they can introduce copyful updates, the resulting copyful behavior they exhibit is superficial. To characterize this mild copyful behavior, we define a subclass of copyful SSTs, called diamond-free SSTs, in which two copies of a common variable are never combined in any subsequent assignment. In order to recover a modified version of the original construction, we provide a method for producing an equivalent copyless SST from any diamond-free copyful SST. △ Less

Submitted 7 February, 2024; v1 submitted 12 September, 2022; originally announced September 2022.

arXiv:2206.03482 [pdf, other]

Chordal Sparsity for SDP-based Neural Network Verification

Authors: Anton Xue, Lars Lindemann, Rajeev Alur

Abstract: Neural networks are central to many emerging technologies, but verifying their correctness remains a major challenge. It is known that network outputs can be sensitive and fragile to even small input perturbations, thereby increasing the risk of unpredictable and undesirable behavior. Fast and accurate verification of neural networks is therefore critical to their widespread adoption, and in recen… ▽ More Neural networks are central to many emerging technologies, but verifying their correctness remains a major challenge. It is known that network outputs can be sensitive and fragile to even small input perturbations, thereby increasing the risk of unpredictable and undesirable behavior. Fast and accurate verification of neural networks is therefore critical to their widespread adoption, and in recent years, various methods have been developed as a response to this problem. In this paper, we focus on improving semidefinite programming (SDP) based techniques for neural network verification. Such techniques offer the power of expressing complex geometric constraints while retaining a convex problem formulation, but scalability remains a major issue in practice. Our starting point is the DeepSDP framework proposed by Fazlyab et al., which uses quadratic constraints to abstract the verification problem into a large-scale SDP. However, solving this SDP quickly becomes intractable when the network grows. Our key observation is that by leveraging chordal sparsity, we can decompose the primary computational bottleneck of DeepSDP -- a large linear matrix inequality (LMI) -- into an equivalent collection of smaller LMIs. We call our chordally sparse optimization program Chordal-DeepSDP and prove that its construction is identically expressive as that of DeepSDP. Moreover, we show that additional analysis of Chordal-DeepSDP allows us to further rewrite its collection of LMIs in a second level of decomposition that we call Chordal-DeepSDP-2 -- which results in another significant computational gain. Finally, we provide numerical experiments on real networks of learned cart-pole dynamics, showcasing the computational advantage of Chordal-DeepSDP and Chordal-DeepSDP-2 over DeepSDP. △ Less

Submitted 8 January, 2024; v1 submitted 7 June, 2022; originally announced June 2022.

arXiv:2206.03348 [pdf, other]

Specification-Guided Learning of Nash Equilibria with High Social Welfare

Authors: Kishor Jothimurugan, Suguman Bansal, Osbert Bastani, Rajeev Alur

Abstract: Reinforcement learning has been shown to be an effective strategy for automatically training policies for challenging control problems. Focusing on non-cooperative multi-agent systems, we propose a novel reinforcement learning framework for training joint policies that form a Nash equilibrium. In our approach, rather than providing low-level reward functions, the user provides high-level specifica… ▽ More Reinforcement learning has been shown to be an effective strategy for automatically training policies for challenging control problems. Focusing on non-cooperative multi-agent systems, we propose a novel reinforcement learning framework for training joint policies that form a Nash equilibrium. In our approach, rather than providing low-level reward functions, the user provides high-level specifications that encode the objective of each agent. Then, guided by the structure of the specifications, our algorithm searches over policies to identify one that provably forms an $ε$-Nash equilibrium (with high probability). Importantly, it prioritizes policies in a way that maximizes social welfare across all agents. Our empirical evaluation demonstrates that our algorithm computes equilibrium policies with high social welfare, whereas state-of-the-art baselines either fail to compute Nash equilibria or compute ones with comparatively lower social welfare. △ Less

Submitted 6 June, 2022; originally announced June 2022.

arXiv:2204.00846 [pdf, other]

Chordal Sparsity for Lipschitz Constant Estimation of Deep Neural Networks

Authors: Anton Xue, Lars Lindemann, Alexander Robey, Hamed Hassani, George J. Pappas, Rajeev Alur

Abstract: Lipschitz constants of neural networks allow for guarantees of robustness in image classification, safety in controller design, and generalizability beyond the training data. As calculating Lipschitz constants is NP-hard, techniques for estimating Lipschitz constants must navigate the trade-off between scalability and accuracy. In this work, we significantly push the scalability frontier of a semi… ▽ More Lipschitz constants of neural networks allow for guarantees of robustness in image classification, safety in controller design, and generalizability beyond the training data. As calculating Lipschitz constants is NP-hard, techniques for estimating Lipschitz constants must navigate the trade-off between scalability and accuracy. In this work, we significantly push the scalability frontier of a semidefinite programming technique known as LipSDP while achieving zero accuracy loss. We first show that LipSDP has chordal sparsity, which allows us to derive a chordally sparse formulation that we call Chordal-LipSDP. The key benefit is that the main computational bottleneck of LipSDP, a large semidefinite constraint, is now decomposed into an equivalent collection of smaller ones: allowing Chordal-LipSDP to outperform LipSDP particularly as the network depth grows. Moreover, our formulation uses a tunable sparsity parameter that enables one to gain tighter estimates without incurring a significant computational cost. We illustrate the scalability of our approach through extensive numerical experiments. △ Less

Submitted 8 January, 2024; v1 submitted 2 April, 2022; originally announced April 2022.

arXiv:2111.00272 [pdf, ps, other]

A Framework for Transforming Specifications in Reinforcement Learning

Authors: Rajeev Alur, Suguman Bansal, Osbert Bastani, Kishor Jothimurugan

Abstract: Reactive synthesis algorithms allow automatic construction of policies to control an environment modeled as a Markov Decision Process (MDP) that are optimal with respect to high-level temporal logic specifications. However, they assume that the MDP model is known a priori. Reinforcement Learning (RL) algorithms, in contrast, are designed to learn an optimal policy when the transition probabilities… ▽ More Reactive synthesis algorithms allow automatic construction of policies to control an environment modeled as a Markov Decision Process (MDP) that are optimal with respect to high-level temporal logic specifications. However, they assume that the MDP model is known a priori. Reinforcement Learning (RL) algorithms, in contrast, are designed to learn an optimal policy when the transition probabilities of the MDP are unknown, but require the user to associate local rewards with transitions. The appeal of high-level temporal logic specifications has motivated research to develop RL algorithms for synthesis of policies from specifications. To understand the techniques, and nuanced variations in their theoretical guarantees, in the growing body of resulting literature, we develop a formal framework for defining transformations among RL tasks with different forms of objectives. We define the notion of a sampling-based reduction to transform a given MDP into another one which can be simulated even when the transition probabilities of the original MDP are unknown. We formalize the notions of preservation of optimal policies, convergence, and robustness of such reductions. We then use our framework to restate known results, establish new results to fill in some gaps, and identify open problems. In particular, we show that certain kinds of reductions from LTL specifications to reward-based ones do not exist, and prove the non-existence of RL algorithms with PAC-MDP guarantees for safety specifications. △ Less

Submitted 29 May, 2022; v1 submitted 30 October, 2021; originally announced November 2021.

arXiv:2110.06303 [pdf, other]

NetRep: Automatic Repair for Network Programs

Authors: Lei Shi, Yuepeng Wang, Rajeev Alur, Boon Thau Loo

Abstract: Debugging imperative network programs is a challenging task for developers because understanding various network modules and complicated data structures is typically time-consuming. To address the challenge, this paper presents an automated technique for repairing network programs from unit tests. Specifically, given as input a faulty network program and a set of unit tests, our approach localizes… ▽ More Debugging imperative network programs is a challenging task for developers because understanding various network modules and complicated data structures is typically time-consuming. To address the challenge, this paper presents an automated technique for repairing network programs from unit tests. Specifically, given as input a faulty network program and a set of unit tests, our approach localizes the fault through symbolic reasoning, and synthesizes a patch such that the repaired program can pass all unit tests. It applies domain-specific abstraction to simplify network data structures and utilizes modular analysis to facilitate function summary reuse for symbolic analysis. We implement the proposed techniques in a tool called NetRep and evaluate it on 10 benchmarks adapted from real-world software-defined networking controllers. The evaluation results demonstrate the effectiveness and efficiency of NetRep for repairing network programs. △ Less

Submitted 20 October, 2021; v1 submitted 12 October, 2021; originally announced October 2021.

Comments: 24 pages, 9 figures

arXiv:2106.13906 [pdf, other]

Compositional Reinforcement Learning from Logical Specifications

Authors: Kishor Jothimurugan, Suguman Bansal, Osbert Bastani, Rajeev Alur

Abstract: We study the problem of learning control policies for complex tasks given by logical specifications. Recent approaches automatically generate a reward function from a given specification and use a suitable reinforcement learning algorithm to learn a policy that maximizes the expected reward. These approaches, however, scale poorly to complex tasks that require high-level planning. In this work, we… ▽ More We study the problem of learning control policies for complex tasks given by logical specifications. Recent approaches automatically generate a reward function from a given specification and use a suitable reinforcement learning algorithm to learn a policy that maximizes the expected reward. These approaches, however, scale poorly to complex tasks that require high-level planning. In this work, we develop a compositional learning approach, called DiRL, that interleaves high-level planning and reinforcement learning. First, DiRL encodes the specification as an abstract graph; intuitively, vertices and edges of the graph correspond to regions of the state space and simpler sub-tasks, respectively. Our approach then incorporates reinforcement learning to learn neural network policies for each edge (sub-task) within a Dijkstra-style planning algorithm to compute a high-level plan in the graph. An evaluation of the proposed approach on a set of challenging control benchmarks with continuous state and action spaces demonstrates that it outperforms state-of-the-art baselines. △ Less

Submitted 27 December, 2021; v1 submitted 25 June, 2021; originally announced June 2021.

Journal ref: 35th Conference on Neural Information Processing Systems (NeurIPS 2021)

arXiv:2104.04512 [pdf, other]

Stream Processing With Dependency-Guided Synchronization (Extended Version)

Authors: Konstantinos Kallas, Filip Niksic, Caleb Stanford, Rajeev Alur

Abstract: Real-time data processing applications with low latency requirements have led to the increasing popularity of stream processing systems. While such systems offer convenient APIs that can be used to achieve data parallelism automatically, they offer limited support for computations that require synchronization between parallel nodes. In this paper, we propose *dependency-guided synchronization (DGS… ▽ More Real-time data processing applications with low latency requirements have led to the increasing popularity of stream processing systems. While such systems offer convenient APIs that can be used to achieve data parallelism automatically, they offer limited support for computations that require synchronization between parallel nodes. In this paper, we propose *dependency-guided synchronization (DGS)*, an alternative programming model for stateful streaming computations with complex synchronization requirements. In the proposed model, the input is viewed as partially ordered, and the program consists of a set of parallelization constructs which are applied to decompose the partial order and process events independently. Our programming model maps to an execution model called *synchronization plans* which supports synchronization between parallel nodes. Our evaluation shows that APIs offered by two widely used systems -- Flink and Timely Dataflow -- cannot suitably expose parallelism in some representative applications. In contrast, DGS enables implementations with scalable performance, the resulting synchronization plans offer throughput improvements when implemented manually in existing systems, and the programming overhead is small compared to writing sequential code. △ Less

Submitted 3 January, 2022; v1 submitted 9 April, 2021; originally announced April 2021.

Comments: 41 pages. Non-extended version to appear at Principles and Practice of Parallel Programming (PPoPP), February 2022

arXiv:2010.15638 [pdf, other]

Abstract Value Iteration for Hierarchical Reinforcement Learning

Authors: Kishor Jothimurugan, Osbert Bastani, Rajeev Alur

Abstract: We propose a novel hierarchical reinforcement learning framework for control with continuous state and action spaces. In our framework, the user specifies subgoal regions which are subsets of states; then, we (i) learn options that serve as transitions between these subgoal regions, and (ii) construct a high-level plan in the resulting abstract decision process (ADP). A key challenge is that the A… ▽ More We propose a novel hierarchical reinforcement learning framework for control with continuous state and action spaces. In our framework, the user specifies subgoal regions which are subsets of states; then, we (i) learn options that serve as transitions between these subgoal regions, and (ii) construct a high-level plan in the resulting abstract decision process (ADP). A key challenge is that the ADP may not be Markov, which we address by proposing two algorithms for planning in the ADP. Our first algorithm is conservative, allowing us to prove theoretical guarantees on its performance, which help inform the design of subgoal regions. Our second algorithm is a practical one that interweaves planning at the abstract level and learning at the concrete level. In our experiments, we demonstrate that our approach outperforms state-of-the-art hierarchical reinforcement learning algorithms on several challenging benchmarks. △ Less

Submitted 25 February, 2021; v1 submitted 29 October, 2020; originally announced October 2020.

Journal ref: PMLR: Volume 130 (AISTATS 2021)

arXiv:2010.06135 [pdf, other]

Session-layer Attack Traffic Classification by Program Synthesis

Authors: Lei Shi, Yahui Li, Rajeev Alur, Boon Thau Loo

Abstract: Writing classification rules to identify malicious network traffic is a time-consuming and error-prone task. Learning-based classification systems automatically extract such rules from positive and negative traffic examples. However, due to limitations in the representation of network traffic and the learning strategy, these systems lack both expressiveness to cover a range of attacks and interpre… ▽ More Writing classification rules to identify malicious network traffic is a time-consuming and error-prone task. Learning-based classification systems automatically extract such rules from positive and negative traffic examples. However, due to limitations in the representation of network traffic and the learning strategy, these systems lack both expressiveness to cover a range of attacks and interpretability in fully describing the attack traffic's structure at the session layer. This paper presents Sharingan system, which uses program synthesis techniques to generate network classification programs at the session layer. Sharingan accepts raw network traces as inputs, and reports potential patterns of the attack traffic in NetQRE, a domain specific language designed for specifying session-layer quantitative properties. Using Sharingan, network operators can better analyze the attack pattern due to the following advantages of Sharingan's learning process: (1) it requires minimal feature engineering, (2) it is amenable to efficient implementation of the learnt classifier, and (3) the synthesized program is easy to decipher and edit. We develop a range of novel optimizations that reduce the synthesis time for large and complex tasks to a matter of minutes. Our experiments show that Sharingan is able to correctly identify attacks from a diverse set of network attack traces and generates explainable outputs, while achieving accuracy comparable to state-of-the-art learning-based intrusion detection systems. △ Less

Submitted 12 October, 2020; originally announced October 2020.

Comments: 13 pages

arXiv:2008.09293 [pdf, ps, other]

A Composable Specification Language for Reinforcement Learning Tasks

Authors: Kishor Jothimurugan, Rajeev Alur, Osbert Bastani

Abstract: Reinforcement learning is a promising approach for learning control policies for robot tasks. However, specifying complex tasks (e.g., with multiple objectives and safety constraints) can be challenging, since the user must design a reward function that encodes the entire task. Furthermore, the user often needs to manually shape the reward to ensure convergence of the learning algorithm. We propos… ▽ More Reinforcement learning is a promising approach for learning control policies for robot tasks. However, specifying complex tasks (e.g., with multiple objectives and safety constraints) can be challenging, since the user must design a reward function that encodes the entire task. Furthermore, the user often needs to manually shape the reward to ensure convergence of the learning algorithm. We propose a language for specifying complex control tasks, along with an algorithm that compiles specifications in our language into a reward function and automatically performs reward sha**. We implement our approach in a tool called SPECTRL, and show that it outperforms several state-of-the-art baselines. △ Less

Submitted 29 October, 2020; v1 submitted 20 August, 2020; originally announced August 2020.

Journal ref: In Advances in Neural Information Processing Systems, pp. 13041-13051. 2019

arXiv:2007.03704 [pdf]

Computer-Aided Personalized Education

Authors: Rajeev Alur, Richard Baraniuk, Rastislav Bodik, Ann Drobnis, Sumit Gulwani, Bjoern Hartmann, Yasmin Kafai, Jeff Karpicke, Ran Libeskind-Hadas, Debra Richardson, Armando Solar-Lezama, Candace Thille, Moshe Vardi

Abstract: The shortage of people trained in STEM fields is becoming acute, and universities and colleges are straining to satisfy this demand. In the case of computer science, for instance, the number of US students taking introductory courses has grown three-fold in the past decade. Recently, massive open online courses (MOOCs) have been promoted as a way to ease this strain. This at best provides access t… ▽ More The shortage of people trained in STEM fields is becoming acute, and universities and colleges are straining to satisfy this demand. In the case of computer science, for instance, the number of US students taking introductory courses has grown three-fold in the past decade. Recently, massive open online courses (MOOCs) have been promoted as a way to ease this strain. This at best provides access to education. The bigger challenge though is co** with heterogeneous backgrounds of different students, retention, providing feedback, and assessment. Personalized education relying on computational tools can address this challenge. While automated tutoring has been studied at different times in different communities, recent advances in computing and education technology offer exciting opportunities to transform the manner in which students learn. In particular, at least three trends are significant. First, progress in logical reasoning, data analytics, and natural language processing has led to tutoring tools for automatic assessment, personalized instruction including targeted feedback, and adaptive content generation for a variety of subjects. Second, research in the science of learning and human-computer interaction is leading to a better understanding of how different students learn, when and what types of interventions are effective for different instructional goals, and how to measure the success of educational tools. Finally, the recent emergence of online education platforms, both in academia and industry, is leading to new opportunities for the development of a shared infrastructure. This CCC workshop brought together researchers develo** educational tools based on technologies such as logical reasoning and machine learning with researchers in education, human-computer interaction, and cognitive psychology. △ Less

Submitted 7 July, 2020; originally announced July 2020.

Comments: A Computing Community Consortium (CCC) workshop report, 12 pages

Report number: ccc2016report_6

arXiv:1910.11309 [pdf, other]

Case Study: Verifying the Safety of an Autonomous Racing Car with a Neural Network Controller

Authors: Radoslav Ivanov, Taylor J. Carpenter, James Weimer, Rajeev Alur, George J. Pappas, Insup Lee

Abstract: This paper describes a verification case study on an autonomous racing car with a neural network (NN) controller. Although several verification approaches have been proposed over the last year, they have only been evaluated on low-dimensional systems or systems with constrained environments. To explore the limits of existing approaches, we present a challenging benchmark in which the NN takes raw… ▽ More This paper describes a verification case study on an autonomous racing car with a neural network (NN) controller. Although several verification approaches have been proposed over the last year, they have only been evaluated on low-dimensional systems or systems with constrained environments. To explore the limits of existing approaches, we present a challenging benchmark in which the NN takes raw LiDAR measurements as input and outputs steering for the car. We train a dozen NNs using two reinforcement learning algorithms and show that the state of the art in verification can handle systems with around 40 LiDAR rays, well short of a typical LiDAR scan with 1081 rays. Furthermore, we perform real experiments to investigate the benefits and limitations of verification with respect to the sim2real gap, i.e., the difference between a system's modeled and real performance. We identify cases, similar to the modeled environment, in which verification is strongly correlated with safe behavior. Finally, we illustrate LiDAR fault patterns that can be used to develop robust and safe reinforcement learning algorithms. △ Less

Submitted 24 October, 2019; originally announced October 2019.

arXiv:1904.07146 [pdf, other]

SyGuS-Comp 2018: Results and Analysis

Authors: Rajeev Alur, Dana Fisman, Saswat Padhi, Rishabh Singh, Abhishek Udupa

Abstract: Syntax-guided synthesis (SyGuS) is the computational problem of finding an implementation $f$ that meets both a semantic constraint given by a logical formula $φ$ in a background theory $\mathbb{T}$, and a syntactic constraint given by a grammar $G$, which specifies the allowed set of candidate implementations. Such a synthesis problem can be formally defined in the SyGuS input format (SyGuS-IF),… ▽ More Syntax-guided synthesis (SyGuS) is the computational problem of finding an implementation $f$ that meets both a semantic constraint given by a logical formula $φ$ in a background theory $\mathbb{T}$, and a syntactic constraint given by a grammar $G$, which specifies the allowed set of candidate implementations. Such a synthesis problem can be formally defined in the SyGuS input format (SyGuS-IF), a language that is built on top of SMT-LIB. The Syntax-Guided Synthesis competition (SyGuS-Comp) is an effort to facilitate, bring together and accelerate research and development of efficient solvers for SyGuS by providing a platform for evaluating different synthesis techniques on a comprehensive set of benchmarks. In the 5th SyGuS-Comp, five solvers competed on over 1600 benchmarks across various tracks. This paper presents and analyses the results of this year's (2018) SyGuS competition. △ Less

Submitted 12 April, 2019; originally announced April 2019.

Comments: 18 pages. Satellite event of CAV'18 and SYNT'18. arXiv admin note: substantial text overlap with arXiv:1711.11438

arXiv:1902.04064 [pdf, other]

REAFFIRM: Model-Based Repair of Hybrid Systems for Improving Resiliency

Authors: Luan Viet Nguyen, Gautam Mohan, James Weimer, Oleg Sokolsky, Insup Lee, Rajeev Alur

Abstract: Model-based design offers a promising approach for assisting developers to build reliable and secure cyber-physical systems (CPSs) in a systematic manner. In this methodology, a designer first constructs a model, with mathematically precise semantics, of the system under design, and performs extensive analysis with respect to correctness requirements before generating the implementation from the m… ▽ More Model-based design offers a promising approach for assisting developers to build reliable and secure cyber-physical systems (CPSs) in a systematic manner. In this methodology, a designer first constructs a model, with mathematically precise semantics, of the system under design, and performs extensive analysis with respect to correctness requirements before generating the implementation from the model. However, as new vulnerabilities are discovered, requirements evolve aimed at ensuring resiliency. There is currently a shortage of an inexpensive, automated mechanism that can effectively repair the initial design, and a model-based system developer regularly needs to redesign and reimplement the system from scratch. In this paper, we propose a new methodology along with a Matlab toolkit called REAFFIRM to facilitate the model-based repair for improving the resiliency of CPSs. REAFFIRM takes the inputs including 1) an original hybrid system modeled as a Simulink/Stateflow diagram, 2) a given resiliency pattern specified as a model transformation script, and 3) a safety requirement expressed as a Signal Temporal Logic formula, and then outputs a repaired model which satisfies the requirement. The overall structure of REAFFIRM contains two main modules, a model transformation, and a model synthesizer built on top of the falsification tool Breach. We introduce a new model transformation language for hybrid systems, which we call HATL to allow a designer to specify resiliency patterns. To evaluate the proposed approach, we use REAFFIRM to automatically synthesize repaired models for an adaptive cruise control (ACC) system under a GPS sensor spoofing attack, for a single-machine infinite-bus (SMIB) system under a sliding-mode switching attack, and for a missile guidance system under gyroscopes sensor attack. △ Less

Submitted 9 February, 2019; originally announced February 2019.

arXiv:1811.01828 [pdf, other]

Verisig: verifying safety properties of hybrid systems with neural network controllers

Authors: Radoslav Ivanov, James Weimer, Rajeev Alur, George J. Pappas, Insup Lee

Abstract: This paper presents Verisig, a hybrid system approach to verifying safety properties of closed-loop systems using neural networks as controllers. Although techniques exist for verifying input/output properties of the neural network itself, these methods cannot be used to verify properties of the closed-loop system (since they work with piecewise-linear constraints that do not capture non-linear pl… ▽ More This paper presents Verisig, a hybrid system approach to verifying safety properties of closed-loop systems using neural networks as controllers. Although techniques exist for verifying input/output properties of the neural network itself, these methods cannot be used to verify properties of the closed-loop system (since they work with piecewise-linear constraints that do not capture non-linear plant dynamics). To overcome this challenge, we focus on sigmoid-based networks and exploit the fact that the sigmoid is the solution to a quadratic differential equation, which allows us to transform the neural network into an equivalent hybrid system. By composing the network's hybrid system with the plant's, we transform the problem into a hybrid system verification problem which can be solved using state-of-the-art reachability tools. We show that reachability is decidable for networks with one hidden layer and decidable for general networks if Schanuel's conjecture is true. We evaluate the applicability and scalability of Verisig in two case studies, one from reinforcement learning and one in which the neural network is used to approximate a model predictive controller. △ Less

Submitted 5 November, 2018; originally announced November 2018.

arXiv:1809.10503 [pdf, other]

Equilibria in Quantitative Concurrent Games

Authors: Shaull Almagor, Rajeev Alur, Suguman Bansal

Abstract: Synthesis of finite-state controllers from high-level specifications in multi-agent systems can be reduced to solving multi-player concurrent games over finite graphs. The complexity of solving such games with qualitative objectives for agents, such as reaching a target set, is well understood resulting in tools with applications in robotics. In this paper, we introduce quantitative concurrent gra… ▽ More Synthesis of finite-state controllers from high-level specifications in multi-agent systems can be reduced to solving multi-player concurrent games over finite graphs. The complexity of solving such games with qualitative objectives for agents, such as reaching a target set, is well understood resulting in tools with applications in robotics. In this paper, we introduce quantitative concurrent graph games, where transitions have separate costs for different agents, and each agent attempts to reach its target set while minimizing its own cost along the path. In this model, a solution to the game corresponds to a set of strategies, one per agent, that forms a Nash equilibrium. We study the problem of computing the set of all Pareto-optimal Nash equilibria, and give a comprehensive analysis of its complexity and related problems such as the price of stability and the price of anarchy. In particular, while checking the existence of a Nash equilibrium is NP-complete in general, with multiple parameters contributing to the computational hardness separately, two-player games with bounded costs on individual transitions admit a polynomial-time solution. △ Less

Submitted 27 September, 2018; originally announced September 2018.

ACM Class: I.2.11

arXiv:1807.03865 [pdf, other]

Streamable Regular Transductions

Authors: Rajeev Alur, Dana Fisman, Konstantinos Mamouras, Mukund Raghothaman, Caleb Stanford

Abstract: Motivated by real-time monitoring and data processing applications, we develop a formal theory of quantitative queries for streaming data that can be evaluated efficiently. We consider the model of unambiguous Cost Register Automata (CRAs), which are machines that combine finite-state control (for identifying regular patterns) with a finite set of data registers (for computing numerical aggregates… ▽ More Motivated by real-time monitoring and data processing applications, we develop a formal theory of quantitative queries for streaming data that can be evaluated efficiently. We consider the model of unambiguous Cost Register Automata (CRAs), which are machines that combine finite-state control (for identifying regular patterns) with a finite set of data registers (for computing numerical aggregates). The definition of CRAs is parameterized by the collection of numerical operations that can be applied to the registers. These machines give rise to the class of streamable regular transductions (SR), and to the class of streamable linear regular transductions (SLR) when the register updates are copyless, i.e. every register appears at most once the right-hand-side expressions of the updates. We give a logical characterization of the class SR (resp., SLR) using MSO-definable transformations from strings to DAGs (resp., trees) without backward edges. Additionally, we establish that the two classes SR and SLR are closed under operations that are relevant for designing query languages. Finally, we study the relationship with weighted automata (WA), and show that CRAs over a suitably chosen set of operations correspond to WA, thus establishing that WA are a special case of CRAs. △ Less

Submitted 3 November, 2019; v1 submitted 10 July, 2018; originally announced July 2018.

Comments: 53 pages

arXiv:1711.11438 [pdf, other]

doi 10.4204/EPTCS.260.9

SyGuS-Comp 2017: Results and Analysis

Authors: Rajeev Alur, Dana Fisman, Rishabh Singh, Armando Solar-Lezama

Abstract: Syntax-Guided Synthesis (SyGuS) is the computational problem of finding an implementation f that meets both a semantic constraint given by a logical formula phi in a background theory T, and a syntactic constraint given by a grammar G, which specifies the allowed set of candidate implementations. Such a synthesis problem can be formally defined in SyGuS-IF, a language that is built on top of SMT-L… ▽ More Syntax-Guided Synthesis (SyGuS) is the computational problem of finding an implementation f that meets both a semantic constraint given by a logical formula phi in a background theory T, and a syntactic constraint given by a grammar G, which specifies the allowed set of candidate implementations. Such a synthesis problem can be formally defined in SyGuS-IF, a language that is built on top of SMT-LIB. The Syntax-Guided Synthesis Competition (SyGuS-Comp) is an effort to facilitate, bring together and accelerate research and development of efficient solvers for SyGuS by providing a platform for evaluating different synthesis techniques on a comprehensive set of benchmarks. In this year's competition six new solvers competed on over 1500 benchmarks. This paper presents and analyses the results of SyGuS-Comp'17. △ Less

Submitted 28 November, 2017; originally announced November 2017.

Comments: In Proceedings SYNT 2017, arXiv:1711.10224. arXiv admin note: text overlap with arXiv:1611.07627, arXiv:1602.01170

Journal ref: EPTCS 260, 2017, pp. 97-115

arXiv:1611.07627 [pdf, other]

doi 10.4204/EPTCS.229.13

SyGuS-Comp 2016: Results and Analysis

Authors: Rajeev Alur, Dana Fisman, Rishabh Singh, Armando Solar-Lezama

Abstract: Syntax-Guided Synthesis (SyGuS) is the computational problem of finding an implementation f that meets both a semantic constraint given by a logical formula $\varphi$ in a background theory T, and a syntactic constraint given by a grammar G, which specifies the allowed set of candidate implementations. Such a synthesis problem can be formally defined in SyGuS-IF, a language that is built on top of… ▽ More Syntax-Guided Synthesis (SyGuS) is the computational problem of finding an implementation f that meets both a semantic constraint given by a logical formula $\varphi$ in a background theory T, and a syntactic constraint given by a grammar G, which specifies the allowed set of candidate implementations. Such a synthesis problem can be formally defined in SyGuS-IF, a language that is built on top of SMT-LIB. The Syntax-Guided Synthesis Competition (SyGuS-Comp) is an effort to facilitate, bring together and accelerate research and development of efficient solvers for SyGuS by providing a platform for evaluating different synthesis techniques on a comprehensive set of benchmarks. In this year's competition we added a new track devoted to programming by examples. This track consisted of two categories, one using the theory of bit-vectors and one using the theory of strings. This paper presents and analyses the results of SyGuS-Comp'16. △ Less

Submitted 22 November, 2016; originally announced November 2016.

Comments: In Proceedings SYNT 2016, arXiv:1611.07178. arXiv admin note: text overlap with arXiv:1602.01170

ACM Class: I.2.2; D.1.2; D.2.2; D.2.4;

Journal ref: EPTCS 229, 2016, pp. 178-202

arXiv:1604.02980 [pdf]

Systems Computing Challenges in the Internet of Things

Authors: Rajeev Alur, Emery Berger, Ann W. Drobnis, Limor Fix, Kevin Fu, Gregory D. Hager, Daniel Lopresti, Klara Nahrstedt, Elizabeth Mynatt, Shwetak Patel, Jennifer Rexford, John A. Stankovic, Benjamin Zorn

Abstract: A recent McKinsey report estimates the economic impact of the Internet of Things (IoT) to be between $3.9 to $11 trillion dollars by 20251 . IoT has the potential to have a profound impact on our daily lives, including technologies for the home, for health, for transportation, and for managing our natural resources. The Internet was largely driven by information and ideas generated by people, but… ▽ More A recent McKinsey report estimates the economic impact of the Internet of Things (IoT) to be between $3.9 to $11 trillion dollars by 20251 . IoT has the potential to have a profound impact on our daily lives, including technologies for the home, for health, for transportation, and for managing our natural resources. The Internet was largely driven by information and ideas generated by people, but advances in sensing and hardware have enabled computers to more easily observe the physical world. Coupling this additional layer of information with advances in machine learning brings dramatic new capabilities including the ability to capture and process tremendous amounts of data; to predict behaviors, activities, and the future in uncanny ways; and to manipulate the physical world in response. This trend will fundamentally change how people interact with physical objects and the environment. Success in develo** value-added capabilities around IoT requires a broad approach that includes expertise in sensing and hardware, machine learning, networked systems, human-computer interaction, security, and privacy. Strategies for making IoT practical and spurring its ultimate adoption also require a multifaceted approach that often transcends technology, such as with concerns over data security, privacy, public policy, and regulatory issues. In this paper we argue that existing best practices in building robust and secure systems are insufficient to address the new challenges that IoT systems will present. We provide recommendations regarding investments in research areas that will help address inadequacies in existing systems, practices, tools, and policies. △ Less

Submitted 11 April, 2016; originally announced April 2016.

Comments: A Computing Community Consortium (CCC) white paper, 15 pages

arXiv:1602.01170 [pdf, other]

doi 10.4204/EPTCS.202.3

Results and Analysis of SyGuS-Comp'15

Authors: Rajeev Alur, Dana Fisman, Rishabh Singh, Armando Solar-Lezama

Abstract: Syntax-Guided Synthesis (SyGuS) is the computational problem of finding an implementation f that meets both a semantic constraint given by a logical formula $\varphi$ in a background theory T, and a syntactic constraint given by a grammar G, which specifies the allowed set of candidate implementations. Such a synthesis problem can be formally defined in SyGuS-IF, a language that is built on top of… ▽ More Syntax-Guided Synthesis (SyGuS) is the computational problem of finding an implementation f that meets both a semantic constraint given by a logical formula $\varphi$ in a background theory T, and a syntactic constraint given by a grammar G, which specifies the allowed set of candidate implementations. Such a synthesis problem can be formally defined in SyGuS-IF, a language that is built on top of SMT-LIB. The Syntax-Guided Synthesis Competition (SyGuS-comp) is an effort to facilitate, bring together and accelerate research and development of efficient solvers for SyGuS by providing a platform for evaluating different synthesis techniques on a comprehensive set of benchmarks. In this year's competition we added two specialized tracks: a track for conditional linear arithmetic, where the grammar need not be specified and is implicitly assumed to be that of the LIA logic of SMT-LIB, and a track for invariant synthesis problems, with special constructs conforming to the structure of an invariant synthesis problem. This paper presents and analyzes the results of SyGuS-comp'15. △ Less

Submitted 2 February, 2016; originally announced February 2016.

Comments: In Proceedings SYNT 2015, arXiv:1602.00786

ACM Class: I.2.2, D.2.4, F.3.1;

Journal ref: EPTCS 202, 2016, pp. 3-26

arXiv:1505.05868 [pdf, ps, other]

Synthesis through Unification

Authors: Rajeev Alur, Pavol Cerny, Arjun Radhakrishna

Abstract: Given a specification and a set of candidate programs (program space), the program synthesis problem is to find a candidate program that satisfies the specification. We present the synthesis through unification (STUN) approach, which is an extension of the counter-example guided inductive synthesis (CEGIS) approach. In CEGIS, the synthesizer maintains a subset S of inputs and a candidate program P… ▽ More Given a specification and a set of candidate programs (program space), the program synthesis problem is to find a candidate program that satisfies the specification. We present the synthesis through unification (STUN) approach, which is an extension of the counter-example guided inductive synthesis (CEGIS) approach. In CEGIS, the synthesizer maintains a subset S of inputs and a candidate program Prog that is correct for S. The synthesizer repeatedly checks if there exists a counter-example input c such that the execution of Prog is incorrect on c. If so, the synthesizer enlarges S to include c, and picks a program from the program space that is correct for the new set S. The STUN approach extends CEGIS with the idea that given a program Prog that is correct for a subset of inputs, the synthesizer can try to find a program Prog' that is correct for the rest of the inputs. If Prog and Prog' can be unified into a program in the program space, then a solution has been found. We present a generic synthesis procedure based on the STUN approach and specialize it for three different domains by providing the appropriate unification operators. We implemented these specializations in prototype tools, and we show that our tools often per- forms significantly better on standard benchmarks than a tool based on a pure CEGIS approach. △ Less

Submitted 21 May, 2015; originally announced May 2015.

arXiv:1505.04409 [pdf, other]

Automatic Completion of Distributed Protocols with Symmetry

Authors: Rajeev Alur, Mukund Raghothaman, Christos Stergiou, Stavros Tripakis, Abhishek Udupa

Abstract: A distributed protocol is typically modeled as a set of communicating processes, where each process is described as an extended state machine along with fairness assumptions, and its correctness is specified using safety and liveness requirements. Designing correct distributed protocols is a challenging task. Aimed at simplifying this task, we allow the designer to leave some of the guards and upd… ▽ More A distributed protocol is typically modeled as a set of communicating processes, where each process is described as an extended state machine along with fairness assumptions, and its correctness is specified using safety and liveness requirements. Designing correct distributed protocols is a challenging task. Aimed at simplifying this task, we allow the designer to leave some of the guards and updates to state variables in the description of extended state machines as unknown functions. The protocol completion problem then is to find interpretations for these unknown functions while guaranteeing correctness. In many distributed protocols, process behaviors are naturally symmetric, and thus, synthesized expressions are further required to obey symmetry constraints. Our counterexample-guided synthesis algorithm consists of repeatedly invoking two phases. In the first phase, candidates for unknown expressions are generated using the SMT solver Z3. This phase requires carefully orchestrating constraints to enforce the desired symmetry in read/write accesses. In the second phase, the resulting completed protocol is checked for correctness using a custom-built model checker that handles fairness assumptions, safety and liveness requirements, and exploits symmetry. When model checking fails, our tool examines a set of counterexamples to safety/liveness properties to generate constraints on unknown functions that must be satisfied by subsequent completions. For evaluation, we show that our prototype is able to automatically discover interesting missing details in distributed protocols for mutual exclusion, self stabilization, and cache coherence. △ Less

Submitted 17 May, 2015; originally announced May 2015.

Comments: Full version of paper presented at CAV 2015

arXiv:1402.7150 [pdf, other]

Synthesizing Finite-state Protocols from Scenarios and Requirements

Authors: Rajeev Alur, Milo Martin, Mukund Raghothaman, Christos Stergiou, Stavros Tripakis, Abhishek Udupa

Abstract: Scenarios, or Message Sequence Charts, offer an intuitive way of describing the desired behaviors of a distributed protocol. In this paper we propose a new way of specifying finite-state protocols using scenarios: we show that it is possible to automatically derive a distributed implementation from a set of scenarios augmented with a set of safety and liveness requirements, provided the given scen… ▽ More Scenarios, or Message Sequence Charts, offer an intuitive way of describing the desired behaviors of a distributed protocol. In this paper we propose a new way of specifying finite-state protocols using scenarios: we show that it is possible to automatically derive a distributed implementation from a set of scenarios augmented with a set of safety and liveness requirements, provided the given scenarios adequately \emph{cover} all the states of the desired implementation. We first derive incomplete state machines from the given scenarios, and then synthesis corresponds to completing the transition relation of individual processes so that the global product meets the specified requirements. This completion problem, in general, has the same complexity, PSPACE, as the verification problem, but unlike the verification problem, is NP-complete for a constant number of processes. We present two algorithms for solving the completion problem, one based on a heuristic search in the space of possible completions and one based on OBDD-based symbolic fixpoint computation. We evaluate the proposed methodology for protocol specification and the effectiveness of the synthesis algorithms using the classical alternating-bit protocol. △ Less

Submitted 28 February, 2014; originally announced February 2014.

Comments: This is the working draft of a paper currently in submission. (February 10, 2014)

arXiv:1402.3021 [pdf, other]

Regular Combinators for String Transformations

Authors: Rajeev Alur, Adam Freilich, Mukund Raghothaman

Abstract: We focus on (partial) functions that map input strings to a monoid such as the set of integers with addition and the set of output strings with concatenation. The notion of regularity for such functions has been defined using two-way finite-state transducers, (one-way) cost register automata, and MSO-definable graph transformations. In this paper, we give an algebraic and machine-independent chara… ▽ More We focus on (partial) functions that map input strings to a monoid such as the set of integers with addition and the set of output strings with concatenation. The notion of regularity for such functions has been defined using two-way finite-state transducers, (one-way) cost register automata, and MSO-definable graph transformations. In this paper, we give an algebraic and machine-independent characterization of this class analogous to the definition of regular languages by regular expressions. When the monoid is commutative, we prove that every regular function can be constructed from constant functions using the combinators of choice, split sum, and iterated sum, that are analogs of union, concatenation, and Kleene-*, respectively, but enforce unique (or unambiguous) parsing. Our main result is for the general case of non-commutative monoids, which is of particular interest for capturing regular string-to-string transformations for document processing. We prove that the following additional combinators suffice for constructing all regular functions: (1) the left-additive versions of split sum and iterated sum, which allow transformations such as string reversal; (2) sum of functions, which allows transformations such as copying of strings; and (3) function composition, or alternatively, a new concept of chained sum, which allows output values from adjacent blocks to mix. △ Less

Submitted 12 February, 2014; originally announced February 2014.

Comments: This is the full version, with omitted proofs and constructions, of the conference paper currently in submission

arXiv:1308.4113 [pdf, other]

Counter-Strategy Guided Refinement of GR(1) Temporal Logic Specifications

Authors: Rajeev Alur, Salar Moarref, Ufuk Topcu

Abstract: The reactive synthesis problem is to find a finite-state controller that satisfies a given temporal-logic specification regardless of how its environment behaves. Develo** a formal specification is a challenging and tedious task and initial specifications are often unrealizable. In many cases, the source of unrealizability is the lack of adequate assumptions on the environment of the system. In… ▽ More The reactive synthesis problem is to find a finite-state controller that satisfies a given temporal-logic specification regardless of how its environment behaves. Develo** a formal specification is a challenging and tedious task and initial specifications are often unrealizable. In many cases, the source of unrealizability is the lack of adequate assumptions on the environment of the system. In this paper, we consider the problem of automatically correcting an unrealizable specification given in the generalized reactivity (1) fragment of linear temporal logic by adding assumptions on the environment. When a temporal-logic specification is unrealizable, the synthesis algorithm computes a counter-strategy as a witness. Our algorithm then analyzes this counter-strategy and synthesizes a set of candidate environment assumptions that can be used to remove the counter-strategy from the environment's possible behaviors. We demonstrate the applicability of our approach with several case studies. △ Less

Submitted 19 August, 2013; originally announced August 2013.

arXiv:1304.7029 [pdf, other]

Decision Problems for Additive Regular Functions

Authors: Rajeev Alur, Mukund Raghothaman

Abstract: Additive Cost Register Automata (ACRA) map strings to integers using a finite set of registers that are updated using assignments of the form "x := y + c" at every step. The corresponding class of additive regular functions has multiple equivalent characterizations, appealing closure properties, and a decidable equivalence problem. In this paper, we solve two decision problems for this model. Firs… ▽ More Additive Cost Register Automata (ACRA) map strings to integers using a finite set of registers that are updated using assignments of the form "x := y + c" at every step. The corresponding class of additive regular functions has multiple equivalent characterizations, appealing closure properties, and a decidable equivalence problem. In this paper, we solve two decision problems for this model. First, we define the register complexity of an additive regular function to be the minimum number of registers that an ACRA needs to compute it. We characterize the register complexity by a necessary and sufficient condition regarding the largest subset of registers whose values can be made far apart from one another. We then use this condition to design a PSPACE algorithm to compute the register complexity of a given ACRA, and establish a matching lower bound. Our results also lead to a machine-independent characterization of the register complexity of additive regular functions. Second, we consider two-player games over ACRAs, where the objective of one of the players is to reach a target set while minimizing the cost. We show the corresponding decision problem to be EXPTIME-complete when costs are non-negative integers, but undecidable when costs are integers. △ Less

Submitted 25 April, 2013; originally announced April 2013.

Comments: Conference version to appear in ICALP 2013

arXiv:1302.0745 [pdf, ps, other]

Safe Schedulability of Bounded-Rate Multi-Mode Systems

Authors: Rajeev Alur, Vojtech Forejt, Salar Moarref, Ashutosh Trivedi

Abstract: Bounded-rate multi-mode systems (BMMS) are hybrid systems that can switch freely among a finite set of modes, and whose dynamics is specified by a finite number of real-valued variables with mode-dependent rates that can vary within given bounded sets. The schedulability problem for BMMS is defined as an infinite-round game between two players---the scheduler and the environment---where in each ro… ▽ More Bounded-rate multi-mode systems (BMMS) are hybrid systems that can switch freely among a finite set of modes, and whose dynamics is specified by a finite number of real-valued variables with mode-dependent rates that can vary within given bounded sets. The schedulability problem for BMMS is defined as an infinite-round game between two players---the scheduler and the environment---where in each round the scheduler proposes a time and a mode while the environment chooses an allowable rate for that mode, and the state of the system changes linearly in the direction of the rate vector. The goal of the scheduler is to keep the state of the system within a pre-specified safe set using a non-Zeno schedule, while the goal of the environment is the opposite. Green scheduling under uncertainty is a paradigmatic example of BMMS where a winning strategy of the scheduler corresponds to a robust energy-optimal policy. We present an algorithm to decide whether the scheduler has a winning strategy from an arbitrary starting state, and give an algorithm to compute such a winning strategy, if it exists. We show that the schedulability problem for BMMS is co-NP complete in general, but for two variables it is in PTIME. We also study the discrete schedulability problem where the environment has only finitely many choices of rate vectors in each mode and the scheduler can make decisions only at multiples of a given clock period, and show it to be EXPTIME-complete. △ Less

Submitted 4 February, 2013; originally announced February 2013.

Comments: Technical report for a paper presented at HSCC 2013

arXiv:1111.0670 [pdf, ps, other]

Regular Functions, Cost Register Automata, and Generalized Min-Cost Problems

Authors: Rajeev Alur, Loris D'Antoni, Jyotirmoy V. Deshmukh, Mukund Raghothaman, Yifei Yuan

Abstract: Motivated by the successful application of the theory of regular languages to formal verification of finite-state systems, there is a renewed interest in develo** a theory of analyzable functions from strings to numerical values that can provide a foundation for analyzing {\em quantitative} properties of finite-state systems. In this paper, we propose a deterministic model for associating costs… ▽ More Motivated by the successful application of the theory of regular languages to formal verification of finite-state systems, there is a renewed interest in develo** a theory of analyzable functions from strings to numerical values that can provide a foundation for analyzing {\em quantitative} properties of finite-state systems. In this paper, we propose a deterministic model for associating costs with strings that is parameterized by operations of interest (such as addition, scaling, and $\min$), a notion of {\em regularity} that provides a yardstick to measure expressiveness, and study decision problems and theoretical properties of resulting classes of cost functions. Our definition of regularity relies on the theory of string-to-tree transducers, and allows associating costs with events that are conditional upon regular properties of future events. Our model of {\em cost register automata} allows computation of regular functions using multiple "write-only" registers whose values can be combined using the allowed set of operations. We show that classical shortest-path algorithms as well as algorithms designed for computing {\em discounted costs}, can be adopted for solving the min-cost problems for the more general classes of functions specified in our model. Cost register automata with $\min$ and increment give a deterministic model that is equivalent to {\em weighted automata}, an extensively studied nondeterministic model, and this connection results in new insights and new open problems. △ Less

Submitted 21 February, 2012; v1 submitted 2 November, 2011; originally announced November 2011.

Comments: ICALP12 submission, technical report/extended version. 33 pages+title page

arXiv:1104.2599 [pdf, other]

Streaming Tree Transducers

Authors: Rajeev Alur, Loris D'Antoni

Abstract: Theory of tree transducers provides a foundation for understanding expressiveness and complexity of analysis problems for specification languages for transforming hierarchically structured data such as XML documents. We introduce streaming tree transducers as an analyzable, executable, and expressive model for transforming unranked ordered trees in a single pass. Given a linear encoding of the inp… ▽ More Theory of tree transducers provides a foundation for understanding expressiveness and complexity of analysis problems for specification languages for transforming hierarchically structured data such as XML documents. We introduce streaming tree transducers as an analyzable, executable, and expressive model for transforming unranked ordered trees in a single pass. Given a linear encoding of the input tree, the transducer makes a single left-to-right pass through the input, and computes the output in linear time using a finite-state control, a visibly pushdown stack, and a finite number of variables that store output chunks that can be combined using the operations of string-concatenation and tree-insertion. We prove that the expressiveness of the model coincides with transductions definable using monadic second-order logic (MSO). Existing models of tree transducers either cannot implement all MSO-definable transformations, or require regular look ahead that prohibits single-pass implementation. We show a variety of analysis problems such as type-checking and checking functional equivalence are solvable for our model. △ Less

Submitted 21 February, 2012; v1 submitted 13 April, 2011; originally announced April 2011.

Comments: 40 pages

arXiv:1007.4958 [pdf, other]

Algorithmic Verification of Single-Pass List Processing Programs

Authors: Rajeev Alur, Pavol Cerny

Abstract: We introduce streaming data string transducers that map input data strings to output data strings in a single left-to-right pass in linear time. Data strings are (unbounded) sequences of data values, tagged with symbols from a finite set, over a potentially infinite data domain that supports only the operations of equality and ordering. The transducer uses a finite set of states, a finite set of v… ▽ More We introduce streaming data string transducers that map input data strings to output data strings in a single left-to-right pass in linear time. Data strings are (unbounded) sequences of data values, tagged with symbols from a finite set, over a potentially infinite data domain that supports only the operations of equality and ordering. The transducer uses a finite set of states, a finite set of variables ranging over the data domain, and a finite set of variables ranging over data strings. At every step, it can make decisions based on the next input symbol, updating its state, remembering the input data value in its data variables, and updating data-string variables by concatenating data-string variables and new symbols formed from data variables, while avoiding duplication. We establish that the problems of checking functional equivalence of two streaming transducers, and of checking whether a streaming transducer satisfies pre/post verification conditions specified by streaming acceptors over input/output data-strings, are in PSPACE. We identify a class of imperative and a class of functional programs, manipulating lists of data items, which can be effectively translated to streaming data-string transducers. The imperative programs dynamically modify a singly-linked heap by changing next-pointers of heap-nodes and by adding new nodes. The main restriction specifies how the next-pointers can be used for traversal. We also identify an expressively equivalent fragment of functional programs that traverse a list using syntactically restricted recursive calls. Our results lead to algorithms for assertion checking and for checking functional equivalence of two programs, written possibly in different programming styles, for commonly used routines such as insert, delete, and reverse. △ Less

Submitted 14 February, 2011; v1 submitted 28 July, 2010; originally announced July 2010.

arXiv:0811.0537 [pdf, ps, other]

doi 10.2168/LMCS-4(4:11)2008

First-Order and Temporal Logics for Nested Words

Authors: Rajeev Alur, Marcelo Arenas, Pablo Barcelo, Kousha Etessami, Neil Immerman, Leonid Libkin

Abstract: Nested words are a structured model of execution paths in procedural programs, reflecting their call and return nesting structure. Finite nested words also capture the structure of parse trees and other tree-structured data, such as XML. We provide new temporal logics for finite and infinite nested words, which are natural extensions of LTL, and prove that these logics are first-order expressivel… ▽ More Nested words are a structured model of execution paths in procedural programs, reflecting their call and return nesting structure. Finite nested words also capture the structure of parse trees and other tree-structured data, such as XML. We provide new temporal logics for finite and infinite nested words, which are natural extensions of LTL, and prove that these logics are first-order expressively-complete. One of them is based on adding a "within" modality, evaluating a formula on a subword, to a logic CaRet previously studied in the context of verifying properties of recursive state machines (RSMs). The other logic, NWTL, is based on the notion of a summary path that uses both the linear and nesting structures. For NWTL we show that satisfiability is EXPTIME-complete, and that model-checking can be done in time polynomial in the size of the RSM model and exponential in the size of the NWTL formula (and is also EXPTIME-complete). Finally, we prove that first-order logic over nested words has the three-variable property, and we present a temporal logic for nested words which is complete for the two-variable fragment of first-order. △ Less

Submitted 3 March, 2011; v1 submitted 4 November, 2008; originally announced November 2008.

Comments: revised and corrected version of Mar 03, 2011

ACM Class: F.1.1, F.3.1, F.4.1

Journal ref: Logical Methods in Computer Science, Volume 4, Issue 4 (November 25, 2008) lmcs:782

Showing 1–41 of 41 results for author: Alur, R