Search | arXiv e-print repository

Verifiably Robust Conformal Prediction

Authors: Linus Jeary, Tom Kuipers, Mehran Hosseini, Nicola Paoletti

Abstract: Conformal Prediction (CP) is a popular uncertainty quantification method that provides distribution-free, statistically valid prediction sets, assuming that training and test data are exchangeable. In such a case, CP's prediction sets are guaranteed to cover the (unknown) true test output with a user-specified probability. Nevertheless, this guarantee is violated when the data is subjected to adve… ▽ More Conformal Prediction (CP) is a popular uncertainty quantification method that provides distribution-free, statistically valid prediction sets, assuming that training and test data are exchangeable. In such a case, CP's prediction sets are guaranteed to cover the (unknown) true test output with a user-specified probability. Nevertheless, this guarantee is violated when the data is subjected to adversarial attacks, which often result in a significant loss of coverage. Recently, several approaches have been put forward to recover CP guarantees in this setting. These approaches leverage variations of randomised smoothing to produce conservative sets which account for the effect of the adversarial perturbations. They are, however, limited in that they only support $\ell^2$-bounded perturbations and classification tasks. This paper introduces VRCP (Verifiably Robust Conformal Prediction), a new framework that leverages recent neural network verification methods to recover coverage guarantees under adversarial attacks. Our VRCP method is the first to support perturbations bounded by arbitrary norms including $\ell^1$, $\ell^2$, and $\ell^\infty$, as well as regression tasks. We evaluate and compare our approach on image classification tasks (CIFAR10, CIFAR100, and TinyImageNet) and regression tasks for deep reinforcement learning environments. In every case, VRCP achieves above nominal coverage and yields significantly more efficient and informative prediction regions than the SotA. △ Less

Submitted 6 June, 2024; v1 submitted 29 May, 2024; originally announced May 2024.

MSC Class: 68T37 (Primary) 68T27 (Secondary) ACM Class: G.3; I.2.4; F.4.1

arXiv:2403.16871 [pdf, other]

Conformal Off-Policy Prediction for Multi-Agent Systems

Authors: Tom Kuipers, Renukanandan Tumu, Shuo Yang, Milad Kazemi, Rahul Mangharam, Nicola Paoletti

Abstract: Off-Policy Prediction (OPP), i.e., predicting the outcomes of a target policy using only data collected under a nominal (behavioural) policy, is a paramount problem in data-driven analysis of safety-critical systems where the deployment of a new policy may be unsafe. To achieve dependable off-policy predictions, recent work on Conformal Off-Policy Prediction (COPP) leverage the conformal predictio… ▽ More Off-Policy Prediction (OPP), i.e., predicting the outcomes of a target policy using only data collected under a nominal (behavioural) policy, is a paramount problem in data-driven analysis of safety-critical systems where the deployment of a new policy may be unsafe. To achieve dependable off-policy predictions, recent work on Conformal Off-Policy Prediction (COPP) leverage the conformal prediction framework to derive prediction regions with probabilistic guarantees under the target process. Existing COPP methods can account for the distribution shifts induced by policy switching, but are limited to single-agent systems and scalar outcomes (e.g., rewards). In this work, we introduce MA-COPP, the first conformal prediction method to solve OPP problems involving multi-agent systems, deriving joint prediction regions for all agents' trajectories when one or more "ego" agents change their policies. Unlike the single-agent scenario, this setting introduces higher complexity as the distribution shifts affect predictions for all agents, not just the ego agents, and the prediction task involves full multi-dimensional trajectories, not just reward values. A key contribution of MA-COPP is to avoid enumeration or exhaustive search of the output space of agent trajectories, which is instead required by existing COPP methods to construct the prediction region. We achieve this by showing that an over-approximation of the true JPR can be constructed, without enumeration, from the maximum density ratio of the JPR trajectories. We evaluate the effectiveness of MA-COPP in multi-agent systems from the PettingZoo library and the F1TENTH autonomous racing environment, achieving nominal coverage in higher dimensions and various shift settings. △ Less

Submitted 25 March, 2024; originally announced March 2024.

Comments: Submitted to the 63rd IEEE Conference on Decision and Control (CDC)

arXiv:2402.08514 [pdf, other]

Counterfactual Influence in Markov Decision Processes

Authors: Milad Kazemi, Jessica Lally, Ekaterina Tishchenko, Hana Chockler, Nicola Paoletti

Abstract: Our work addresses a fundamental problem in the context of counterfactual inference for Markov Decision Processes (MDPs). Given an MDP path $τ$, this kind of inference allows us to derive counterfactual paths $τ'$ describing what-if versions of $τ$ obtained under different action sequences than those observed in $τ$. However, as the counterfactual states and actions deviate from the observed ones… ▽ More Our work addresses a fundamental problem in the context of counterfactual inference for Markov Decision Processes (MDPs). Given an MDP path $τ$, this kind of inference allows us to derive counterfactual paths $τ'$ describing what-if versions of $τ$ obtained under different action sequences than those observed in $τ$. However, as the counterfactual states and actions deviate from the observed ones over time, the observation $τ$ may no longer influence the counterfactual world, meaning that the analysis is no longer tailored to the individual observation, resulting in interventional outcomes rather than counterfactual ones. Even though this issue specifically affects the popular Gumbel-max structural causal model used for MDP counterfactuals, it has remained overlooked until now. In this work, we introduce a formal characterisation of influence based on comparing counterfactual and interventional distributions. We devise an algorithm to construct counterfactual models that automatically satisfy influence constraints. Leveraging such models, we derive counterfactual policies that are not just optimal for a given reward structure but also remain tailored to the observed path. Even though there is an unavoidable trade-off between policy optimality and strength of influence constraints, our experiments demonstrate that it is possible to derive (near-)optimal policies while remaining under the influence of the observation. △ Less

Submitted 13 February, 2024; originally announced February 2024.

Comments: 12 pages, 6 figures

arXiv:2312.01959 [pdf, other]

Learning-Based Approaches to Predictive Monitoring with Conformal Statistical Guarantees

Authors: Francesca Cairoli, Luca Bortolussi, Nicola Paoletti

Abstract: This tutorial focuses on efficient methods to predictive monitoring (PM), the problem of detecting at runtime future violations of a given requirement from the current state of a system. While performing model checking at runtime would offer a precise solution to the PM problem, it is generally computationally expensive. To address this scalability issue, several lightweight approaches based on ma… ▽ More This tutorial focuses on efficient methods to predictive monitoring (PM), the problem of detecting at runtime future violations of a given requirement from the current state of a system. While performing model checking at runtime would offer a precise solution to the PM problem, it is generally computationally expensive. To address this scalability issue, several lightweight approaches based on machine learning have recently been proposed. These approaches work by learning an approximate yet efficient surrogate (deep learning) model of the expensive model checker. A key challenge remains to ensure reliable predictions, especially in safety-critical applications. We review our recent work on predictive monitoring, one of the first to propose learning-based approximations for CPS verification of temporal logic specifications and the first in this context to apply conformal prediction (CP) for rigorous uncertainty quantification. These CP-based uncertainty estimators offer statistical guarantees regarding the generalization error of the learning model, and they can be used to determine unreliable predictions that should be rejected. In this tutorial, we present a general and comprehensive framework summarizing our approach to the predictive monitoring of CPSs, examining in detail several variants determined by three main dimensions: system dynamics (deterministic, non-deterministic, stochastic), state observability, and semantics of requirements' satisfaction (Boolean or quantitative). △ Less

Submitted 4 December, 2023; originally announced December 2023.

arXiv:2310.01951 [pdf, other]

Probabilistic Reach-Avoid for Bayesian Neural Networks

Authors: Matthew Wicker, Luca Laurenti, Andrea Patane, Nicola Paoletti, Alessandro Abate, Marta Kwiatkowska

Abstract: Model-based reinforcement learning seeks to simultaneously learn the dynamics of an unknown stochastic environment and synthesise an optimal policy for acting in it. Ensuring the safety and robustness of sequential decisions made through a policy in such an environment is a key challenge for policies intended for safety-critical scenarios. In this work, we investigate two complementary problems: f… ▽ More Model-based reinforcement learning seeks to simultaneously learn the dynamics of an unknown stochastic environment and synthesise an optimal policy for acting in it. Ensuring the safety and robustness of sequential decisions made through a policy in such an environment is a key challenge for policies intended for safety-critical scenarios. In this work, we investigate two complementary problems: first, computing reach-avoid probabilities for iterative predictions made with dynamical models, with dynamics described by Bayesian neural network (BNN); second, synthesising control policies that are optimal with respect to a given reach-avoid specification (reaching a "target" state, while avoiding a set of "unsafe" states) and a learned BNN model. Our solution leverages interval propagation and backward recursion techniques to compute lower bounds for the probability that a policy's sequence of actions leads to satisfying the reach-avoid specification. Such computed lower bounds provide safety certification for the given policy and BNN model. We then introduce control synthesis algorithms to derive policies maximizing said lower bounds on the safety probability. We demonstrate the effectiveness of our method on a series of control benchmarks characterized by learned BNN dynamics models. On our most challenging benchmark, compared to purely data-driven policies the optimal synthesis algorithm is able to provide more than a four-fold increase in the number of certifiable states and more than a three-fold increase in the average guaranteed reach-avoid probability. △ Less

Submitted 3 October, 2023; originally announced October 2023.

Comments: 47 pages, 10 figures. arXiv admin note: text overlap with arXiv:2105.10134

arXiv:2212.08712 [pdf, other]

Causal Temporal Reasoning for Markov Decision Processes

Authors: Milad Kazemi, Nicola Paoletti

Abstract: We introduce $\textit{PCFTL (Probabilistic CounterFactual Temporal Logic)}$, a new probabilistic temporal logic for the verification of Markov Decision Processes (MDP). PCFTL is the first to include operators for causal reasoning, allowing us to express interventional and counterfactual queries. Given a path formula $φ$, an interventional property is concerned with the satisfaction probability of… ▽ More We introduce $\textit{PCFTL (Probabilistic CounterFactual Temporal Logic)}$, a new probabilistic temporal logic for the verification of Markov Decision Processes (MDP). PCFTL is the first to include operators for causal reasoning, allowing us to express interventional and counterfactual queries. Given a path formula $φ$, an interventional property is concerned with the satisfaction probability of $φ$ if we apply a particular change $I$ to the MDP (e.g., switching to a different policy); a counterfactual allows us to compute, given an observed MDP path $τ$, what the outcome of $φ$ would have been had we applied $I$ in the past. For its ability to reason about \textit{what-if} scenarios involving different configurations of the MDP, our approach represents a departure from existing probabilistic temporal logics that can only reason about a fixed system configuration. From a syntactic viewpoint, we introduce a generalized counterfactual operator that subsumes both interventional and counterfactual probabilities as well as the traditional probabilistic operator found in e.g., PCTL. From a semantics viewpoint, our logic is interpreted over a structural causal model translation of the MDP, which gives us a representation amenable to counterfactual reasoning. We evaluate PCFTL in the context of safe reinforcement learning using a benchmark of grid-world models. △ Less

Submitted 14 June, 2023; v1 submitted 16 December, 2022; originally announced December 2022.

Comments: 22 pages and 5 figures

arXiv:2211.02375 [pdf, other]

Conformal Quantitative Predictive Monitoring of STL Requirements for Stochastic Processes

Authors: Francesca Cairoli, Nicola Paoletti, Luca Bortolussi

Abstract: We consider the problem of predictive monitoring (PM), i.e., predicting at runtime the satisfaction of a desired property from the current system's state. Due to its relevance for runtime safety assurance and online control, PM methods need to be efficient to enable timely interventions against predicted violations, while providing correctness guarantees. We introduce \textit{quantitative predicti… ▽ More We consider the problem of predictive monitoring (PM), i.e., predicting at runtime the satisfaction of a desired property from the current system's state. Due to its relevance for runtime safety assurance and online control, PM methods need to be efficient to enable timely interventions against predicted violations, while providing correctness guarantees. We introduce \textit{quantitative predictive monitoring (QPM)}, the first PM method to support stochastic processes and rich specifications given in Signal Temporal Logic (STL). Unlike most of the existing PM techniques that predict whether or not some property $φ$ is satisfied, QPM provides a quantitative measure of satisfaction by predicting the quantitative (aka robust) STL semantics of $φ$. QPM derives prediction intervals that are highly efficient to compute and with probabilistic guarantees, in that the intervals cover with arbitrary probability the STL robustness values relative to the stochastic evolution of the system. To do so, we take a machine-learning approach and leverage recent advances in conformal inference for quantile regression, thereby avoiding expensive Monte-Carlo simulations at runtime to estimate the intervals. We also show how our monitors can be combined in a compositional manner to handle composite formulas, without retraining the predictors nor sacrificing the guarantees. We demonstrate the effectiveness and scalability of QPM over a benchmark of four discrete-time stochastic processes with varying degrees of complexity. △ Less

Submitted 6 April, 2023; v1 submitted 4 November, 2022; originally announced November 2022.

arXiv:2205.03961 [pdf, other]

An STL-based Formulation of Resilience in Cyber-Physical Systems

Authors: Hongkai Chen, Shan Lin, Scott A. Smolka, Nicola Paoletti

Abstract: Resiliency is the ability to quickly recover from a violation and avoid future violations for as long as possible. Such a property is of fundamental importance for Cyber-Physical Systems (CPS), and yet, to date, there is no widely agreed-upon formal treatment of CPS resiliency. We present an STL-based framework for reasoning about resiliency in CPS in which resiliency has a syntactic characterizat… ▽ More Resiliency is the ability to quickly recover from a violation and avoid future violations for as long as possible. Such a property is of fundamental importance for Cyber-Physical Systems (CPS), and yet, to date, there is no widely agreed-upon formal treatment of CPS resiliency. We present an STL-based framework for reasoning about resiliency in CPS in which resiliency has a syntactic characterization in the form of an STL-based Resiliency Specification (SRS). Given an arbitrary STL formula $\varphi$, time bounds $α$ and $β$, the SRS of $\varphi$, $R_{α,β}(\varphi)$, is the STL formula $\neg\varphi\mathbf{U}_{[0,α]}\mathbf{G}_{[0,β)}\varphi$, specifying that recovery from a violation of $\varphi$ occur within time $α$ (recoverability), and subsequently that $\varphi$ be maintained for duration $β$ (durability). These $R$-expressions, which are atoms in our SRS logic, can be combined using STL operators, allowing one to express composite resiliency specifications, e.g., multiple SRSs must hold simultaneously, or the system must eventually be resilient. We define a quantitative semantics for SRSs in the form of a Resilience Satisfaction Value (ReSV) function $r$ and prove its soundness and completeness w.r.t. STL's Boolean semantics. The $r$-value for $R_{α,β}(\varphi)$ atoms is a singleton set containing a pair quantifying recoverability and durability. The $r$-value for a composite SRS formula results in a set of non-dominated recoverability-durability pairs, given that the ReSVs of subformulas might not be directly comparable (e.g., one subformula has superior durability but worse recoverability than another). To the best of our knowledge, this is the first multi-dimensional quantitative semantics for an STL-based logic. Two case studies demonstrate the practical utility of our approach. △ Less

Submitted 18 July, 2022; v1 submitted 8 May, 2022; originally announced May 2022.

Comments: 16 pages excluding references and appendix (23 pages in total), 6 figures

ACM Class: F.4.1

arXiv:2201.02455 [pdf, other]

Deep Learnable Strategy Templates for Multi-Issue Bilateral Negotiation

Authors: Pallavi Bagga, Nicola Paoletti, Kostas Stathis

Abstract: We study how to exploit the notion of strategy templates to learn strategies for multi-issue bilateral negotiation. Each strategy template consists of a set of interpretable parameterized tactics that are used to decide an optimal action at any time. We use deep reinforcement learning throughout an actor-critic architecture to estimate the tactic parameter values for a threshold utility, when to a… ▽ More We study how to exploit the notion of strategy templates to learn strategies for multi-issue bilateral negotiation. Each strategy template consists of a set of interpretable parameterized tactics that are used to decide an optimal action at any time. We use deep reinforcement learning throughout an actor-critic architecture to estimate the tactic parameter values for a threshold utility, when to accept an offer and how to generate a new bid. This contrasts with existing work that only estimates the threshold utility for those tactics. We pre-train the strategy by supervision from the dataset collected using "teacher strategies", thereby decreasing the exploration time required for learning during negotiation. As a result, we build automated agents for multi-issue negotiations that can adapt to different negotiation domains without the need to be pre-programmed. We empirically show that our work outperforms the state-of-the-art in terms of the individual as well as social efficiency. △ Less

Submitted 7 January, 2022; originally announced January 2022.

Comments: arXiv admin note: text overlap with arXiv:2009.08302

arXiv:2108.07134 [pdf, other]

Neural Predictive Monitoring under Partial Observability

Authors: Francesca Cairoli, Luca Bortolussi, Nicola Paoletti

Abstract: We consider the problem of predictive monitoring (PM), i.e., predicting at runtime future violations of a system from the current state. We work under the most realistic settings where only partial and noisy observations of the state are available at runtime. Such settings directly affect the accuracy and reliability of the reachability predictions, jeopardizing the safety of the system. In this w… ▽ More We consider the problem of predictive monitoring (PM), i.e., predicting at runtime future violations of a system from the current state. We work under the most realistic settings where only partial and noisy observations of the state are available at runtime. Such settings directly affect the accuracy and reliability of the reachability predictions, jeopardizing the safety of the system. In this work, we present a learning-based method for PM that produces accurate and reliable reachability predictions despite partial observability (PO). We build on Neural Predictive Monitoring (NPM), a PM method that uses deep neural networks for approximating hybrid systems reachability, and extend it to the PO case. We propose and compare two solutions, an end-to-end approach, which directly operates on the rough observations, and a two-step approach, which introduces an intermediate state estimation step. Both solutions rely on conformal prediction to provide 1) probabilistic guarantees in the form of prediction regions and 2) sound estimates of predictive uncertainty. We use the latter to identify unreliable (and likely erroneous) predictions and to retrain and improve the monitors on these uncertain inputs (i.e., active learning). Our method results in highly accurate reachability predictions and error detection, as well as tight prediction regions with guaranteed coverage. △ Less

Submitted 17 August, 2021; v1 submitted 16 August, 2021; originally announced August 2021.

arXiv:2105.10134 [pdf, other]

Certification of Iterative Predictions in Bayesian Neural Networks

Authors: Matthew Wicker, Luca Laurenti, Andrea Patane, Nicola Paoletti, Alessandro Abate, Marta Kwiatkowska

Abstract: We consider the problem of computing reach-avoid probabilities for iterative predictions made with Bayesian neural network (BNN) models. Specifically, we leverage bound propagation techniques and backward recursion to compute lower bounds for the probability that trajectories of the BNN model reach a given set of states while avoiding a set of unsafe states. We use the lower bounds in the context… ▽ More We consider the problem of computing reach-avoid probabilities for iterative predictions made with Bayesian neural network (BNN) models. Specifically, we leverage bound propagation techniques and backward recursion to compute lower bounds for the probability that trajectories of the BNN model reach a given set of states while avoiding a set of unsafe states. We use the lower bounds in the context of control and reinforcement learning to provide safety certification for given control policies, as well as to synthesize control policies that improve the certification bounds. On a set of benchmarks, we demonstrate that our framework can be employed to certify policies over BNNs predictions for problems of more than $10$ dimensions, and to effectively synthesize policies that significantly increase the lower bound on the satisfaction probability. △ Less

Submitted 19 June, 2021; v1 submitted 21 May, 2021; originally announced May 2021.

Comments: Accepted, UAI 2021. 17 pages

arXiv:2105.03640 [pdf, other]

doi 10.24963/366

On Guaranteed Optimal Robust Explanations for NLP Models

Authors: Emanuele La Malfa, Agnieszka Zbrzezny, Rhiannon Michelmore, Nicola Paoletti, Marta Kwiatkowska

Abstract: We build on abduction-based explanations for ma-chine learning and develop a method for computing local explanations for neural network models in natural language processing (NLP). Our explanations comprise a subset of the words of the in-put text that satisfies two key features: optimality w.r.t. a user-defined cost function, such as the length of explanation, and robustness, in that they ensure… ▽ More We build on abduction-based explanations for ma-chine learning and develop a method for computing local explanations for neural network models in natural language processing (NLP). Our explanations comprise a subset of the words of the in-put text that satisfies two key features: optimality w.r.t. a user-defined cost function, such as the length of explanation, and robustness, in that they ensure prediction invariance for any bounded perturbation in the embedding space of the left out words. We present two solution algorithms, respectively based on implicit hitting sets and maximum universal subsets, introducing a number of algorithmic improvements to speed up convergence of hard instances. We show how our method can be con-figured with different perturbation sets in the em-bedded space and used to detect bias in predictions by enforcing include/exclude constraints on biased terms, as well as to enhance existing heuristic-based NLP explanation frameworks such as Anchors. We evaluate our framework on three widely used sentiment analysis tasks and texts of up to100words from SST, Twitter and IMDB datasets,demonstrating the effectiveness of the derived explanations. △ Less

Submitted 14 May, 2021; v1 submitted 8 May, 2021; originally announced May 2021.

Comments: 13 pages (8+5 Appendix). Accepted as long-paper at IJCAI 2021

Report number: Pages 2658-2665

Journal ref: Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence 2021

arXiv:2009.08302 [pdf, other]

Learnable Strategies for Bilateral Agent Negotiation over Multiple Issues

Authors: Pallavi Bagga, Nicola Paoletti, Kostas Stathis

Abstract: We present a novel bilateral negotiation model that allows a self-interested agent to learn how to negotiate over multiple issues in the presence of user preference uncertainty. The model relies upon interpretable strategy templates representing the tactics the agent should employ during the negotiation and learns template parameters to maximize the average utility received over multiple negotiati… ▽ More We present a novel bilateral negotiation model that allows a self-interested agent to learn how to negotiate over multiple issues in the presence of user preference uncertainty. The model relies upon interpretable strategy templates representing the tactics the agent should employ during the negotiation and learns template parameters to maximize the average utility received over multiple negotiations, thus resulting in optimal bid acceptance and generation. Our model also uses deep reinforcement learning to evaluate threshold utility values, for those tactics that require them, thereby deriving optimal utilities for every environment state. To handle user preference uncertainty, the model relies on a stochastic search to find user model that best agrees with a given partial preference profile. Multi-objective optimization and multi-criteria decision-making methods are applied at negotiation time to generate Pareto-optimal outcomes thereby increasing the number of successful (win-win) negotiations. Rigorous experimental evaluations show that the agent employing our model outperforms the winning agents of the 10th Automated Negotiating Agents Competition (ANAC'19) in terms of individual as well as social-welfare utilities. △ Less

Submitted 7 January, 2022; v1 submitted 17 September, 2020; originally announced September 2020.

arXiv:2003.01283 [pdf, other]

MPC-guided Imitation Learning of Neural Network Policies for the Artificial Pancreas

Authors: Hongkai Chen, Nicola Paoletti, Scott A. Smolka, Shan Lin

Abstract: Even though model predictive control (MPC) is currently the main algorithm for insulin control in the artificial pancreas (AP), it usually requires complex online optimizations, which are infeasible for resource-constrained medical devices. MPC also typically relies on state estimation, an error-prone process. In this paper, we introduce a novel approach to AP control that uses Imitation Learning… ▽ More Even though model predictive control (MPC) is currently the main algorithm for insulin control in the artificial pancreas (AP), it usually requires complex online optimizations, which are infeasible for resource-constrained medical devices. MPC also typically relies on state estimation, an error-prone process. In this paper, we introduce a novel approach to AP control that uses Imitation Learning to synthesize neural-network insulin policies from MPC-computed demonstrations. Such policies are computationally efficient and, by instrumenting MPC at training time with full state information, they can directly map measurements into optimal therapy decisions, thus bypassing state estimation. We apply Bayesian inference via Monte Carlo Dropout to learn policies, which allows us to quantify prediction uncertainty and thereby derive safer therapy decisions. We show that our control policies trained under a specific patient model readily generalize (in terms of model parameters and disturbance distributions) to patient cohorts, consistently outperforming traditional MPC with state estimation. △ Less

Submitted 2 March, 2020; originally announced March 2020.

arXiv:2001.11785 [pdf, other]

A Deep Reinforcement Learning Approach to Concurrent Bilateral Negotiation

Authors: Pallavi Bagga, Nicola Paoletti, Bedour Alrayes, Kostas Stathis

Abstract: We present a novel negotiation model that allows an agent to learn how to negotiate during concurrent bilateral negotiations in unknown and dynamic e-markets. The agent uses an actor-critic architecture with model-free reinforcement learning to learn a strategy expressed as a deep neural network. We pre-train the strategy by supervision from synthetic market data, thereby decreasing the exploratio… ▽ More We present a novel negotiation model that allows an agent to learn how to negotiate during concurrent bilateral negotiations in unknown and dynamic e-markets. The agent uses an actor-critic architecture with model-free reinforcement learning to learn a strategy expressed as a deep neural network. We pre-train the strategy by supervision from synthetic market data, thereby decreasing the exploration time required for learning during negotiation. As a result, we can build automated agents for concurrent negotiations that can adapt to different e-market settings without the need to be pre-programmed. Our experimental evaluation shows that our deep reinforcement learning-based agents outperform two existing well-known negotiation strategies in one-to-many concurrent bilateral negotiations for a range of e-market settings. △ Less

Submitted 3 February, 2020; v1 submitted 31 January, 2020; originally announced January 2020.

arXiv:1908.00528 [pdf, other]

Neural Simplex Architecture

Authors: Dung T. Phan, Radu Grosu, Nils Jansen, Nicola Paoletti, Scott A. Smolka, Scott D. Stoller

Abstract: We present the Neural Simplex Architecture (NSA), a new approach to runtime assurance that provides safety guarantees for neural controllers (obtained e.g. using reinforcement learning) of autonomous and other complex systems without unduly sacrificing performance. NSA is inspired by the Simplex control architecture of Sha et al., but with some significant differences. In the traditional approach,… ▽ More We present the Neural Simplex Architecture (NSA), a new approach to runtime assurance that provides safety guarantees for neural controllers (obtained e.g. using reinforcement learning) of autonomous and other complex systems without unduly sacrificing performance. NSA is inspired by the Simplex control architecture of Sha et al., but with some significant differences. In the traditional approach, the advanced controller (AC) is treated as a black box; when the decision module switches control to the baseline controller (BC), the BC remains in control forever. There is relatively little work on switching control back to the AC, and there are no techniques for correcting the AC's behavior after it generates a potentially unsafe control input that causes a failover to the BC. Our NSA addresses both of these limitations. NSA not only provides safety assurances in the presence of a possibly unsafe neural controller, but can also improve the safety of such a controller in an online setting via retraining, without overly degrading its performance. To demonstrate NSA's benefits, we have conducted several significant case studies in the continuous control domain. These include a target-seeking ground rover navigating an obstacle field, and a neural controller for an artificial pancreas system. △ Less

Submitted 24 March, 2020; v1 submitted 1 August, 2019; originally announced August 2019.

Comments: 12th NASA Formal Methods Symposium (NFM 2020)

arXiv:1903.01980 [pdf, other]

Statistical Guarantees for the Robustness of Bayesian Neural Networks

Authors: Luca Cardelli, Marta Kwiatkowska, Luca Laurenti, Nicola Paoletti, Andrea Patane, Matthew Wicker

Abstract: We introduce a probabilistic robustness measure for Bayesian Neural Networks (BNNs), defined as the probability that, given a test point, there exists a point within a bounded set such that the BNN prediction differs between the two. Such a measure can be used, for instance, to quantify the probability of the existence of adversarial examples. Building on statistical verification techniques for pr… ▽ More We introduce a probabilistic robustness measure for Bayesian Neural Networks (BNNs), defined as the probability that, given a test point, there exists a point within a bounded set such that the BNN prediction differs between the two. Such a measure can be used, for instance, to quantify the probability of the existence of adversarial examples. Building on statistical verification techniques for probabilistic models, we develop a framework that allows us to estimate probabilistic robustness for a BNN with statistical guarantees, i.e., with a priori error and confidence bounds. We provide experimental comparison for several approximate BNN inference techniques on image classification tasks associated to MNIST and a two-class subset of the GTSRB dataset. Our results enable quantification of uncertainty of BNN predictions in adversarial settings. △ Less

Submitted 5 March, 2019; originally announced March 2019.

Comments: 9 pages, 6 figures

arXiv:1901.03315 [pdf, other]

Automated Synthesis of Safe Digital Controllers for Sampled-Data Stochastic Nonlinear Systems

Authors: Fedor Shmarov, Sadegh Soudjani, Nicola Paoletti, Ezio Bartocci, Shan Lin, Scott A. Smolka, Paolo Zuliani

Abstract: We present a new method for the automated synthesis of digital controllers with formal safety guarantees for systems with nonlinear dynamics, noisy output measurements, and stochastic disturbances. Our method derives digital controllers such that the corresponding closed-loop system, modeled as a sampled-data stochastic control system, satisfies a safety specification with probability above a give… ▽ More We present a new method for the automated synthesis of digital controllers with formal safety guarantees for systems with nonlinear dynamics, noisy output measurements, and stochastic disturbances. Our method derives digital controllers such that the corresponding closed-loop system, modeled as a sampled-data stochastic control system, satisfies a safety specification with probability above a given threshold. The proposed synthesis method alternates between two steps: generation of a candidate controller pc, and verification of the candidate. pc is found by maximizing a Monte Carlo estimate of the safety probability, and by using a non-validated ODE solver for simulating the system. Such a candidate is therefore sub-optimal but can be generated very rapidly. To rule out unstable candidate controllers, we prove and utilize Lyapunov's indirect method for instability of sampled-data nonlinear systems. In the subsequent verification step, we use a validated solver based on SMT (Satisfiability Modulo Theories) to compute a numerically and statistically valid confidence interval for the safety probability of pc. If the probability so obtained is not above the threshold, we expand the search space for candidates by increasing the controller degree. We evaluate our technique on three case studies: an artificial pancreas model, a powertrain control model, and a quadruple-tank process. △ Less

Submitted 10 January, 2019; originally announced January 2019.

Comments: 12 pages, 4 figures, 4 tables

MSC Class: 68N30

arXiv:1810.03808 [pdf, ps, other]

Synthesizing Stealthy Reprogramming Attacks on Cardiac Devices

Authors: Nicola Paoletti, Zhihao Jiang, Md Ariful Islam, Houssam Abbas, Rahul Mangharam, Shan Lin, Zachary Gruber, Scott A. Smolka

Abstract: An Implantable Cardioverter Defibrillator (ICD) is a medical device used for the detection of potentially fatal cardiac arrhythmia and their treatment through the delivery of electrical shocks intended to restore normal heart rhythm. An ICD reprogramming attack seeks to alter the device's parameters to induce unnecessary shocks and, even more egregious, prevent required therapy. In this paper, we… ▽ More An Implantable Cardioverter Defibrillator (ICD) is a medical device used for the detection of potentially fatal cardiac arrhythmia and their treatment through the delivery of electrical shocks intended to restore normal heart rhythm. An ICD reprogramming attack seeks to alter the device's parameters to induce unnecessary shocks and, even more egregious, prevent required therapy. In this paper, we present a formal approach for the synthesis of ICD reprogramming attacks that are both effective, i.e., lead to fundamental changes in the required therapy, and stealthy, i.e., involve minimal changes to the nominal ICD parameters. We focus on the discrimination algorithm underlying Boston Scientific devices (one of the principal ICD manufacturers) and formulate the synthesis problem as one of multi-objective optimization. Our solution technique is based on an Optimization Modulo Theories encoding of the problem and allows us to derive device parameters that are optimal with respect to the effectiveness-stealthiness tradeoff (i.e., lie along the corresponding Pareto front). To the best of our knowledge, our work is the first to derive systematic ICD reprogramming attacks designed to maximize therapy disruption while minimizing detection. To evaluate our technique, we employ an extensive dataset of synthetic EGMs (cardiac signals), each generated with a prescribed arrhythmia, allowing us to synthesize attacks tailored to the victim's cardiac condition. Our approach readily generalizes to unseen signals, representing the unknown EGM of the victim patient. △ Less

Submitted 9 October, 2018; originally announced October 2018.

arXiv:1807.09901 [pdf, other]

doi 10.1007/978-3-030-01090-4_25

Neural State Classification for Hybrid Systems

Authors: Dung Phan, Nicola Paoletti, Timothy Zhang, Radu Grosu, Scott A. Smolka, Scott D. Stoller

Abstract: We introduce the State Classification Problem (SCP) for hybrid systems, and present Neural State Classification (NSC) as an efficient solution technique. SCP generalizes the model checking problem as it entails classifying each state $s$ of a hybrid automaton as either positive or negative, depending on whether or not $s$ satisfies a given time-bounded reachability specification. This is an intere… ▽ More We introduce the State Classification Problem (SCP) for hybrid systems, and present Neural State Classification (NSC) as an efficient solution technique. SCP generalizes the model checking problem as it entails classifying each state $s$ of a hybrid automaton as either positive or negative, depending on whether or not $s$ satisfies a given time-bounded reachability specification. This is an interesting problem in its own right, which NSC solves using machine-learning techniques, Deep Neural Networks in particular. State classifiers produced by NSC tend to be very efficient (run in constant time and space), but may be subject to classification errors. To quantify and mitigate such errors, our approach comprises: i) techniques for certifying, with statistical guarantees, that an NSC classifier meets given accuracy levels; ii) tuning techniques, including a novel technique based on adversarial sampling, that can virtually eliminate false negatives (positive states classified as negative), thereby making the classifier more conservative. We have applied NSC to six nonlinear hybrid system benchmarks, achieving an accuracy of 99.25% to 99.98%, and a false-negative rate of 0.0033 to 0, which we further reduced to 0.0015 to 0 after tuning the classifier. We believe that this level of accuracy is acceptable in many practical applications, and that these results demonstrate the promise of the NSC approach. △ Less

Submitted 25 July, 2018; originally announced July 2018.

Comments: ATVA2018 extended version

arXiv:1712.01935 [pdf, other]

How to Learn a Model Checker

Authors: Dung Phan, Radu Grosu, Nicola Paoletti, Scott A. Smolka, Scott D. Stoller

Abstract: We show how machine-learning techniques, particularly neural networks, offer a very effective and highly efficient solution to the approximate model-checking problem for continuous and hybrid systems, a solution where the general-purpose model checker is replaced by a model-specific classifier trained by sampling model trajectories. To the best of our knowledge, we are the first to establish this… ▽ More We show how machine-learning techniques, particularly neural networks, offer a very effective and highly efficient solution to the approximate model-checking problem for continuous and hybrid systems, a solution where the general-purpose model checker is replaced by a model-specific classifier trained by sampling model trajectories. To the best of our knowledge, we are the first to establish this link from machine learning to model checking. Our method comprises a pipeline of analysis techniques for estimating and obtaining statistical guarantees on the classifier's prediction performance, as well as tuning techniques to improve such performance. Our experimental evaluation considers the time-bounded reachability problem for three well-established benchmarks in the hybrid systems community. On these examples, we achieve an accuracy of 99.82% to 100% and a false-negative rate (incorrectly predicting that unsafe states are not reachable from a given state) of 0.0007 to 0. We believe that this level of accuracy is acceptable in many practical applications and we show how the approximate model checker can be made more conservative by tuning the classifier through further training and selection of the classification threshold. △ Less

Submitted 5 December, 2017; originally announced December 2017.

Comments: 16 pages, 13 figures, short version submitted to HSCC2018

arXiv:1710.10013 [pdf, other]

Declarative vs Rule-based Control for Flocking Dynamics

Authors: Usama Mehmood, Nicola Paoletti, Dung Phan, Radu Grosu, Shan Lin, Scott D. Stoller, Ashish Tiwari, Junxing Yang, Scott A. Smolka

Abstract: The popularity of rule-based flocking models, such as Reynolds' classic flocking model, raises the question of whether more declarative flocking models are possible. This question is motivated by the observation that declarative models are generally simpler and easier to design, understand, and analyze than operational models. We introduce a very simple control law for flocking based on a cost fun… ▽ More The popularity of rule-based flocking models, such as Reynolds' classic flocking model, raises the question of whether more declarative flocking models are possible. This question is motivated by the observation that declarative models are generally simpler and easier to design, understand, and analyze than operational models. We introduce a very simple control law for flocking based on a cost function capturing cohesion (agents want to stay together) and separation (agents do not want to get too close). We refer to it as {\textit declarative flocking} (DF). We use model-predictive control (MPC) to define controllers for DF in centralized and distributed settings. A thorough performance comparison of our declarative flocking with Reynolds' model, and with more recent flocking models that use MPC with a cost function based on lattice structures, demonstrate that DF-MPC yields the best cohesion and least fragmentation, and maintains a surprisingly good level of geometric regularity while still producing natural flock shapes similar to those produced by Reynolds' model. We also show that DF-MPC has high resilience to sensor noise. △ Less

Submitted 27 October, 2017; originally announced October 2017.

Comments: 7 Pages

arXiv:1707.05229 [pdf, other]

Automated Synthesis of Safe and Robust PID Controllers for Stochastic Hybrid Systems

Authors: Fedor Shmarov, Nicola Paoletti, Ezio Bartocci, Shan Lin, Scott A. Smolka, Paolo Zuliani

Abstract: We present a new method for the automated synthesis of safe and robust Proportional-Integral-Derivative (PID) controllers for stochastic hybrid systems. Despite their widespread use in industry, no automated method currently exists for deriving a PID controller (or any other type of controller, for that matter) with safety and performance guarantees for such a general class of systems. In particul… ▽ More We present a new method for the automated synthesis of safe and robust Proportional-Integral-Derivative (PID) controllers for stochastic hybrid systems. Despite their widespread use in industry, no automated method currently exists for deriving a PID controller (or any other type of controller, for that matter) with safety and performance guarantees for such a general class of systems. In particular, we consider hybrid systems with nonlinear dynamics (Lipschitz-continuous ordinary differential equations) and random parameters, and we synthesize PID controllers such that the resulting closed-loop systems satisfy safety and performance constraints given as probabilistic bounded reachability properties. Our technique leverages SMT solvers over the reals and nonlinear differential equations to provide formal guarantees that the synthesized controllers satisfy such properties. These controllers are also robust by design since they minimize the probability of reaching an unsafe state in the presence of random disturbances. We apply our approach to the problem of insulin regulation for type 1 diabetes, synthesizing controllers with robust responses to large random meal disturbances, thereby enabling them to maintain blood glucose levels within healthy, safe ranges. △ Less

Submitted 7 September, 2017; v1 submitted 17 July, 2017; originally announced July 2017.

Comments: Extended version of paper accepted at the 13th Haifa Verification Conference

arXiv:1404.0698 [pdf, other]

Adaptability Checking in Multi-Level Complex Systems

Authors: Emanuela Merelli, Nicola Paoletti, Luca Tesei

Abstract: A hierarchical model for multi-level adaptive systems is built on two basic levels: a lower behavioural level B accounting for the actual behaviour of the system and an upper structural level S describing the adaptation dynamics of the system. The behavioural level is modelled as a state machine and the structural level as a higher-order system whose states have associated logical formulas (constr… ▽ More A hierarchical model for multi-level adaptive systems is built on two basic levels: a lower behavioural level B accounting for the actual behaviour of the system and an upper structural level S describing the adaptation dynamics of the system. The behavioural level is modelled as a state machine and the structural level as a higher-order system whose states have associated logical formulas (constraints) over observables of the behavioural level. S is used to capture the global and stable features of B, by a defining set of allowed behaviours. The adaptation semantics is such that the upper S level imposes constraints on the lower B level, which has to adapt whenever it no longer can satisfy them. In this context, we introduce weak and strong adaptabil- ity, i.e. the ability of a system to adapt for some evolution paths or for all possible evolutions, respectively. We provide a relational characterisation for these two notions and we show that adaptability checking, i.e. deciding if a system is weak or strong adaptable, can be reduced to a CTL model checking problem. We apply the model and the theoretical results to the case study of motion control of autonomous transport vehicles. △ Less

Submitted 2 April, 2014; originally announced April 2014.

Comments: 57 page, 10 figures, research papaer, submitted

arXiv:1209.1628 [pdf, other]

doi 10.4204/EPTCS.91.8

A multi-level model for self-adaptive systems

Authors: Emanuela Merelli, Nicola Paoletti, Luca Tesei

Abstract: This work introduces a general multi-level model for self-adaptive systems. A self-adaptive system is seen as composed by two levels: the lower level describing the actual behaviour of the system and the upper level accounting for the dynamically changing environmental constraints on the system. In order to keep our description as general as possible, the lower level is modelled as a state machine… ▽ More This work introduces a general multi-level model for self-adaptive systems. A self-adaptive system is seen as composed by two levels: the lower level describing the actual behaviour of the system and the upper level accounting for the dynamically changing environmental constraints on the system. In order to keep our description as general as possible, the lower level is modelled as a state machine and the upper level as a second-order state machine whose states have associated formulas over observable variables of the lower level. Thus, each state of the second-order machine identifies the set of lower-level states satisfying the constraints. Adaptation is triggered when a second-order transition is performed; this means that the current system no longer can satisfy the current high-level constraints and, thus, it has to adapt its behaviour by reaching a state that meets the new constraints. The semantics of the multi-level system is given by a flattened transition system that can be statically checked in order to prove the correctness of the adaptation model. To this aim we formalize two concepts of weak and strong adaptability providing both a relational and a logical characterization. We report that this work gives a formal computational characterization of multi-level self-adaptive systems, evidencing the important role that (theoretical) computer science could play in the emerging science of complex systems. △ Less

Submitted 14 May, 2013; v1 submitted 6 September, 2012; originally announced September 2012.

Comments: In Proceedings of FOCLASA 2012, arXiv:1208.4327

ACM Class: F.1.1; D.2; F.3.1

Journal ref: EPTCS 91, 2012, pp. 112-126

arXiv:1208.3858 [pdf, other]

doi 10.4204/EPTCS.92.11

Disease processes as hybrid dynamical systems

Authors: Pietro Liò, Emanuela Merelli, Nicola Paoletti

Abstract: We investigate the use of hybrid techniques in complex processes of infectious diseases. Since predictive disease models in biomedicine require a multiscale approach for understanding the molecule-cell-tissue-organ-body interactions, heterogeneous methodologies are often employed for describing the different biological scales. Hybrid models provide effective means for complex disease modelling whe… ▽ More We investigate the use of hybrid techniques in complex processes of infectious diseases. Since predictive disease models in biomedicine require a multiscale approach for understanding the molecule-cell-tissue-organ-body interactions, heterogeneous methodologies are often employed for describing the different biological scales. Hybrid models provide effective means for complex disease modelling where the action and dosage of a drug or a therapy could be meaningfully investigated: the infection dynamics can be classically described in a continuous fashion, while the scheduling of multiple treatment discretely. We define an algebraic language for specifying general disease processes and multiple treatments, from which a semantics in terms of hybrid dynamical system can be derived. Then, the application of control-theoretic tools is proposed in order to compute the optimal scheduling of multiple therapies. The potentialities of our approach are shown in the case study of the SIR epidemic model and we discuss its applicability on osteomyelitis, a bacterial infection affecting the bone remodelling system in a specific and multiscale manner. We report that formal languages are helpful in giving a general homogeneous formulation for the different scales involved in a multiscale disease process; and that the combination of hybrid modelling and control theory provides solid grounds for computational medicine. △ Less

Submitted 19 August, 2012; originally announced August 2012.

Comments: In Proceedings HSB 2012, arXiv:1208.3151

ACM Class: D.3.1; J.3; I.2.8

Journal ref: EPTCS 92, 2012, pp. 152-166

arXiv:1109.1368 [pdf, other]

doi 10.4204/EPTCS.67.8

Multiple verification in computational modeling of bone pathologies

Authors: Pietro Liò, Emanuela Merelli, Nicola Paoletti

Abstract: We introduce a model checking approach to diagnose the emerging of bone pathologies. The implementation of a new model of bone remodeling in PRISM has led to an interesting characterization of osteoporosis as a defective bone remodeling dynamics with respect to other bone pathologies. Our approach allows to derive three types of model checking-based diagnostic estimators. The first diagnostic meas… ▽ More We introduce a model checking approach to diagnose the emerging of bone pathologies. The implementation of a new model of bone remodeling in PRISM has led to an interesting characterization of osteoporosis as a defective bone remodeling dynamics with respect to other bone pathologies. Our approach allows to derive three types of model checking-based diagnostic estimators. The first diagnostic measure focuses on the level of bone mineral density, which is currently used in medical practice. In addition, we have introduced a novel diagnostic estimator which uses the full patient clinical record, here simulated using the modeling framework. This estimator detects rapid (months) negative changes in bone mineral density. Independently of the actual bone mineral density, when the decrease occurs rapidly it is important to alarm the patient and monitor him/her more closely to detect insurgence of other bone co-morbidities. A third estimator takes into account the variance of the bone density, which could address the investigation of metabolic syndromes, diabetes and cancer. Our implementation could make use of different logical combinations of these statistical estimators and could incorporate other biomarkers for other systemic co-morbidities (for example diabetes and thalassemia). We are delighted to report that the combination of stochastic modeling with formal methods motivate new diagnostic framework for complex pathologies. In particular our approach takes into consideration important properties of biosystems such as multiscale and self-adaptiveness. The multi-diagnosis could be further expanded, inching towards the complexity of human diseases. Finally, we briefly introduce self-adaptiveness in formal methods which is a key property in the regulative mechanisms of biological systems and well known in other mathematical and engineering areas. △ Less

Submitted 7 September, 2011; originally announced September 2011.

Comments: In Proceedings CompMod 2011, arXiv:1109.1044

ACM Class: D.2.4; I.6; J.3

Journal ref: EPTCS 67, 2011, pp. 82-96

Showing 1–27 of 27 results for author: Paoletti, N