-
State Matching and Multiple References in Adaptive Active Automata Learning
Authors:
Loes Kruger,
Sebastian Junges,
Jurriaan Rot
Abstract:
Active automata learning (AAL) is a method to infer state machines by interacting with black-box systems. Adaptive AAL aims to reduce the sample complexity of AAL by incorporating domain specific knowledge in the form of (similar) reference models. Such reference models appear naturally when learning multiple versions or variants of a software system. In this paper, we present state matching, whic…
▽ More
Active automata learning (AAL) is a method to infer state machines by interacting with black-box systems. Adaptive AAL aims to reduce the sample complexity of AAL by incorporating domain specific knowledge in the form of (similar) reference models. Such reference models appear naturally when learning multiple versions or variants of a software system. In this paper, we present state matching, which allows flexible use of the structure of these reference models by the learner. State matching is the main ingredient of adaptive L#, a novel framework for adaptive learning, built on top of L#. Our empirical evaluation shows that adaptive L# improves the state of the art by up to two orders of magnitude.
△ Less
Submitted 28 June, 2024;
originally announced June 2024.
-
Learning-Based Verification of Stochastic Dynamical Systems with Neural Network Policies
Authors:
Thom Badings,
Wietze Koops,
Sebastian Junges,
Nils Jansen
Abstract:
We consider the verification of neural network policies for reach-avoid control tasks in stochastic dynamical systems. We use a verification procedure that trains another neural network, which acts as a certificate proving that the policy satisfies the task. For reach-avoid tasks, it suffices to show that this certificate network is a reach-avoid supermartingale (RASM). As our main contribution, w…
▽ More
We consider the verification of neural network policies for reach-avoid control tasks in stochastic dynamical systems. We use a verification procedure that trains another neural network, which acts as a certificate proving that the policy satisfies the task. For reach-avoid tasks, it suffices to show that this certificate network is a reach-avoid supermartingale (RASM). As our main contribution, we significantly accelerate algorithmic approaches for verifying that a neural network is indeed a RASM. The main bottleneck of these approaches is the discretization of the state space of the dynamical system. The following two key contributions allow us to use a coarser discretization than existing approaches. First, we present a novel and fast method to compute tight upper bounds on Lipschitz constants of neural networks based on weighted norms. We further improve these bounds on Lipschitz constants based on the characteristics of the certificate network. Second, we integrate an efficient local refinement scheme that dynamically refines the state space discretization where necessary. Our empirical evaluation shows the effectiveness of our approach for verifying neural network policies in several benchmarks and trained with different reinforcement learning algorithms.
△ Less
Submitted 2 June, 2024;
originally announced June 2024.
-
Tools at the Frontiers of Quantitative Verification
Authors:
Roman Andriushchenko,
Alexander Bork,
Carlos E. Budde,
Milan Češka,
Kush Grover,
Ernst Moritz Hahn,
Arnd Hartmanns,
Bryant Israelsen,
Nils Jansen,
Joshua Jeppson,
Sebastian Junges,
Maximilian A. Köhl,
Bettina Könighofer,
Jan Křetínský,
Tobias Meggendorfer,
David Parker,
Stefan Pranger,
Tim Quatmann,
Enno Ruijters,
Landon Taylor,
Matthias Volk,
Maximilian Weininger,
Zhen Zhang
Abstract:
The analysis of formal models that include quantitative aspects such as timing or probabilistic choices is performed by quantitative verification tools. Broad and mature tool support is available for computing basic properties such as expected rewards on basic models such as Markov chains. Previous editions of QComp, the comparison of tools for the analysis of quantitative formal models, focused o…
▽ More
The analysis of formal models that include quantitative aspects such as timing or probabilistic choices is performed by quantitative verification tools. Broad and mature tool support is available for computing basic properties such as expected rewards on basic models such as Markov chains. Previous editions of QComp, the comparison of tools for the analysis of quantitative formal models, focused on this setting. Many application scenarios, however, require more advanced property types such as LTL and parameter synthesis queries as well as advanced models like stochastic games and partially observable MDPs. For these, tool support is in its infancy today. This paper presents the outcomes of QComp 2023: a survey of the state of the art in quantitative verification tool support for advanced property types and models. With tools ranging from first research prototypes to well-supported integrations into established toolsets, this report highlights today's active areas and tomorrow's challenges in tool-focused research for quantitative verification.
△ Less
Submitted 22 May, 2024;
originally announced May 2024.
-
Compositional Value Iteration with Pareto Caching
Authors:
Kazuki Watanabe,
Marck van der Vegt,
Sebastian Junges,
Ichiro Hasuo
Abstract:
The de-facto standard approach in MDP verification is based on value iteration (VI). We propose compositional VI, a framework for model checking compositional MDPs, that addresses efficiency while maintaining soundness. Concretely, compositional MDPs naturally arise from the combination of individual components, and their structure can be expressed using, e.g., string diagrams. Towards efficiency,…
▽ More
The de-facto standard approach in MDP verification is based on value iteration (VI). We propose compositional VI, a framework for model checking compositional MDPs, that addresses efficiency while maintaining soundness. Concretely, compositional MDPs naturally arise from the combination of individual components, and their structure can be expressed using, e.g., string diagrams. Towards efficiency, we observe that compositional VI repeatedly verifies individual components. We propose a technique called Pareto caching that allows to reuse verification results, even for previously unseen queries. Towards soundness, we present two stop** criteria: one generalizes the optimistic value iteration paradigm and the other uses Pareto caches in conjunction with recent baseline algorithms. Our experimental evaluations shows the promise of the novel algorithm and its variations, and identifies challenges for future work.
△ Less
Submitted 16 May, 2024;
originally announced May 2024.
-
Approximate Dec-POMDP Solving Using Multi-Agent A*
Authors:
Wietze Koops,
Sebastian Junges,
Nils Jansen
Abstract:
We present an A*-based algorithm to compute policies for finite-horizon Dec-POMDPs. Our goal is to sacrifice optimality in favor of scalability for larger horizons. The main ingredients of our approach are (1) using clustered sliding window memory, (2) pruning the A* search tree, and (3) using novel A* heuristics. Our experiments show competitive performance to the state-of-the-art. Moreover, for…
▽ More
We present an A*-based algorithm to compute policies for finite-horizon Dec-POMDPs. Our goal is to sacrifice optimality in favor of scalability for larger horizons. The main ingredients of our approach are (1) using clustered sliding window memory, (2) pruning the A* search tree, and (3) using novel A* heuristics. Our experiments show competitive performance to the state-of-the-art. Moreover, for multiple benchmarks, we achieve superior performance. In addition, we provide an A* algorithm that finds upper bounds for the optimum, tailored towards problems with long horizons. The main ingredient is a new heuristic that periodically reveals the state, thereby limiting the number of reachable beliefs. Our experiments demonstrate the efficacy and scalability of the approach.
△ Less
Submitted 9 May, 2024;
originally announced May 2024.
-
Imprecise Probabilities Meet Partial Observability: Game Semantics for Robust POMDPs
Authors:
Eline M. Bovy,
Marnix Suilen,
Sebastian Junges,
Nils Jansen
Abstract:
Partially observable Markov decision processes (POMDPs) rely on the key assumption that probability distributions are precisely known. Robust POMDPs (RPOMDPs) alleviate this concern by defining imprecise probabilities, referred to as uncertainty sets. While robust MDPs have been studied extensively, work on RPOMDPs is limited and primarily focuses on algorithmic solution methods. We expand the the…
▽ More
Partially observable Markov decision processes (POMDPs) rely on the key assumption that probability distributions are precisely known. Robust POMDPs (RPOMDPs) alleviate this concern by defining imprecise probabilities, referred to as uncertainty sets. While robust MDPs have been studied extensively, work on RPOMDPs is limited and primarily focuses on algorithmic solution methods. We expand the theoretical understanding of RPOMDPs by showing that 1) different assumptions on the uncertainty sets affect optimal policies and values; 2) RPOMDPs have a partially observable stochastic game (POSG) semantic; and 3) the same RPOMDP with different assumptions leads to semantically different POSGs and, thus, different policies and values. These novel semantics for RPOMDPS give access to results for the widely studied POSG model; concretely, we show the existence of a Nash equilibrium. Finally, we classify the existing RPOMDP literature using our semantics, clarifying under which uncertainty assumptions these existing works operate.
△ Less
Submitted 8 May, 2024;
originally announced May 2024.
-
Small Test Suites for Active Automata Learning
Authors:
Loes Kruger,
Sebastian Junges,
Jurriaan Rot
Abstract:
A bottleneck in modern active automata learning is to test whether a hypothesized Mealy machine correctly describes the system under learning. The search space for possible counterexamples is given by so-called test suites, consisting of input sequences that have to be checked to decide whether a counterexample exists. This paper shows that significantly smaller test suites suffice under reasonabl…
▽ More
A bottleneck in modern active automata learning is to test whether a hypothesized Mealy machine correctly describes the system under learning. The search space for possible counterexamples is given by so-called test suites, consisting of input sequences that have to be checked to decide whether a counterexample exists. This paper shows that significantly smaller test suites suffice under reasonable assumptions on the structure of the black box. These smaller test suites help to refute false hypotheses during active automata learning, even when the assumptions do not hold. We combine multiple test suites using a multi-armed bandit setup that adaptively selects a test suite. An extensive empirical evaluation shows the efficacy of our approach. For small to medium-sized models, the performance gain is limited. However, the approach allows learning models from large, industrial case studies that were beyond the reach of known methods.
△ Less
Submitted 23 January, 2024;
originally announced January 2024.
-
Pareto Curves for Compositionally Model Checking String Diagrams of MDPs
Authors:
Kazuki Watanabe,
Marck van der Vegt,
Ichiro Hasuo,
Jurriaan Rot,
Sebastian Junges
Abstract:
Computing schedulers that optimize reachability probabilities in MDPs is a standard verification task. To address scalability concerns, we focus on MDPs that are compositionally described in a high-level description formalism. In particular, this paper considers string diagrams, which specify an algebraic, sequential composition of subMDPs. Towards their compositional verification, the key challen…
▽ More
Computing schedulers that optimize reachability probabilities in MDPs is a standard verification task. To address scalability concerns, we focus on MDPs that are compositionally described in a high-level description formalism. In particular, this paper considers string diagrams, which specify an algebraic, sequential composition of subMDPs. Towards their compositional verification, the key challenge is to locally optimize schedulers on subMDPs without considering their context in the string diagram. This paper proposes to consider the schedulers in a subMDP which form a Pareto curve on a combination of local objectives. While considering all such schedulers is intractable, it gives rise to a highly efficient sound approximation algorithm. The prototype on top of the model checker Storm demonstrates the scalability of this approach.
△ Less
Submitted 16 January, 2024;
originally announced January 2024.
-
CTMCs with Imprecisely Timed Observations
Authors:
Thom Badings,
Matthias Volk,
Sebastian Junges,
Marielle Stoelinga,
Nils Jansen
Abstract:
Labeled continuous-time Markov chains (CTMCs) describe processes subject to random timing and partial observability. In applications such as runtime monitoring, we must incorporate past observations. The timing of these observations matters but may be uncertain. Thus, we consider a setting in which we are given a sequence of imprecisely timed labels called the evidence. The problem is to compute r…
▽ More
Labeled continuous-time Markov chains (CTMCs) describe processes subject to random timing and partial observability. In applications such as runtime monitoring, we must incorporate past observations. The timing of these observations matters but may be uncertain. Thus, we consider a setting in which we are given a sequence of imprecisely timed labels called the evidence. The problem is to compute reachability probabilities, which we condition on this evidence. Our key contribution is a method that solves this problem by unfolding the CTMC states over all possible timings for the evidence. We formalize this unfolding as a Markov decision process (MDP) in which each timing for the evidence is reflected by a scheduler. This MDP has infinitely many states and actions in general, making a direct analysis infeasible. Thus, we abstract the continuous MDP into a finite interval MDP (iMDP) and develop an iterative refinement scheme to upper-bound conditional probabilities in the CTMC. We show the feasibility of our method on several numerical benchmarks and discuss key challenges to further enhance the performance.
△ Less
Submitted 29 January, 2024; v1 submitted 12 January, 2024;
originally announced January 2024.
-
Factored Online Planning in Many-Agent POMDPs
Authors:
Maris F. L. Galesloot,
Thiago D. Simão,
Sebastian Junges,
Nils Jansen
Abstract:
In centralized multi-agent systems, often modeled as multi-agent partially observable Markov decision processes (MPOMDPs), the action and observation spaces grow exponentially with the number of agents, making the value and belief estimation of single-agent online planning ineffective. Prior work partially tackles value estimation by exploiting the inherent structure of multi-agent settings via so…
▽ More
In centralized multi-agent systems, often modeled as multi-agent partially observable Markov decision processes (MPOMDPs), the action and observation spaces grow exponentially with the number of agents, making the value and belief estimation of single-agent online planning ineffective. Prior work partially tackles value estimation by exploiting the inherent structure of multi-agent settings via so-called coordination graphs. Additionally, belief estimation methods have been improved by incorporating the likelihood of observations into the approximation. However, the challenges of value estimation and belief estimation have only been tackled individually, which prevents existing methods from scaling to settings with many agents. Therefore, we address these challenges simultaneously. First, we introduce weighted particle filtering to a sample-based online planner for MPOMDPs. Second, we present a scalable approximation of the belief. Third, we bring an approach that exploits the typical locality of agent interactions to novel online planning algorithms for MPOMDPs operating on a so-called sparse particle filter tree. Our experimental evaluation against several state-of-the-art baselines shows that our methods (1) are competitive in settings with only a few agents and (2) improve over the baselines in the presence of many agents.
△ Less
Submitted 23 February, 2024; v1 submitted 18 December, 2023;
originally announced December 2023.
-
Learning Formal Specifications from Membership and Preference Queries
Authors:
Ameesh Shah,
Marcell Vazquez-Chanlatte,
Sebastian Junges,
Sanjit A. Seshia
Abstract:
Active learning is a well-studied approach to learning formal specifications, such as automata. In this work, we extend active specification learning by proposing a novel framework that strategically requests a combination of membership labels and pair-wise preferences, a popular alternative to membership labels. The combination of pair-wise preferences and membership labels allows for a more flex…
▽ More
Active learning is a well-studied approach to learning formal specifications, such as automata. In this work, we extend active specification learning by proposing a novel framework that strategically requests a combination of membership labels and pair-wise preferences, a popular alternative to membership labels. The combination of pair-wise preferences and membership labels allows for a more flexible approach to active specification learning, which previously relied on membership labels only. We instantiate our framework in two different domains, demonstrating the generality of our approach. Our results suggest that learning from both modalities allows us to robustly and conveniently identify specifications via membership and preferences.
△ Less
Submitted 19 July, 2023;
originally announced July 2023.
-
Search and Explore: Symbiotic Policy Synthesis in POMDPs
Authors:
Roman Andriushchenko,
Alexander Bork,
Milan Češka,
Sebastian Junges,
Joost-Pieter Katoen,
Filip Macák
Abstract:
This paper marries two state-of-the-art controller synthesis methods for partially observable Markov decision processes (POMDPs), a prominent model in sequential decision making under uncertainty. A central issue is to find a POMDP controller - that solely decides based on the observations seen so far - to achieve a total expected reward objective. As finding optimal controllers is undecidable, we…
▽ More
This paper marries two state-of-the-art controller synthesis methods for partially observable Markov decision processes (POMDPs), a prominent model in sequential decision making under uncertainty. A central issue is to find a POMDP controller - that solely decides based on the observations seen so far - to achieve a total expected reward objective. As finding optimal controllers is undecidable, we concentrate on synthesising good finite-state controllers (FSCs). We do so by tightly integrating two modern, orthogonal methods for POMDP controller synthesis: a belief-based and an inductive approach. The former method obtains an FSC from a finite fragment of the so-called belief MDP, an MDP that keeps track of the probabilities of equally observable POMDP states. The latter is an inductive search technique over a set of FSCs, e.g., controllers with a fixed memory size. The key result of this paper is a symbiotic anytime algorithm that tightly integrates both approaches such that each profits from the controllers constructed by the other. Experimental results indicate a substantial improvement in the value of the controllers while significantly reducing the synthesis time and memory footprint.
△ Less
Submitted 29 May, 2023; v1 submitted 23 May, 2023;
originally announced May 2023.
-
Efficient Sensitivity Analysis for Parametric Robust Markov Chains
Authors:
Thom Badings,
Sebastian Junges,
Ahmadreza Marandi,
Ufuk Topcu,
Nils Jansen
Abstract:
We provide a novel method for sensitivity analysis of parametric robust Markov chains. These models incorporate parameters and sets of probability distributions to alleviate the often unrealistic assumption that precise probabilities are available. We measure sensitivity in terms of partial derivatives with respect to the uncertain transition probabilities regarding measures such as the expected r…
▽ More
We provide a novel method for sensitivity analysis of parametric robust Markov chains. These models incorporate parameters and sets of probability distributions to alleviate the often unrealistic assumption that precise probabilities are available. We measure sensitivity in terms of partial derivatives with respect to the uncertain transition probabilities regarding measures such as the expected reward. As our main contribution, we present an efficient method to compute these partial derivatives. To scale our approach to models with thousands of parameters, we present an extension of this method that selects the subset of $k$ parameters with the highest partial derivative. Our methods are based on linear programming and differentiating these programs around a given value for the parameters. The experiments show the applicability of our approach on models with over a million states and thousands of parameters. Moreover, we embed the results within an iterative learning scheme that profits from having access to a dedicated sensitivity analysis.
△ Less
Submitted 1 May, 2023;
originally announced May 2023.
-
Robust Almost-Sure Reachability in Multi-Environment MDPs
Authors:
Marck van der Vegt,
Nils Jansen,
Sebastian Junges
Abstract:
Multiple-environment MDPs (MEMDPs) capture finite sets of MDPs that share the states but differ in the transition dynamics. These models form a proper subclass of partially observable MDPs (POMDPs). We consider the synthesis of policies that robustly satisfy an almost-sure reachability property in MEMDPs, that is, one policy that satisfies a property for all environments. For POMDPs, deciding the…
▽ More
Multiple-environment MDPs (MEMDPs) capture finite sets of MDPs that share the states but differ in the transition dynamics. These models form a proper subclass of partially observable MDPs (POMDPs). We consider the synthesis of policies that robustly satisfy an almost-sure reachability property in MEMDPs, that is, one policy that satisfies a property for all environments. For POMDPs, deciding the existence of robust policies is an EXPTIME-complete problem. In this paper, we show that this problem is PSPACE-complete for MEMDPs, while the policies in general require exponential memory. We exploit the theoretical results to develop and implement an algorithm that shows promising results in synthesizing robust policies for various benchmarks.
△ Less
Submitted 26 January, 2023;
originally announced January 2023.
-
A Practitioner's Guide to MDP Model Checking Algorithms
Authors:
Arnd Hartmanns,
Sebastian Junges,
Tim Quatmann,
Maximilian Weininger
Abstract:
Model checking undiscounted reachability and expected-reward properties on Markov decision processes (MDPs) is key for the verification of systems that act under uncertainty. Popular algorithms are policy iteration and variants of value iteration; in tool competitions, most participants rely on the latter. These algorithms generally need worst-case exponential time. However the problem can equally…
▽ More
Model checking undiscounted reachability and expected-reward properties on Markov decision processes (MDPs) is key for the verification of systems that act under uncertainty. Popular algorithms are policy iteration and variants of value iteration; in tool competitions, most participants rely on the latter. These algorithms generally need worst-case exponential time. However the problem can equally be formulated as a linear program, solvable in polynomial time. In this paper, we give a detailed overview of today's state-of-the-art algorithms for MDP model checking with a focus on performance and correctness. We highlight their fundamental differences, and describe various optimisations and implementation variants. We experimentally compare floating-point and exact-arithmetic implementations of all algorithms on three benchmark sets using two probabilistic model checkers. Our results show that (optimistic) value iteration is a sensible default, but other algorithms are preferable in specific settings. This paper thereby provides a guide for MDP verification practitioners -- tool builders and users alike.
△ Less
Submitted 24 January, 2023;
originally announced January 2023.
-
COOL-MC: A Comprehensive Tool for Reinforcement Learning and Model Checking
Authors:
Dennis Gross,
Nils Jansen,
Sebastian Junges,
Guillermo A. Perez
Abstract:
This paper presents COOL-MC, a tool that integrates state-of-the-art reinforcement learning (RL) and model checking. Specifically, the tool builds upon the OpenAI gym and the probabilistic model checker Storm. COOL-MC provides the following features: (1) a simulator to train RL policies in the OpenAI gym for Markov decision processes (MDPs) that are defined as input for Storm, (2) a new model buil…
▽ More
This paper presents COOL-MC, a tool that integrates state-of-the-art reinforcement learning (RL) and model checking. Specifically, the tool builds upon the OpenAI gym and the probabilistic model checker Storm. COOL-MC provides the following features: (1) a simulator to train RL policies in the OpenAI gym for Markov decision processes (MDPs) that are defined as input for Storm, (2) a new model builder for Storm, which uses callback functions to verify (neural network) RL policies, (3) formal abstractions that relate models and policies specified in OpenAI gym or Storm, and (4) algorithms to obtain bounds on the performance of so-called permissive policies. We describe the components and architecture of COOL-MC and demonstrate its features on multiple benchmark environments.
△ Less
Submitted 15 September, 2022;
originally announced September 2022.
-
Parameter Synthesis in Markov Models: A Gentle Survey
Authors:
Nils Jansen,
Sebastian Junges,
Joost-Pieter Katoen
Abstract:
This paper surveys the analysis of parametric Markov models whose transitions are labelled with functions over a finite set of parameters. These models are symbolic representations of uncountable many concrete probabilistic models, each obtained by instantiating the parameters. We consider various analysis problems for a given logical specification $\varphi$: do all parameter instantiations within…
▽ More
This paper surveys the analysis of parametric Markov models whose transitions are labelled with functions over a finite set of parameters. These models are symbolic representations of uncountable many concrete probabilistic models, each obtained by instantiating the parameters. We consider various analysis problems for a given logical specification $\varphi$: do all parameter instantiations within a given region of parameter values satisfy $\varphi$?, which instantiations satisfy $\varphi$ and which ones do not?, and how can all such instantiations be characterised, either exactly or approximately? We address theoretical complexity results and describe the main ideas underlying state-of-the-art algorithms that established an impressive leap over the last decade enabling the fully automated analysis of models with millions of states and thousands of parameters.
△ Less
Submitted 14 July, 2022;
originally announced July 2022.
-
Abstraction-Refinement for Hierarchical Probabilistic Models
Authors:
Sebastian Junges,
Matthijs T. J. Spaan
Abstract:
Markov decision processes are a ubiquitous formalism for modelling systems with non-deterministic and probabilistic behavior. Verification of these models is subject to the famous state space explosion problem. We alleviate this problem by exploiting a hierarchical structure with repetitive parts. This structure not only occurs naturally in robotics, but also in probabilistic programs describing,…
▽ More
Markov decision processes are a ubiquitous formalism for modelling systems with non-deterministic and probabilistic behavior. Verification of these models is subject to the famous state space explosion problem. We alleviate this problem by exploiting a hierarchical structure with repetitive parts. This structure not only occurs naturally in robotics, but also in probabilistic programs describing, e.g., network protocols. Such programs often repeatedly call a subroutine with similar behavior. In this paper, we focus on a local case, in which the subroutines have a limited effect on the overall system state. The key ideas to accelerate analysis of such programs are (1) to treat the behavior of the subroutine as uncertain and only remove this uncertainty by a detailed analysis if needed, and (2) to abstract similar subroutines into a parametric template, and then analyse this template. These two ideas are embedded into an abstraction-refinement loop that analyses hierarchical MDPs. A prototypical implementation shows the efficacy of the approach.
△ Less
Submitted 6 June, 2022;
originally announced June 2022.
-
Sampling-Based Verification of CTMCs with Uncertain Rates
Authors:
Thom S. Badings,
Nils Jansen,
Sebastian Junges,
Marielle Stoelinga,
Matthias Volk
Abstract:
We employ uncertain parametric CTMCs with parametric transition rates and a prior on the parameter values. The prior encodes uncertainty about the actual transition rates, while the parameters allow dependencies between transition rates. Sampling the parameter values from the prior distribution then yields a standard CTMC, for which we may compute relevant reachability probabilities. We provide a…
▽ More
We employ uncertain parametric CTMCs with parametric transition rates and a prior on the parameter values. The prior encodes uncertainty about the actual transition rates, while the parameters allow dependencies between transition rates. Sampling the parameter values from the prior distribution then yields a standard CTMC, for which we may compute relevant reachability probabilities. We provide a principled solution, based on a technique called scenario-optimization, to the following problem: From a finite set of parameter samples and a user-specified confidence level, compute prediction regions on the reachability probabilities. The prediction regions should (with high probability) contain the reachability probabilities of a CTMC induced by any additional sample. To boost the scalability of the approach, we employ standard abstraction techniques and adapt our methodology to support approximate reachability probabilities. Experiments with various well-known benchmarks show the applicability of the approach.
△ Less
Submitted 21 June, 2022; v1 submitted 17 May, 2022;
originally announced May 2022.
-
Probabilistic Program Verification via Inductive Synthesis of Inductive Invariants
Authors:
Kevin Batz,
Mingshuai Chen,
Sebastian Junges,
Benjamin Lucien Kaminski,
Joost-Pieter Katoen,
Christoph Matheja
Abstract:
Essential tasks for the verification of probabilistic programs include bounding expected outcomes and proving termination in finite expected runtime. We contribute a simple yet effective inductive synthesis approach for proving such quantitative reachability properties by generating inductive invariants on source-code level. Our implementation shows promise: It finds invariants for (in)finite-stat…
▽ More
Essential tasks for the verification of probabilistic programs include bounding expected outcomes and proving termination in finite expected runtime. We contribute a simple yet effective inductive synthesis approach for proving such quantitative reachability properties by generating inductive invariants on source-code level. Our implementation shows promise: It finds invariants for (in)finite-state programs, can beat state-of-the-art probabilistic model checkers, and is competitive with modern tools dedicated to invariant synthesis and expected runtime reasoning.
△ Less
Submitted 8 February, 2023; v1 submitted 12 May, 2022;
originally announced May 2022.
-
Safe Reinforcement Learning via Shielding under Partial Observability
Authors:
Steven Carr,
Nils Jansen,
Sebastian Junges,
Ufuk Topcu
Abstract:
Safe exploration is a common problem in reinforcement learning (RL) that aims to prevent agents from making disastrous decisions while exploring their environment. A family of approaches to this problem assume domain knowledge in the form of a (partial) model of this environment to decide upon the safety of an action. A so-called shield forces the RL agent to select only safe actions. However, for…
▽ More
Safe exploration is a common problem in reinforcement learning (RL) that aims to prevent agents from making disastrous decisions while exploring their environment. A family of approaches to this problem assume domain knowledge in the form of a (partial) model of this environment to decide upon the safety of an action. A so-called shield forces the RL agent to select only safe actions. However, for adoption in various applications, one must look beyond enforcing safety and also ensure the applicability of RL with good performance. We extend the applicability of shields via tight integration with state-of-the-art deep RL, and provide an extensive, empirical study in challenging, sparse-reward environments under partial observability. We show that a carefully integrated shield ensures safety and can improve the convergence rate and final performance of RL agents. We furthermore show that a shield can be used to bootstrap state-of-the-art RL agents: they remain safe after initial learning in a shielded setting, allowing us to disable a potentially too conservative shield eventually.
△ Less
Submitted 22 August, 2022; v1 submitted 1 April, 2022;
originally announced April 2022.
-
Inductive Synthesis of Finite-State Controllers for POMDPs
Authors:
Roman Andriushchenko,
Milan Ceska,
Sebastian Junges,
Joost-Pieter Katoen
Abstract:
We present a novel learning framework to obtain finite-state controllers (FSCs) for partially observable Markov decision processes and illustrate its applicability for indefinite-horizon specifications. Our framework builds on oracle-guided inductive synthesis to explore a design space compactly representing available FSCs. The inductive synthesis approach consists of two stages: The outer stage d…
▽ More
We present a novel learning framework to obtain finite-state controllers (FSCs) for partially observable Markov decision processes and illustrate its applicability for indefinite-horizon specifications. Our framework builds on oracle-guided inductive synthesis to explore a design space compactly representing available FSCs. The inductive synthesis approach consists of two stages: The outer stage determines the design space, i.e., the set of FSC candidates, while the inner stage efficiently explores the design space. This framework is easily generalisable and shows promising results when compared to existing approaches. Experiments indicate that our technique is (i) competitive to state-of-the-art belief-based approaches for indefinite-horizon properties, (ii) yields smaller FSCs than existing methods for several models, and (iii) naturally treats multi-objective specifications.
△ Less
Submitted 22 March, 2022; v1 submitted 21 March, 2022;
originally announced March 2022.
-
Scenario-Based Verification of Uncertain Parametric MDPs
Authors:
Thom Badings,
Murat Cubuktepe,
Nils Jansen,
Sebastian Junges,
Joost-Pieter Katoen,
Ufuk Topcu
Abstract:
We consider parametric Markov decision processes (pMDPs) that are augmented with unknown probability distributions over parameter values. The problem is to compute the probability to satisfy a temporal logic specification with any concrete MDP that corresponds to a sample from these distributions. As solving this problem precisely is infeasible, we resort to sampling techniques that exploit the so…
▽ More
We consider parametric Markov decision processes (pMDPs) that are augmented with unknown probability distributions over parameter values. The problem is to compute the probability to satisfy a temporal logic specification with any concrete MDP that corresponds to a sample from these distributions. As solving this problem precisely is infeasible, we resort to sampling techniques that exploit the so-called scenario approach. Based on a finite number of samples of the parameters, the proposed method yields high-confidence bounds on the probability of satisfying the specification. The number of samples required to obtain a high confidence on these bounds is independent of the number of states and the number of random parameters. Experiments on a large set of benchmarks show that several thousand samples suffice to obtain tight and high-confidence lower and upper bounds on the satisfaction probability.
△ Less
Submitted 21 June, 2022; v1 submitted 24 December, 2021;
originally announced December 2021.
-
Querying Labelled Data with Scenario Programs for Sim-to-Real Validation
Authors:
Edward Kim,
Jay Shenoy,
Sebastian Junges,
Daniel Fremont,
Alberto Sangiovanni-Vincentelli,
Sanjit Seshia
Abstract:
Simulation-based testing of autonomous vehicles (AVs) has become an essential complement to road testing to ensure safety. Consequently, substantial research has focused on searching for failure scenarios in simulation. However, a fundamental question remains: are AV failure scenarios identified in simulation meaningful in reality, i.e., are they reproducible on the real system? Due to the sim-to-…
▽ More
Simulation-based testing of autonomous vehicles (AVs) has become an essential complement to road testing to ensure safety. Consequently, substantial research has focused on searching for failure scenarios in simulation. However, a fundamental question remains: are AV failure scenarios identified in simulation meaningful in reality, i.e., are they reproducible on the real system? Due to the sim-to-real gap arising from discrepancies between simulated and real sensor data, a failure scenario identified in simulation can be either a spurious artifact of the synthetic sensor data or an actual failure that persists with real sensor data. An approach to validate simulated failure scenarios is to identify instances of the scenario in a corpus of real data, and check if the failure persists on the real data. To this end, we propose a formal definition of what it means for a labelled data item to match an abstract scenario, encoded as a scenario program using the SCENIC probabilistic programming language. Using this definition, we develop a querying algorithm which, given a scenario program and a labelled dataset, finds the subset of data matching the scenario. Experiments demonstrate that our algorithm is accurate and efficient on a variety of realistic traffic scenarios, and scales to a reasonable number of agents.
△ Less
Submitted 30 November, 2021;
originally announced December 2021.
-
Gradient-Descent for Randomized Controllers under Partial Observability
Authors:
Linus Heck,
Jip Spel,
Sebastian Junges,
Joshua Moerman,
Joost-Pieter Katoen
Abstract:
Randomization is a powerful technique to create robust controllers, in particular in partially observable settings. The degrees of randomization have a significant impact on the system performance, yet they are intricate to get right. The use of synthesis algorithms for parametric Markov chains (pMCs) is a promising direction to support the design process of such controllers. This paper shows how…
▽ More
Randomization is a powerful technique to create robust controllers, in particular in partially observable settings. The degrees of randomization have a significant impact on the system performance, yet they are intricate to get right. The use of synthesis algorithms for parametric Markov chains (pMCs) is a promising direction to support the design process of such controllers. This paper shows how to define and evaluate gradients of pMCs. Furthermore, it investigates varieties of gradient descent techniques from the machine learning community to synthesize the probabilities in a pMC. The resulting method scales to significantly larger pMCs than before and empirically outperforms the state-of-the-art, often by at least one order of magnitude.
△ Less
Submitted 8 November, 2021;
originally announced November 2021.
-
Convex Optimization for Parameter Synthesis in MDPs
Authors:
Murat Cubuktepe,
Nils Jansen,
Sebastian Junges,
Joost-Pieter Katoen,
Ufuk Topcu
Abstract:
Probabilistic model checking aims to prove whether a Markov decision process (MDP) satisfies a temporal logic specification. The underlying methods rely on an often unrealistic assumption that the MDP is precisely known. Consequently, parametric MDPs (pMDPs) extend MDPs with transition probabilities that are functions over unspecified parameters. The parameter synthesis problem is to compute an in…
▽ More
Probabilistic model checking aims to prove whether a Markov decision process (MDP) satisfies a temporal logic specification. The underlying methods rely on an often unrealistic assumption that the MDP is precisely known. Consequently, parametric MDPs (pMDPs) extend MDPs with transition probabilities that are functions over unspecified parameters. The parameter synthesis problem is to compute an instantiation of these unspecified parameters such that the resulting MDP satisfies the temporal logic specification. We formulate the parameter synthesis problem as a quadratically constrained quadratic program (QCQP), which is nonconvex and is NP-hard to solve in general. We develop two approaches that iteratively obtain locally optimal solutions. The first approach exploits the so-called convex-concave procedure (CCP), and the second approach utilizes a sequential convex programming (SCP) method. The techniques improve the runtime and scalability by multiple orders of magnitude compared to black-box CCP and SCP by merging ideas from convex optimization and probabilistic model checking. We demonstrate the approaches on a satellite collision avoidance problem with hundreds of thousands of states and tens of thousands of parameters and their scalability on a wide range of commonly used benchmarks.
△ Less
Submitted 30 June, 2021;
originally announced July 2021.
-
Model Repair Revamped: On the Automated Synthesis of Markov Chains
Authors:
Milan Ceska,
Christian Dehnert,
Nils Jansen,
Sebastian Junges,
Joost-Pieter Katoen
Abstract:
This paper outlines two approaches|based on counterexample-guided abstraction refinement (CEGAR) and counterexample-guided inductive synthesis (CEGIS), respectively to the automated synthesis of finite-state probabilistic models and programs. Our CEGAR approach iteratively partitions the design space starting from an abstraction of this space and refines this by a light-weight analysis of verifica…
▽ More
This paper outlines two approaches|based on counterexample-guided abstraction refinement (CEGAR) and counterexample-guided inductive synthesis (CEGIS), respectively to the automated synthesis of finite-state probabilistic models and programs. Our CEGAR approach iteratively partitions the design space starting from an abstraction of this space and refines this by a light-weight analysis of verification results. The CEGIS technique exploits critical subsystems as counterexamples to prune all programs behaving incorrectly on that input. We show the applicability of these synthesis techniques to sketching of probabilistic programs, controller synthesis of POMDPs, and software product lines.
△ Less
Submitted 27 May, 2021;
originally announced May 2021.
-
Model Checking Finite-Horizon Markov Chains with Probabilistic Inference
Authors:
Steven Holtzen,
Sebastian Junges,
Marcell Vazquez-Chanlatte,
Todd Millstein,
Sanjit A. Seshia,
Guy Van Den Broeck
Abstract:
We revisit the symbolic verification of Markov chains with respect to finite horizon reachability properties. The prevalent approach iteratively computes step-bounded state reachability probabilities. By contrast, recent advances in probabilistic inference suggest symbolically representing all horizon-length paths through the Markov chain. We ask whether this perspective advances the state-of-the-…
▽ More
We revisit the symbolic verification of Markov chains with respect to finite horizon reachability properties. The prevalent approach iteratively computes step-bounded state reachability probabilities. By contrast, recent advances in probabilistic inference suggest symbolically representing all horizon-length paths through the Markov chain. We ask whether this perspective advances the state-of-the-art in probabilistic model checking. First, we formally describe both approaches in order to highlight their key differences. Then, using these insights we develop Rubicon, a tool that transpiles Prism models to the probabilistic inference tool Dice. Finally, we demonstrate better scalability compared to probabilistic model checkers on selected benchmarks. All together, our results suggest that probabilistic inference is a valuable addition to the probabilistic model checking portfolio -- with Rubicon as a first step towards integrating both perspectives.
△ Less
Submitted 30 June, 2021; v1 submitted 26 May, 2021;
originally announced May 2021.
-
Runtime Monitoring for Markov Decision Processes
Authors:
Sebastian Junges,
Hazem Torfah,
Sanjit A. Seshia
Abstract:
We investigate the problem of monitoring partially observable systems with nondeterministic and probabilistic dynamics. In such systems, every state may be associated with a risk, e.g., the probability of an imminent crash. During runtime, we obtain partial information about the system state in form of observations. The monitor uses this information to estimate the risk of the (unobservable) curre…
▽ More
We investigate the problem of monitoring partially observable systems with nondeterministic and probabilistic dynamics. In such systems, every state may be associated with a risk, e.g., the probability of an imminent crash. During runtime, we obtain partial information about the system state in form of observations. The monitor uses this information to estimate the risk of the (unobservable) current system state. Our results are threefold. First, we show that extensions of state estimation approaches do not scale due the combination of nondeterminism and probabilities. While convex hull algorithms improve the practical runtime, they do not prevent an exponential memory blowup. Second, we present a tractable algorithm based on model checking conditional reachability probabilities. Third, we provide prototypical implementations and manifest the applicability of our algorithms to a range of benchmarks. The results highlight the possibilities and boundaries of our novel algorithms.
△ Less
Submitted 26 May, 2021;
originally announced May 2021.
-
Entropy-Guided Control Improvisation
Authors:
Marcell Vazquez-Chanlatte,
Sebastian Junges,
Daniel J. Fremont,
Sanjit Seshia
Abstract:
High level declarative constraints provide a powerful (and popular) way to define and construct control policies; however, most synthesis algorithms do not support specifying the degree of randomness (unpredictability) of the resulting controller. In many contexts, e.g., patrolling, testing, behavior prediction,and planning on idealized models, predictable or biased controllers are undesirable. To…
▽ More
High level declarative constraints provide a powerful (and popular) way to define and construct control policies; however, most synthesis algorithms do not support specifying the degree of randomness (unpredictability) of the resulting controller. In many contexts, e.g., patrolling, testing, behavior prediction,and planning on idealized models, predictable or biased controllers are undesirable. To address these concerns, we introduce the \emph{Entropic Reactive Control Improvisation} (ERCI) framework and algorithm which supports synthesizing control policies for stochastic games that are declaratively specified by (i) a \emph{hard constraint} specifying what must occur, (ii) a \emph{soft constraint} specifying what typically occurs, and (iii) a \emph{randomization constraint} specifying the unpredictability and variety of the controller, as quantified using causal entropy. This framework, extends the state of the art by supporting arbitrary combinations of adversarial and probabilistic uncertainty in the environment. ERCI enables a flexible modeling formalism which we argue, theoretically and empirically, remains tractable.
△ Less
Submitted 28 June, 2021; v1 submitted 9 March, 2021;
originally announced March 2021.
-
Inductive Synthesis for Probabilistic Programs Reaches New Horizons
Authors:
Roman Andriushchenko,
Milan Ceska,
Sebastian Junges,
Joost-Pieter Katoen
Abstract:
This paper presents a novel method for the automated synthesis of probabilistic programs. The starting point is a program sketch representing a finite family of finite-state Markov chains with related but distinct topologies, and a PCTL specification. The method builds on a novel inductive oracle that greedily generates counter-examples (CEs) for violating programs and uses them to prune the famil…
▽ More
This paper presents a novel method for the automated synthesis of probabilistic programs. The starting point is a program sketch representing a finite family of finite-state Markov chains with related but distinct topologies, and a PCTL specification. The method builds on a novel inductive oracle that greedily generates counter-examples (CEs) for violating programs and uses them to prune the family. These CEs leverage the semantics of the family in the form of bounds on its best- and worst-case behaviour provided by a deductive oracle using an MDP abstraction. The method further monitors the performance of the synthesis and adaptively switches between the inductive and deductive reasoning. Our experiments demonstrate that the novel CE construction provides a significantly faster and more effective pruning strategy leading to acceleration of the synthesis process on a wide range of benchmarks. For challenging problems, such as the synthesis of decentralized partially-observable controllers, we reduce the run-time from a day to minutes.
△ Less
Submitted 29 January, 2021;
originally announced January 2021.
-
The Complexity of Reachability in Parametric Markov Decision Processes
Authors:
Sebastian Junges,
Joost-Pieter Katoen,
Guillermo A. Pérez,
Tobias Winkler
Abstract:
This article presents the complexity of reachability decision problems for parametric Markov decision processes (pMDPs), an extension to Markov decision processes (MDPs) where transitions probabilities are described by polynomials over a finite set of parameters. In particular, we study the complexity of finding values for these parameters such that the induced MDP satisfies some maximal or minima…
▽ More
This article presents the complexity of reachability decision problems for parametric Markov decision processes (pMDPs), an extension to Markov decision processes (MDPs) where transitions probabilities are described by polynomials over a finite set of parameters. In particular, we study the complexity of finding values for these parameters such that the induced MDP satisfies some maximal or minimal reachability probability constraints. We discuss different variants depending on the comparison operator in the constraints and the domain of the parameter values. We improve all known lower bounds for this problem, and notably provide ETR-completeness results for distinct variants of this problem.
△ Less
Submitted 28 September, 2020;
originally announced September 2020.
-
Robust Finite-State Controllers for Uncertain POMDPs
Authors:
Murat Cubuktepe,
Nils Jansen,
Sebastian Junges,
Ahmadreza Marandi,
Marnix Suilen,
Ufuk Topcu
Abstract:
Uncertain partially observable Markov decision processes (uPOMDPs) allow the probabilistic transition and observation functions of standard POMDPs to belong to a so-called uncertainty set. Such uncertainty, referred to as epistemic uncertainty, captures uncountable sets of probability distributions caused by, for instance, a lack of data available. We develop an algorithm to compute finite-memory…
▽ More
Uncertain partially observable Markov decision processes (uPOMDPs) allow the probabilistic transition and observation functions of standard POMDPs to belong to a so-called uncertainty set. Such uncertainty, referred to as epistemic uncertainty, captures uncountable sets of probability distributions caused by, for instance, a lack of data available. We develop an algorithm to compute finite-memory policies for uPOMDPs that robustly satisfy specifications against any admissible distribution. In general, computing such policies is theoretically and practically intractable. We provide an efficient solution to this problem in four steps. (1) We state the underlying problem as a nonconvex optimization problem with infinitely many constraints. (2) A dedicated dualization scheme yields a dual problem that is still nonconvex but has finitely many constraints. (3) We linearize this dual problem and (4) solve the resulting finite linear program to obtain locally optimal solutions to the original problem. The resulting problem formulation is exponentially smaller than those resulting from existing methods. We demonstrate the applicability of our algorithm using large instances of an aircraft collision-avoidance scenario and a novel spacecraft motion planning case study.
△ Less
Submitted 4 March, 2021; v1 submitted 23 September, 2020;
originally announced September 2020.
-
Verification of indefinite-horizon POMDPs
Authors:
Alexander Bork,
Sebastian Junges,
Joost-Pieter Katoen,
Tim Quatmann
Abstract:
The verification problem in MDPs asks whether, for any policy resolving the nondeterminism, the probability that something bad happens is bounded by some given threshold. This verification problem is often overly pessimistic, as the policies it considers may depend on the complete system state. This paper considers the verification problem for partially observable MDPs, in which the policies make…
▽ More
The verification problem in MDPs asks whether, for any policy resolving the nondeterminism, the probability that something bad happens is bounded by some given threshold. This verification problem is often overly pessimistic, as the policies it considers may depend on the complete system state. This paper considers the verification problem for partially observable MDPs, in which the policies make their decisions based on (the history of) the observations emitted by the system. We present an abstraction-refinement framework extending previous instantiations of the Lovejoy-approach. Our experiments show that this framework significantly improves the scalability of the approach.
△ Less
Submitted 30 June, 2020;
originally announced July 2020.
-
Enforcing Almost-Sure Reachability in POMDPs
Authors:
Sebastian Junges,
Nils Jansen,
Sanjit A. Seshia
Abstract:
Partially-Observable Markov Decision Processes (POMDPs) are a well-known stochastic model for sequential decision making under limited information. We consider the EXPTIME-hard problem of synthesising policies that almost-surely reach some goal state without ever visiting a bad state. In particular, we are interested in computing the winning region, that is, the set of system configurations from w…
▽ More
Partially-Observable Markov Decision Processes (POMDPs) are a well-known stochastic model for sequential decision making under limited information. We consider the EXPTIME-hard problem of synthesising policies that almost-surely reach some goal state without ever visiting a bad state. In particular, we are interested in computing the winning region, that is, the set of system configurations from which a policy exists that satisfies the reachability specification. A direct application of such a winning region is the safe exploration of POMDPs by, for instance, restricting the behavior of a reinforcement learning agent to the region. We present two algorithms: A novel SAT-based iterative approach and a decision-diagram based alternative. The empirical evaluation demonstrates the feasibility and efficacy of the approaches.
△ Less
Submitted 18 March, 2021; v1 submitted 30 June, 2020;
originally announced July 2020.
-
PrIC3: Property Directed Reachability for MDPs
Authors:
Kevin Batz,
Sebastian Junges,
Benjamin Lucien Kaminski,
Joost-Pieter Katoen,
Christoph Matheja,
Philipp Schröer
Abstract:
IC3 has been a leap forward in symbolic model checking. This paper proposes PrIC3 (pronounced pricy-three), a conservative extension of IC3 to symbolic model checking of MDPs. Our main focus is to develop the theory underlying PrIC3. Alongside, we present a first implementation of PrIC3 including the key ingredients from IC3 such as generalization, repushing, and propagation.
IC3 has been a leap forward in symbolic model checking. This paper proposes PrIC3 (pronounced pricy-three), a conservative extension of IC3 to symbolic model checking of MDPs. Our main focus is to develop the theory underlying PrIC3. Alongside, we present a first implementation of PrIC3 including the key ingredients from IC3 such as generalization, repushing, and propagation.
△ Less
Submitted 18 May, 2020; v1 submitted 30 April, 2020;
originally announced April 2020.
-
The Probabilistic Model Checker Storm
Authors:
Christian Hensel,
Sebastian Junges,
Joost-Pieter Katoen,
Tim Quatmann,
Matthias Volk
Abstract:
We present the probabilistic model checker Storm. Storm supports the analysis of discrete- and continuous-time variants of both Markov chains and Markov decision processes. Storm has three major distinguishing features. It supports multiple input languages for Markov models, including the JANI and PRISM modeling languages, dynamic fault trees, generalized stochastic Petri nets, and the probabilist…
▽ More
We present the probabilistic model checker Storm. Storm supports the analysis of discrete- and continuous-time variants of both Markov chains and Markov decision processes. Storm has three major distinguishing features. It supports multiple input languages for Markov models, including the JANI and PRISM modeling languages, dynamic fault trees, generalized stochastic Petri nets, and the probabilistic guarded command language. It has a modular set-up in which solvers and symbolic engines can easily be exchanged. Its Python API allows for rapid prototy** by encapsulating Storm's fast and scalable algorithms. This paper reports on the main features of Storm and explains how to effectively use them. A description is provided of the main distinguishing functionalities of Storm. Finally, an empirical evaluation of different configurations of Storm on the QComp 2019 benchmark set is presented.
△ Less
Submitted 6 October, 2020; v1 submitted 17 February, 2020;
originally announced February 2020.
-
Scenario-Based Verification of Uncertain MDPs
Authors:
Murat Cubuktepe,
Nils Jansen,
Sebastian Junges,
Joost-Pieter Katoen,
Ufuk Topcu
Abstract:
We consider Markov decision processes (MDPs) in which the transition probabilities and rewards belong to an uncertainty set parametrized by a collection of random variables. The probability distributions for these random parameters are unknown. The problem is to compute the probability to satisfy a temporal logic specification within any MDP that corresponds to a sample from these unknown distribu…
▽ More
We consider Markov decision processes (MDPs) in which the transition probabilities and rewards belong to an uncertainty set parametrized by a collection of random variables. The probability distributions for these random parameters are unknown. The problem is to compute the probability to satisfy a temporal logic specification within any MDP that corresponds to a sample from these unknown distributions. In general, this problem is undecidable, and we resort to techniques from so-called scenario optimization. Based on a finite number of samples of the uncertain parameters, each of which induces an MDP, the proposed method estimates the probability of satisfying the specification by solving a finite-dimensional convex optimization problem. The number of samples required to obtain a high confidence on this estimate is independent from the number of states and the number of random parameters. Experiments on a large set of benchmarks show that a few thousand samples suffice to obtain high-quality confidence bounds with a high probability.
△ Less
Submitted 25 February, 2020; v1 submitted 24 December, 2019;
originally announced December 2019.
-
Are Parametric Markov Chains Monotonic?
Authors:
Jip Spel,
Sebastian Junges,
Joost-Pieter Katoen
Abstract:
This paper presents a simple algorithm to check whether reachability probabilities in parametric Markov chains are monotonic in (some of) the parameters. The idea is to construct - only using the graph structure of the Markov chain and local transition probabilities - a pre-order on the states. Our algorithm cheaply checks a sufficient condition for monotonicity. Experiments show that monotonicity…
▽ More
This paper presents a simple algorithm to check whether reachability probabilities in parametric Markov chains are monotonic in (some of) the parameters. The idea is to construct - only using the graph structure of the Markov chain and local transition probabilities - a pre-order on the states. Our algorithm cheaply checks a sufficient condition for monotonicity. Experiments show that monotonicity in several benchmarks is automatically detected, and monotonicity can speed up parameter synthesis up to orders of magnitude faster than a symbolic baseline.
△ Less
Submitted 19 July, 2019;
originally announced July 2019.
-
Counterexample-Driven Synthesis for Probabilistic Program Sketches
Authors:
Milan Češka,
Christian Hensel,
Sebastian Junges,
Joost-Pieter Katoen
Abstract:
Probabilistic programs are key to deal with uncertainty in e.g. controller synthesis. They are typically small but intricate. Their development is complex and error prone requiring quantitative reasoning over a myriad of alternative designs. To mitigate this complexity, we adopt counterexample-guided inductive synthesis (CEGIS) to automatically synthesise finite-state probabilistic programs. Our a…
▽ More
Probabilistic programs are key to deal with uncertainty in e.g. controller synthesis. They are typically small but intricate. Their development is complex and error prone requiring quantitative reasoning over a myriad of alternative designs. To mitigate this complexity, we adopt counterexample-guided inductive synthesis (CEGIS) to automatically synthesise finite-state probabilistic programs. Our approach leverages efficient model checking, modern SMT solving, and counterexample generation at program level. Experiments on practically relevant case studies show that design spaces with millions of candidate designs can be fully explored using a few thousand verification queries.
△ Less
Submitted 28 April, 2019;
originally announced April 2019.
-
On the Complexity of Reachability in Parametric Markov Decision Processes
Authors:
Tobias Winkler,
Sebastian Junges,
Guillermo A. Pérez,
Joost-Pieter Katoen
Abstract:
This paper studies parametric Markov decision processes (pMDPs), an extension to Markov decision processes (MDPs) where transitions probabilities are described by polynomials over a finite set of parameters. Fixing values for all parameters yields MDPs. In particular, this paper studies the complexity of finding values for these parameters such that the induced MDP satisfies some reachability cons…
▽ More
This paper studies parametric Markov decision processes (pMDPs), an extension to Markov decision processes (MDPs) where transitions probabilities are described by polynomials over a finite set of parameters. Fixing values for all parameters yields MDPs. In particular, this paper studies the complexity of finding values for these parameters such that the induced MDP satisfies some reachability constraints. We discuss different variants depending on the comparison operator in the constraints and the domain of the parameter values. We improve all known lower bounds for this problem, and notably provide ETR-completeness results for distinct variants of this problem. Furthermore, we provide insights in the functions describing the induced reachability probabilities, and how pMDPs generalise concurrent stochastic reachability games.
△ Less
Submitted 2 April, 2019;
originally announced April 2019.
-
Parameter Synthesis for Markov Models: Covering the Parameter Space
Authors:
Sebastian Junges,
Erika Ábrahám,
Christian Hensel,
Nils Jansen,
Joost-Pieter Katoen,
Tim Quatmann,
Matthias Volk
Abstract:
Markov chain analysis is a key technique in formal verification. A practical obstacle is that all probabilities in Markov models need to be known. However, system quantities such as failure rates or packet loss ratios, etc. are often not -- or only partially -- known. This motivates considering parametric models with transitions labeled with functions over parameters. Whereas traditional Markov ch…
▽ More
Markov chain analysis is a key technique in formal verification. A practical obstacle is that all probabilities in Markov models need to be known. However, system quantities such as failure rates or packet loss ratios, etc. are often not -- or only partially -- known. This motivates considering parametric models with transitions labeled with functions over parameters. Whereas traditional Markov chain analysis relies on a single, fixed set of probabilities, analysing parametric Markov models focuses on synthesising parameter values that establish a given safety or performance specification $\varphi$. Examples are: what component failure rates ensure the probability of a system breakdown to be below 0.00000001?, or which failure rates maximise the performance, for instance the throughput, of the system? This paper presents various analysis algorithms for parametric discrete-time Markov chains and Markov decision processes. We focus on three problems: (a) do all parameter values within a given region satisfy $\varphi$?, (b) which regions satisfy $\varphi$ and which ones do not?, and (c) an approximate version of (b) focusing on covering a large fraction of all possible parameter values. We give a detailed account of the various algorithms, present a software tool realising these techniques, and report on an extensive experimental evaluation on benchmarks that span a wide range of applications.
△ Less
Submitted 7 November, 2023; v1 submitted 16 March, 2019;
originally announced March 2019.
-
Safety Analysis for Vehicle Guidance Systems with Dynamic Fault Trees
Authors:
Majdi Ghadhab,
Sebastian Junges,
Joost-Pieter Katoen,
Matthias Kuntz,
Matthias Volk
Abstract:
This paper considers the design-phase safety analysis of vehicle guidance systems. The proposed approach constructs dynamic fault trees (DFTs) to model a variety of safety concepts and E/E architectures for drive automation. The fault trees can be used to evaluate various quantitative measures by means of model checking. The approach is accompanied by a large-scale evaluation: The resulting DFTs w…
▽ More
This paper considers the design-phase safety analysis of vehicle guidance systems. The proposed approach constructs dynamic fault trees (DFTs) to model a variety of safety concepts and E/E architectures for drive automation. The fault trees can be used to evaluate various quantitative measures by means of model checking. The approach is accompanied by a large-scale evaluation: The resulting DFTs with up to 300 elements constitute larger-than-before DFTs, yet the concepts and architectures can be evaluated in a matter of minutes.
△ Less
Submitted 13 March, 2019;
originally announced March 2019.
-
Shepherding Hordes of Markov Chains
Authors:
Milan Ceska,
Nils Jansen,
Sebastian Junges,
Joost-Pieter Katoen
Abstract:
This paper considers large families of Markov chains (MCs) that are defined over a set of parameters with finite discrete domains. Such families occur in software product lines, planning under partial observability, and sketching of probabilistic programs. Simple questions, like `does at least one family member satisfy a property?', are NP-hard. We tackle two problems: distinguish family members t…
▽ More
This paper considers large families of Markov chains (MCs) that are defined over a set of parameters with finite discrete domains. Such families occur in software product lines, planning under partial observability, and sketching of probabilistic programs. Simple questions, like `does at least one family member satisfy a property?', are NP-hard. We tackle two problems: distinguish family members that satisfy a given quantitative property from those that do not, and determine a family member that satisfies the property optimally, i.e., with the highest probability or reward. We show that combining two well-known techniques, MDP model checking and abstraction refinement, mitigates the computational complexity. Experiments on a broad set of benchmarks show that in many situations, our approach is able to handle families of millions of MCs, providing superior scalability compared to existing solutions.
△ Less
Submitted 26 March, 2019; v1 submitted 15 February, 2019;
originally announced February 2019.
-
The Partially Observable Games We Play for Cyber Deception
Authors:
Mohamadreza Ahmadi,
Murat Cubuktepe,
Nils Jansen,
Sebastian Junges,
Joost-Pieter Katoen,
Ufuk Topcu
Abstract:
Progressively intricate cyber infiltration mechanisms have made conventional means of defense, such as firewalls and malware detectors, incompetent. These sophisticated infiltration mechanisms can study the defender's behavior, identify security caveats, and modify their actions adaptively. To tackle these security challenges, cyber-infrastructures require active defense techniques that incorporat…
▽ More
Progressively intricate cyber infiltration mechanisms have made conventional means of defense, such as firewalls and malware detectors, incompetent. These sophisticated infiltration mechanisms can study the defender's behavior, identify security caveats, and modify their actions adaptively. To tackle these security challenges, cyber-infrastructures require active defense techniques that incorporate cyber deception, in which the defender (deceiver) implements a strategy to mislead the infiltrator. To this end, we use a two-player partially observable stochastic game (POSG) framework, wherein the deceiver has full observability over the states of the POSG, and the infiltrator has partial observability. Then, the deception problem is to compute a strategy for the deceiver that minimizes the expected cost of deception against all strategies of the infiltrator. We first show that the underlying problem is a robust mixed-integer linear program, which is intractable to solve in general. Towards a scalable approach, we compute optimal finite-memory strategies for the infiltrator by a reduction to a series of synthesis problems for parametric Markov decision processes. We use these infiltration strategies to find robust strategies for the deceiver using mixed-integer linear programming. We illustrate the performance of our technique on a POSG model for network security. Our experiments demonstrate that the proposed approach handles scenarios considerably larger than those of the state-of-the-art methods.
△ Less
Submitted 28 September, 2018;
originally announced October 2018.
-
Safe Reinforcement Learning via Probabilistic Shields
Authors:
Nils Jansen,
Bettina Könighofer,
Sebastian Junges,
Alexandru C. Serban,
Roderick Bloem
Abstract:
This paper targets the efficient construction of a safety shield for decision making in scenarios that incorporate uncertainty. Markov decision processes (MDPs) are prominent models to capture such planning problems. Reinforcement learning (RL) is a machine learning technique to determine near-optimal policies in MDPs that may be unknown prior to exploring the model. However, during exploration, R…
▽ More
This paper targets the efficient construction of a safety shield for decision making in scenarios that incorporate uncertainty. Markov decision processes (MDPs) are prominent models to capture such planning problems. Reinforcement learning (RL) is a machine learning technique to determine near-optimal policies in MDPs that may be unknown prior to exploring the model. However, during exploration, RL is prone to induce behavior that is undesirable or not allowed in safety- or mission-critical contexts. We introduce the concept of a probabilistic shield that enables decision-making to adhere to safety constraints with high probability. In a separation of concerns, we employ formal verification to efficiently compute the probabilities of critical decisions within a safety-relevant fragment of the MDP. We use these results to realize a shield that is applied to an RL algorithm which then optimizes the actual performance objective. We discuss tradeoffs between sufficient progress in exploration of the environment and ensuring safety. In our experiments, we demonstrate on the arcade game PAC-MAN and on a case study involving service robots that the learning efficiency increases as the learning needs orders of magnitude fewer episodes.
△ Less
Submitted 25 November, 2019; v1 submitted 16 July, 2018;
originally announced July 2018.
-
One Net Fits All: A unifying semantics of Dynamic Fault Trees using GSPNs
Authors:
Sebastian Junges,
Joost-Pieter Katoen,
Marielle Stoelinga,
Matthias Volk
Abstract:
Dynamic Fault Trees (DFTs) are a prominent model in reliability engineering. They are strictly more expressive than static fault trees, but this comes at a price: their interpretation is non-trivial and leaves quite some freedom. This paper presents a GSPN semantics for DFTs. This semantics is rather simple and compositional. The key feature is that this GSPN semantics unifies all existing DFT sem…
▽ More
Dynamic Fault Trees (DFTs) are a prominent model in reliability engineering. They are strictly more expressive than static fault trees, but this comes at a price: their interpretation is non-trivial and leaves quite some freedom. This paper presents a GSPN semantics for DFTs. This semantics is rather simple and compositional. The key feature is that this GSPN semantics unifies all existing DFT semantics from the literature. All semantic variants can be obtained by choosing appropriate priorities and treatment of non-determinism.
△ Less
Submitted 14 March, 2018;
originally announced March 2018.
-
Synthesis in pMDPs: A Tale of 1001 Parameters
Authors:
Murat Cubuktepe,
Nils Jansen,
Sebastian Junges,
Joost-Pieter Katoen,
Ufuk Topcu
Abstract:
This paper considers parametric Markov decision processes (pMDPs) whose transitions are equipped with affine functions over a finite set of parameters. The synthesis problem is to find a parameter valuation such that the instantiated pMDP satisfies a specification under all strategies. We show that this problem can be formulated as a quadratically-constrained quadratic program (QCQP) and is non-co…
▽ More
This paper considers parametric Markov decision processes (pMDPs) whose transitions are equipped with affine functions over a finite set of parameters. The synthesis problem is to find a parameter valuation such that the instantiated pMDP satisfies a specification under all strategies. We show that this problem can be formulated as a quadratically-constrained quadratic program (QCQP) and is non-convex in general. To deal with the NP-hardness of such problems, we exploit a convex-concave procedure (CCP) to iteratively obtain local optima. An appropriate interplay between CCP solvers and probabilistic model checkers creates a procedure --- realized in the open-source tool PROPhESY --- that solves the synthesis problem for models with thousands of parameters.
△ Less
Submitted 31 July, 2018; v1 submitted 5 March, 2018;
originally announced March 2018.
-
Permissive Finite-State Controllers of POMDPs using Parameter Synthesis
Authors:
Sebastian Junges,
Nils Jansen,
Ralf Wimmer,
Tim Quatmann,
Leonore Winterer,
Joost-Pieter Katoen,
Bernd Becker
Abstract:
We study finite-state controllers (FSCs) for partially observable Markov decision processes (POMDPs) that are provably correct with respect to given specifications. The key insight is that computing (randomised) FSCs on POMDPs is equivalent to - and computationally as hard as - synthesis for parametric Markov chains (pMCs). This correspondence allows to use tools for parameter synthesis in pMCs to…
▽ More
We study finite-state controllers (FSCs) for partially observable Markov decision processes (POMDPs) that are provably correct with respect to given specifications. The key insight is that computing (randomised) FSCs on POMDPs is equivalent to - and computationally as hard as - synthesis for parametric Markov chains (pMCs). This correspondence allows to use tools for parameter synthesis in pMCs to compute correct-by-construction FSCs on POMDPs for a variety of specifications. Our experimental evaluation shows comparable performance to well-known POMDP solvers.
△ Less
Submitted 17 July, 2018; v1 submitted 24 October, 2017;
originally announced October 2017.
-
Strategy Synthesis in POMDPs via Game-Based Abstractions
Authors:
Leonore Winterer,
Sebastian Junges,
Ralf Wimmer,
Nils Jansen,
Ufuk Topcu,
Joost-Pieter Katoen,
Bernd Becker
Abstract:
We study synthesis problems with constraints in partially observable Markov decision processes (POMDPs), where the objective is to compute a strategy for an agent that is guaranteed to satisfy certain safety and performance specifications. Verification and strategy synthesis for POMDPs are, however, computationally intractable in general. We alleviate this difficulty by focusing on planning applic…
▽ More
We study synthesis problems with constraints in partially observable Markov decision processes (POMDPs), where the objective is to compute a strategy for an agent that is guaranteed to satisfy certain safety and performance specifications. Verification and strategy synthesis for POMDPs are, however, computationally intractable in general. We alleviate this difficulty by focusing on planning applications and exploiting typical structural properties of such scenarios; for instance, we assume that the agent has the ability to observe its own position inside an environment. We propose an abstraction refinement framework which turns such a POMDP model into a (fully observable) probabilistic two-player game (PG). For the obtained PGs, efficient verification and synthesis tools allow to determine strategies with optimal safety and performance measures, which approximate optimal schedulers on the POMDP. If the approximation is too coarse to satisfy the given specifications, an refinement scheme improves the computed strategies. As a running example, we use planning problems where an agent moves inside an environment with randomly moving obstacles and restricted observability. We demonstrate that the proposed method advances the state of the art by solving problems several orders-of-magnitude larger than those that can be handled by existing POMDP solvers. Furthermore, this method gives guarantees on safety constraints, which is not supported by the majority of the existing solvers.
△ Less
Submitted 27 May, 2019; v1 submitted 14 August, 2017;
originally announced August 2017.