-
DFAMiner: Mining minimal separating DFAs from labelled samples
Authors:
Daniele Dell'Erba,
Yong Li,
Sven Schewe
Abstract:
We propose DFAMiner, a passive learning tool for learning minimal separating deterministic finite automata (DFA) from a set of labelled samples. Separating automata are an interesting class of automata that occurs generally in regular model checking and has raised interest in foundational questions of parity game solving. We first propose a simple and linear-time algorithm that incrementally const…
▽ More
We propose DFAMiner, a passive learning tool for learning minimal separating deterministic finite automata (DFA) from a set of labelled samples. Separating automata are an interesting class of automata that occurs generally in regular model checking and has raised interest in foundational questions of parity game solving. We first propose a simple and linear-time algorithm that incrementally constructs a three-valued DFA (3DFA) from a set of labelled samples given in the usual lexicographical order. This 3DFA has accepting and rejecting states as well as don't-care states, so that it can exactly recognise the labelled examples. We then apply our tool to mining a minimal separating DFA for the labelled samples by minimising the constructed automata via a reduction to solving SAT problems. Empirical evaluation shows that our tool outperforms current state-of-the-art tools significantly on standard benchmarks for learning minimal separating DFAs from samples. Progress in the efficient construction of separating DFAs can also lead to finding the lower bound of parity game solving, where we show that DFAMiner can create optimal separating automata for simple languages with up to 7 colours. Future improvements might offer inroads to better data structures.
△ Less
Submitted 29 May, 2024;
originally announced May 2024.
-
An Objective Improvement Approach to Solving Discounted Payoff Games
Authors:
Daniele Dell'Erba,
Arthur Dumas,
Sven Schewe
Abstract:
While discounted payoff games and classic games that reduce to them, like parity and mean-payoff games, are symmetric, their solutions are not. We have taken a fresh view on the properties that optimal solutions need to have, and devised a novel way to converge to them, which is entirely symmetric. We achieve this by building a constraint system that uses every edge to define an inequation, and up…
▽ More
While discounted payoff games and classic games that reduce to them, like parity and mean-payoff games, are symmetric, their solutions are not. We have taken a fresh view on the properties that optimal solutions need to have, and devised a novel way to converge to them, which is entirely symmetric. We achieve this by building a constraint system that uses every edge to define an inequation, and update the objective function by taking a single outgoing edge for each vertex into account. These edges loosely represent strategies of both players, where the objective function intuitively asks to make the inequation to these edges sharp, leading to an `error' or 0. For co-optimal strategies, and only for them, this can be achieved, and while we have not found them, we step-wise improve the error by improving the solution for a given objective function or by improving the objective function for a given solution. This also challenges the gospel that methods for solving payoff games are either based on strategy improvement or on value iteration.
△ Less
Submitted 4 April, 2024;
originally announced April 2024.
-
Omega-Regular Decision Processes
Authors:
Ernst Moritz Hahn,
Mateo Perez,
Sven Schewe,
Fabio Somenzi,
Ashutosh Trivedi,
Dominik Wojtczak
Abstract:
Regular decision processes (RDPs) are a subclass of non-Markovian decision processes where the transition and reward functions are guarded by some regular property of the past (a lookback). While RDPs enable intuitive and succinct representation of non-Markovian decision processes, their expressive power coincides with finite-state Markov decision processes (MDPs). We introduce omega-regular decis…
▽ More
Regular decision processes (RDPs) are a subclass of non-Markovian decision processes where the transition and reward functions are guarded by some regular property of the past (a lookback). While RDPs enable intuitive and succinct representation of non-Markovian decision processes, their expressive power coincides with finite-state Markov decision processes (MDPs). We introduce omega-regular decision processes (ODPs) where the non-Markovian aspect of the transition and reward functions are extended to an omega-regular lookahead over the system evolution. Semantically, these lookaheads can be considered as promises made by the decision maker or the learning agent about her future behavior. In particular, we assume that, if the promised lookaheads are not met, then the payoff to the decision maker is $\bot$ (least desirable payoff), overriding any rewards collected by the decision maker. We enable optimization and learning for ODPs under the discounted-reward objective by reducing them to lexicographic optimization and learning over finite MDPs. We present experimental results demonstrating the effectiveness of the proposed reduction.
△ Less
Submitted 13 December, 2023;
originally announced December 2023.
-
An Objective Improvement Approach to Solving Discounted Payoff Games
Authors:
Daniele Dell'Erba,
Arthur Dumas,
Sven Schewe
Abstract:
While discounted payoff games and classic games that reduce to them, like parity and mean-payoff games, are symmetric, their solutions are not. We have taken a fresh view on the constraints that optimal solutions need to satisfy, and devised a novel way to converge to them, which is entirely symmetric. It also challenges the gospel that methods for solving payoff games are either based on strategy…
▽ More
While discounted payoff games and classic games that reduce to them, like parity and mean-payoff games, are symmetric, their solutions are not. We have taken a fresh view on the constraints that optimal solutions need to satisfy, and devised a novel way to converge to them, which is entirely symmetric. It also challenges the gospel that methods for solving payoff games are either based on strategy improvement or on value iteration.
△ Less
Submitted 2 October, 2023;
originally announced October 2023.
-
Omega-Regular Reward Machines
Authors:
Ernst Moritz Hahn,
Mateo Perez,
Sven Schewe,
Fabio Somenzi,
Ashutosh Trivedi,
Dominik Wojtczak
Abstract:
Reinforcement learning (RL) is a powerful approach for training agents to perform tasks, but designing an appropriate reward mechanism is critical to its success. However, in many cases, the complexity of the learning objectives goes beyond the capabilities of the Markovian assumption, necessitating a more sophisticated reward mechanism. Reward machines and omega-regular languages are two formalis…
▽ More
Reinforcement learning (RL) is a powerful approach for training agents to perform tasks, but designing an appropriate reward mechanism is critical to its success. However, in many cases, the complexity of the learning objectives goes beyond the capabilities of the Markovian assumption, necessitating a more sophisticated reward mechanism. Reward machines and omega-regular languages are two formalisms used to express non-Markovian rewards for quantitative and qualitative objectives, respectively. This paper introduces omega-regular reward machines, which integrate reward machines with omega-regular languages to enable an expressive and effective reward mechanism for RL. We present a model-free RL algorithm to compute epsilon-optimal strategies against omega-egular reward machines and evaluate the effectiveness of the proposed algorithm through experiments.
△ Less
Submitted 14 August, 2023;
originally announced August 2023.
-
On the Succinctness of Good-for-MDPs Automata
Authors:
Sven Schewe,
Qiyi Tang
Abstract:
Good-for-MDPs and good-for-games automata are two recent classes of nondeterministic automata that reside between general nondeterministic and deterministic automata. Deterministic automata are good-for-games, and good-for-games automata are good-for-MDPs, but not vice versa. One of the question this raises is how these classes relate in terms of succinctness. Good-for-games automata are known to…
▽ More
Good-for-MDPs and good-for-games automata are two recent classes of nondeterministic automata that reside between general nondeterministic and deterministic automata. Deterministic automata are good-for-games, and good-for-games automata are good-for-MDPs, but not vice versa. One of the question this raises is how these classes relate in terms of succinctness. Good-for-games automata are known to be exponentially more succinct than deterministic automata, but the gap between good-for-MDPs and good-for-games automata as well as the gap between ordinary nondeterministic automata and those that are good-for-MDPs have been open. We establish that these gaps are exponential, and sharpen this result by showing that the latter gap remains exponential when restricting the nondeterministic automata to separating safety or unambiguous reachability automata.
△ Less
Submitted 21 July, 2023;
originally announced July 2023.
-
A novel family of finite automata for recognizing and learning $ω$-regular languages
Authors:
Yong Li,
Sven Schewe,
Qiyi Tang
Abstract:
Families of DFAs (FDFAs) have recently been introduced as a new representation of $ω$-regular languages. They target ultimately periodic words, with acceptors revolving around accepting some representation $u\cdot v^ω$. Three canonical FDFAs have been suggested, called periodic, syntactic, and recurrent. We propose a fourth one, limit FDFAs, which can be exponentially coarser than periodic FDFAs a…
▽ More
Families of DFAs (FDFAs) have recently been introduced as a new representation of $ω$-regular languages. They target ultimately periodic words, with acceptors revolving around accepting some representation $u\cdot v^ω$. Three canonical FDFAs have been suggested, called periodic, syntactic, and recurrent. We propose a fourth one, limit FDFAs, which can be exponentially coarser than periodic FDFAs and are more succinct than syntactic FDFAs, while they are incomparable (and dual to) recurrent FDFAs. We show that limit FDFAs can be easily used to check not only whether ω-languages are regular, but also whether they are accepted by deterministic Büchi automata. We also show that canonical forms can be left behind in applications: the limit and recurrent FDFAs can complement each other nicely, and it may be a good way forward to use a combination of both. Using this observation as a starting point, we explore making more efficient use of Myhill-Nerode's right congruences in aggressively increasing the number of don't-care cases in order to obtain smaller progress automata. In pursuit of this goal, we gain succinctness, but pay a high price by losing constructiveness.
△ Less
Submitted 14 July, 2023;
originally announced July 2023.
-
Singly Exponential Translation of Alternating Weak Büchi Automata to Unambiguous Büchi Automata
Authors:
Yong Li,
Sven Schewe,
Moshe Y. Vardi
Abstract:
We introduce a method for translating an alternating weak Büchi automaton (AWA), which corresponds to a Linear Dynamic Logic (LDL) formula, to an unambiguous Büchi automaton (UBA). Our translations generalise constructions for Linear Temporal Logic (LTL), a less expressive specification language than LDL. In classical constructions, LTL formulas are first translated to alternating \emph{very weak}…
▽ More
We introduce a method for translating an alternating weak Büchi automaton (AWA), which corresponds to a Linear Dynamic Logic (LDL) formula, to an unambiguous Büchi automaton (UBA). Our translations generalise constructions for Linear Temporal Logic (LTL), a less expressive specification language than LDL. In classical constructions, LTL formulas are first translated to alternating \emph{very weak} automata (AVAs) -- automata that have only singleton strongly connected components (SCCs); the AVAs are then handled by efficient disambiguation procedures. However, general AWAs can have larger SCCs, which complicates disambiguation. Currently, the only available disambiguation procedure has to go through an intermediate construction of nondeterministic Büchi automata (NBAs), which would incur an exponential blow-up of its own. We introduce a translation from \emph{general} AWAs to UBAs with a \emph{singly} exponential blow-up, which also immediately provides a singly exponential translation from LDL to UBAs. Interestingly, the complexity of our translation is \emph{smaller} than the best known disambiguation algorithm for NBAs (broadly $(0.53n)^n$ vs. $(0.76n)^n$), while the input of our construction can be exponentially more succinct.
△ Less
Submitted 17 May, 2023;
originally announced May 2023.
-
History-deterministic Timed Automata
Authors:
Sougata Bose,
Thomas A. Henzinger,
Karoliina Lehtinen,
Sven Schewe,
Patrick Totzke
Abstract:
We explore the notion of history-determinism in the context of timed automata (TA) over infinite timed words. History-deterministic (HD) automata are those in which nondeterminism can be resolved on the fly, based on the run constructed thus far. History-determinism is a robust property that admits different game-based characterisations, and HD specifications allow for game-based verification with…
▽ More
We explore the notion of history-determinism in the context of timed automata (TA) over infinite timed words. History-deterministic (HD) automata are those in which nondeterminism can be resolved on the fly, based on the run constructed thus far. History-determinism is a robust property that admits different game-based characterisations, and HD specifications allow for game-based verification without an expensive determinization step.
We show that the class of timed $ω$-languages recognised by HD timed automata strictly extends that of deterministic ones, and is strictly included in those recognised by fully non-deterministic TA.
For non-deterministic timed automata it is known that universality is already undecidable for safety/reachability TA. For history-deterministic TA with arbitrary parity acceptance, we show that timed universality, inclusion, and synthesis all remain decidable and are EXPTIME-complete.
For the subclass of TA with safety or reachability acceptance, one can decide (in EXPTIME) whether such an automaton is history-deterministic. If so, it can effectively determinized without introducing new automata states.
△ Less
Submitted 9 November, 2023; v1 submitted 6 April, 2023;
originally announced April 2023.
-
Natural Colors of Infinite Words
Authors:
Rüdiger Ehlers,
Sven Schewe
Abstract:
While finite automata have minimal DFAs as a simple and natural normal form, deterministic omega-automata do not currently have anything similar. One reason for this is that a normal form for omega-regular languages has to speak about more than acceptance - for example, to have a normal form for a parity language, it should relate every infinite word to some natural color for this language. This r…
▽ More
While finite automata have minimal DFAs as a simple and natural normal form, deterministic omega-automata do not currently have anything similar. One reason for this is that a normal form for omega-regular languages has to speak about more than acceptance - for example, to have a normal form for a parity language, it should relate every infinite word to some natural color for this language. This raises the question of whether or not a concept such as a natural color of an infinite word (for a given language) exists, and, if it does, how it relates back to automata.
We define the natural color of a word purely based on an omega-regular language, and show how this natural color can be traced back from any deterministic parity automaton after two cheap and simple automaton transformations. The resulting streamlined automaton does not necessarily accept every word with its natural color, but it has a 'co-run', which is like a run, but can once move to a language equivalent state, whose color is the natural color, and no co-run with a higher color exists.
The streamlined automaton defines, for every color c, a good-for-games co-Büchi automaton that recognizes the words whose natural colors w.r.t. the represented language are at least c. This provides a canonical representation for every $ω$-regular language, because good-for-games co-Büchi automata have a canonical minimal (and cheap to obtain) representation for every co-Büchi language.
△ Less
Submitted 22 July, 2022;
originally announced July 2022.
-
Recursive Reinforcement Learning
Authors:
Ernst Moritz Hahn,
Mateo Perez,
Sven Schewe,
Fabio Somenzi,
Ashutosh Trivedi,
Dominik Wojtczak
Abstract:
Recursion is the fundamental paradigm to finitely describe potentially infinite objects. As state-of-the-art reinforcement learning (RL) algorithms cannot directly reason about recursion, they must rely on the practitioner's ingenuity in designing a suitable "flat" representation of the environment. The resulting manual feature constructions and approximations are cumbersome and error-prone; their…
▽ More
Recursion is the fundamental paradigm to finitely describe potentially infinite objects. As state-of-the-art reinforcement learning (RL) algorithms cannot directly reason about recursion, they must rely on the practitioner's ingenuity in designing a suitable "flat" representation of the environment. The resulting manual feature constructions and approximations are cumbersome and error-prone; their lack of transparency hampers scalability. To overcome these challenges, we develop RL algorithms capable of computing optimal policies in environments described as a collection of Markov decision processes (MDPs) that can recursively invoke one another. Each constituent MDP is characterized by several entry and exit points that correspond to input and output values of these invocations. These recursive MDPs (or RMDPs) are expressively equivalent to probabilistic pushdown systems (with call-stack playing the role of the pushdown stack), and can model probabilistic programs with recursive procedural calls. We introduce Recursive Q-learning -- a model-free RL algorithm for RMDPs -- and prove that it converges for finite, single-exit and deterministic multi-exit RMDPs under mild assumptions.
△ Less
Submitted 22 June, 2022;
originally announced June 2022.
-
Alternating Good-for-MDP Automata
Authors:
Ernst Moritz Hahn,
Mateo Perez,
Sven Schewe,
Fabio Somenzi,
Ashutosh Trivedi,
Dominik Wojtczak
Abstract:
When omega-regular objectives were first proposed in model-free reinforcement learning (RL) for controlling MDPs, deterministic Rabin automata were used in an attempt to provide a direct translation from their transitions to scalar values. While these translations failed, it has turned out that it is possible to repair them by using good-for-MDPs (GFM) Büchi automata instead. These are nondetermin…
▽ More
When omega-regular objectives were first proposed in model-free reinforcement learning (RL) for controlling MDPs, deterministic Rabin automata were used in an attempt to provide a direct translation from their transitions to scalar values. While these translations failed, it has turned out that it is possible to repair them by using good-for-MDPs (GFM) Büchi automata instead. These are nondeterministic Büchi automata with a restricted type of nondeterminism, albeit not as restricted as in good-for-games automata. Indeed, deterministic Rabin automata have a pretty straightforward translation to such GFM automata, which is bi-linear in the number of states and pairs. Interestingly, the same cannot be said for deterministic Streett automata: a translation to nondeterministic Rabin or Büchi automata comes at an exponential cost, even without requiring the target automaton to be good-for-MDPs. Do we have to pay more than that to obtain a good-for-MDP automaton? The surprising answer is that we have to pay significantly less when we instead expand the good-for-MDP property to alternating automata: like the nondeterministic GFM automata obtained from deterministic Rabin automata, the alternating good-for-MDP automata we produce from deterministic Streett automata are bi-linear in the the size of the deterministic automaton and its index, and can therefore be exponentially more succinct than minimal nondeterministic Büchi automata.
△ Less
Submitted 6 May, 2022;
originally announced May 2022.
-
Smaller Progress Measures and Separating Automata for Parity Games
Authors:
Daniele Dell'Erba,
Sven Schewe
Abstract:
Calude et al. have recently shown that parity games can be solved in quasi-polynomial time, a landmark result that has led to a number of approaches with quasi-polynomial complexity. Jurdinski and Lasic have further improved the precise complexity of parity games, especially when the number of priorities is low (logarithmic in the number of positions). Both of these algorithms belong to a class of…
▽ More
Calude et al. have recently shown that parity games can be solved in quasi-polynomial time, a landmark result that has led to a number of approaches with quasi-polynomial complexity. Jurdinski and Lasic have further improved the precise complexity of parity games, especially when the number of priorities is low (logarithmic in the number of positions). Both of these algorithms belong to a class of game solving techniques now often called separating automata: deterministic automata that can be used as witness automata to decide the winner in parity games up to a given number of states and colours. We suggest a number of adjustments to the approach of Calude et al. that lead to smaller statespaces. These include and improve over those earlier introduced by Fearnley et al.
We identify two of them that, together, lead to a statespace of exactly the same size Jurdzinski and Lasic's concise progress measures, which currently hold the crown as smallest statespace. The remaining improvements, hence, lead to a further reduction in the size of the statespace, making our approach the most succinct progress measures available for parity games.
△ Less
Submitted 2 May, 2022;
originally announced May 2022.
-
Enhancing Adversarial Training with Second-Order Statistics of Weights
Authors:
Gaojie **,
** Yi,
Wei Huang,
Sven Schewe,
Xiaowei Huang
Abstract:
Adversarial training has been shown to be one of the most effective approaches to improve the robustness of deep neural networks. It is formalized as a min-max optimization over model weights and adversarial perturbations, where the weights can be optimized through gradient descent methods like SGD. In this paper, we show that treating model weights as random variables allows for enhancing adversa…
▽ More
Adversarial training has been shown to be one of the most effective approaches to improve the robustness of deep neural networks. It is formalized as a min-max optimization over model weights and adversarial perturbations, where the weights can be optimized through gradient descent methods like SGD. In this paper, we show that treating model weights as random variables allows for enhancing adversarial training through \textbf{S}econd-Order \textbf{S}tatistics \textbf{O}ptimization (S$^2$O) with respect to the weights. By relaxing a common (but unrealistic) assumption of previous PAC-Bayesian frameworks that all weights are statistically independent, we derive an improved PAC-Bayesian adversarial generalization bound, which suggests that optimizing second-order statistics of weights can effectively tighten the bound. In addition to this theoretical insight, we conduct an extensive set of experiments, which show that S$^2$O not only improves the robustness and generalization of the trained neural networks when used in isolation, but also integrates easily in state-of-the-art adversarial training techniques like TRADES, AWP, MART, and AVMixup, leading to a measurable improvement of these techniques. The code is available at \url{https://github.com/Alexkael/S2O}.
△ Less
Submitted 11 March, 2022;
originally announced March 2022.
-
Deciding What is Good-for-MDPs
Authors:
Sven Schewe,
Qiyi Tang,
Tansholpan Zhanabekova
Abstract:
Nondeterministic Good-for-MDP (GFM) automata are for MDP model checking and reinforcement learning what good-for-games automata are for reactive synthesis: a more compact alternative to deterministic automata that displays nondeterminism, but only so much that it can be resolved locally, such that a syntactic product can be analysed. GFM has recently been introduced as a property for reinforcement…
▽ More
Nondeterministic Good-for-MDP (GFM) automata are for MDP model checking and reinforcement learning what good-for-games automata are for reactive synthesis: a more compact alternative to deterministic automata that displays nondeterminism, but only so much that it can be resolved locally, such that a syntactic product can be analysed. GFM has recently been introduced as a property for reinforcement learning, where the simpler Büchi acceptance conditions it allows to use is key. However, while there are classic and novel techniques to obtain automata that are GFM, there has not been a decision procedure for checking whether or not an automaton is GFM. We show that GFM-ness is decidable and provide an EXPTIME decision procedure as well as a PSPACE-hardness proof.
△ Less
Submitted 3 July, 2023; v1 submitted 15 February, 2022;
originally announced February 2022.
-
Weight Expansion: A New Perspective on Dropout and Generalization
Authors:
Gaojie **,
** Yi,
Pengfei Yang,
Lijun Zhang,
Sven Schewe,
Xiaowei Huang
Abstract:
While dropout is known to be a successful regularization technique, insights into the mechanisms that lead to this success are still lacking. We introduce the concept of \emph{weight expansion}, an increase in the signed volume of a parallelotope spanned by the column or row vectors of the weight covariance matrix, and show that weight expansion is an effective means of increasing the generalizati…
▽ More
While dropout is known to be a successful regularization technique, insights into the mechanisms that lead to this success are still lacking. We introduce the concept of \emph{weight expansion}, an increase in the signed volume of a parallelotope spanned by the column or row vectors of the weight covariance matrix, and show that weight expansion is an effective means of increasing the generalization in a PAC-Bayesian setting. We provide a theoretical argument that dropout leads to weight expansion and extensive empirical support for the correlation between dropout and weight expansion. To support our hypothesis that weight expansion can be regarded as an \emph{indicator} of the enhanced generalization capability endowed by dropout, and not just as a mere by-product, we have studied other methods that achieve weight expansion (resp.\ contraction), and found that they generally lead to an increased (resp.\ decreased) generalization ability. This suggests that dropout is an attractive regularizer, because it is a computationally cheap method for obtaining weight expansion. This insight justifies the role of dropout as a regularizer, while paving the way for identifying regularizers that promise improved generalization through weight expansion.
△ Less
Submitted 12 September, 2022; v1 submitted 23 January, 2022;
originally announced January 2022.
-
Reliability Assessment and Safety Arguments for Machine Learning Components in System Assurance
Authors:
Yi Dong,
Wei Huang,
Vibhav Bharti,
Victoria Cox,
Alec Banks,
Sen Wang,
Xingyu Zhao,
Sven Schewe,
Xiaowei Huang
Abstract:
The increasing use of Machine Learning (ML) components embedded in autonomous systems -- so-called Learning-Enabled Systems (LESs) -- has resulted in the pressing need to assure their functional safety. As for traditional functional safety, the emerging consensus within both, industry and academia, is to use assurance cases for this purpose. Typically assurance cases support claims of reliability…
▽ More
The increasing use of Machine Learning (ML) components embedded in autonomous systems -- so-called Learning-Enabled Systems (LESs) -- has resulted in the pressing need to assure their functional safety. As for traditional functional safety, the emerging consensus within both, industry and academia, is to use assurance cases for this purpose. Typically assurance cases support claims of reliability in support of safety, and can be viewed as a structured way of organising arguments and evidence generated from safety analysis and reliability modelling activities. While such assurance activities are traditionally guided by consensus-based standards developed from vast engineering experience, LESs pose new challenges in safety-critical application due to the characteristics and design of ML models. In this article, we first present an overall assurance framework for LESs with an emphasis on quantitative aspects, e.g., breaking down system-level safety targets to component-level requirements and supporting claims stated in reliability metrics. We then introduce a novel model-agnostic Reliability Assessment Model (RAM) for ML classifiers that utilises the operational profile and robustness verification evidence. We discuss the model assumptions and the inherent challenges of assessing ML reliability uncovered by our RAM and propose solutions to practical use. Probabilistic safety argument templates at the lower ML component-level are also developed based on the RAM. Finally, to evaluate and demonstrate our methods, we not only conduct experiments on synthetic/benchmark datasets but also scope our methods with case studies on simulated Autonomous Underwater Vehicles and physical Unmanned Ground Vehicles.
△ Less
Submitted 14 October, 2022; v1 submitted 30 November, 2021;
originally announced December 2021.
-
Mungojerrie: Reinforcement Learning of Linear-Time Objectives
Authors:
Ernst Moritz Hahn,
Mateo Perez,
Sven Schewe,
Fabio Somenzi,
Ashutosh Trivedi,
Dominik Wojtczak
Abstract:
Reinforcement learning synthesizes controllers without prior knowledge of the system. At each timestep, a reward is given. The controllers optimize the discounted sum of these rewards. Applying this class of algorithms requires designing a reward scheme, which is typically done manually. The designer must ensure that their intent is accurately captured. This may not be trivial, and is prone to err…
▽ More
Reinforcement learning synthesizes controllers without prior knowledge of the system. At each timestep, a reward is given. The controllers optimize the discounted sum of these rewards. Applying this class of algorithms requires designing a reward scheme, which is typically done manually. The designer must ensure that their intent is accurately captured. This may not be trivial, and is prone to error. An alternative to this manual programming, akin to programming directly in assembly, is to specify the objective in a formal language and have it "compiled" to a reward scheme. Mungojerrie (https://plv.colorado.edu/mungojerrie/) is a tool for testing reward schemes for $ω$-regular objectives on finite models. The tool contains reinforcement learning algorithms and a probabilistic model checker. Mungojerrie supports models specified in PRISM and $ω$-automata specified in HOA.
△ Less
Submitted 17 June, 2021; v1 submitted 16 June, 2021;
originally announced June 2021.
-
Model-free Reinforcement Learning for Branching Markov Decision Processes
Authors:
Ernst Moritz Hahn,
Mateo Perez,
Sven Schewe,
Fabio Somenzi,
Ashutosh Trivedi,
Dominik Wojtczak
Abstract:
We study reinforcement learning for the optimal control of Branching Markov Decision Processes (BMDPs), a natural extension of (multitype) Branching Markov Chains (BMCs). The state of a (discrete-time) BMCs is a collection of entities of various types that, while spawning other entities, generate a payoff. In comparison with BMCs, where the evolution of a each entity of the same type follows the s…
▽ More
We study reinforcement learning for the optimal control of Branching Markov Decision Processes (BMDPs), a natural extension of (multitype) Branching Markov Chains (BMCs). The state of a (discrete-time) BMCs is a collection of entities of various types that, while spawning other entities, generate a payoff. In comparison with BMCs, where the evolution of a each entity of the same type follows the same probabilistic pattern, BMDPs allow an external controller to pick from a range of options. This permits us to study the best/worst behaviour of the system. We generalise model-free reinforcement learning techniques to compute an optimal control strategy of an unknown BMDP in the limit. We present results of an implementation that demonstrate the practicality of the approach.
△ Less
Submitted 12 June, 2021;
originally announced June 2021.
-
Assessing the Reliability of Deep Learning Classifiers Through Robustness Evaluation and Operational Profiles
Authors:
Xingyu Zhao,
Wei Huang,
Alec Banks,
Victoria Cox,
David Flynn,
Sven Schewe,
Xiaowei Huang
Abstract:
The utilisation of Deep Learning (DL) is advancing into increasingly more sophisticated applications. While it shows great potential to provide transformational capabilities, DL also raises new challenges regarding its reliability in critical functions. In this paper, we present a model-agnostic reliability assessment method for DL classifiers, based on evidence from robustness evaluation and the…
▽ More
The utilisation of Deep Learning (DL) is advancing into increasingly more sophisticated applications. While it shows great potential to provide transformational capabilities, DL also raises new challenges regarding its reliability in critical functions. In this paper, we present a model-agnostic reliability assessment method for DL classifiers, based on evidence from robustness evaluation and the operational profile (OP) of a given application. We partition the input space into small cells and then "assemble" their robustness (to the ground truth) according to the OP, where estimators on the cells' robustness and OPs are provided. Reliability estimates in terms of the probability of misclassification per input (pmi) can be derived together with confidence levels. A prototype tool is demonstrated with simplified case studies. Model assumptions and extension to real-world applications are also discussed. While our model easily uncovers the inherent difficulties of assessing the DL dependability (e.g. lack of data with ground truth and scalability issues), we provide preliminary/compromised solutions to advance in this research direction.
△ Less
Submitted 2 June, 2021;
originally announced June 2021.
-
Priority Promotion with Parysian Flair
Authors:
Massimo Benerecetti,
Daniele Dell'Erba,
Fabio Mogavero,
Sven Schewe,
Dominik Wojtczak
Abstract:
We develop an algorithm that combines the advantages of priority promotion - one of the leading approaches to solving large parity games in practice - with the quasi-polynomial time guarantees offered by Parys' algorithm. Hybridising these algorithms sounds both natural and difficult, as they both generalise the classic recursive algorithm in different ways that appear to be irreconcilable: while…
▽ More
We develop an algorithm that combines the advantages of priority promotion - one of the leading approaches to solving large parity games in practice - with the quasi-polynomial time guarantees offered by Parys' algorithm. Hybridising these algorithms sounds both natural and difficult, as they both generalise the classic recursive algorithm in different ways that appear to be irreconcilable: while the promotion transcends the call structure, the guarantees change on each level. We show that an interface that respects both is not only effective, but also efficient.
△ Less
Submitted 31 August, 2021; v1 submitted 16 April, 2021;
originally announced May 2021.
-
A Recursive Approach to Solving Parity Games in Quasipolynomial Time
Authors:
Karoliina Lehtinen,
Paweł Parys,
Sven Schewe,
Dominik Wojtczak
Abstract:
Zielonka's classic recursive algorithm for solving parity games is perhaps the simplest among the many existing parity game algorithms. However, its complexity is exponential, while currently the state-of-the-art algorithms have quasipolynomial complexity. Here, we present a modification of Zielonka's classic algorithm that brings its complexity down to…
▽ More
Zielonka's classic recursive algorithm for solving parity games is perhaps the simplest among the many existing parity game algorithms. However, its complexity is exponential, while currently the state-of-the-art algorithms have quasipolynomial complexity. Here, we present a modification of Zielonka's classic algorithm that brings its complexity down to $n^{O\left(\log\left(1+\frac{d}{\log n}\right)\right)}$, for parity games of size $n$ with $d$ priorities, in line with previous quasipolynomial-time solutions.
△ Less
Submitted 11 January, 2022; v1 submitted 19 April, 2021;
originally announced April 2021.
-
Detecting Operational Adversarial Examples for Reliable Deep Learning
Authors:
Xingyu Zhao,
Wei Huang,
Sven Schewe,
Yi Dong,
Xiaowei Huang
Abstract:
The utilisation of Deep Learning (DL) raises new challenges regarding its dependability in critical applications. Sound verification and validation methods are needed to assure the safe and reliable use of DL. However, state-of-the-art debug testing methods on DL that aim at detecting adversarial examples (AEs) ignore the operational profile, which statistically depicts the software's future opera…
▽ More
The utilisation of Deep Learning (DL) raises new challenges regarding its dependability in critical applications. Sound verification and validation methods are needed to assure the safe and reliable use of DL. However, state-of-the-art debug testing methods on DL that aim at detecting adversarial examples (AEs) ignore the operational profile, which statistically depicts the software's future operational use. This may lead to very modest effectiveness on improving the software's delivered reliability, as the testing budget is likely to be wasted on detecting AEs that are unrealistic or encountered very rarely in real-life operation. In this paper, we first present the novel notion of "operational AEs" which are AEs that have relatively high chance to be seen in future operation. Then an initial design of a new DL testing method to efficiently detect "operational AEs" is provided, as well as some insights on our prospective research plan.
△ Less
Submitted 16 April, 2021; v1 submitted 13 April, 2021;
originally announced April 2021.
-
Abstraction and Symbolic Execution of Deep Neural Networks with Bayesian Approximation of Hidden Features
Authors:
Nicolas Berthier,
Amany Alshareef,
James Sharp,
Sven Schewe,
Xiaowei Huang
Abstract:
Intensive research has been conducted on the verification and validation of deep neural networks (DNNs), aiming to understand if, and how, DNNs can be applied to safety critical applications. However, existing verification and validation techniques are limited by their scalability, over both the size of the DNN and the size of the dataset. In this paper, we propose a novel abstraction method which…
▽ More
Intensive research has been conducted on the verification and validation of deep neural networks (DNNs), aiming to understand if, and how, DNNs can be applied to safety critical applications. However, existing verification and validation techniques are limited by their scalability, over both the size of the DNN and the size of the dataset. In this paper, we propose a novel abstraction method which abstracts a DNN and a dataset into a Bayesian network (BN). We make use of dimensionality reduction techniques to identify hidden features that have been learned by hidden layers of the DNN, and associate each hidden feature with a node of the BN. On this BN, we can conduct probabilistic inference to understand the behaviours of the DNN processing data. More importantly, we can derive a runtime monitoring approach to detect in operational time rare inputs and covariate shift of the input data. We can also adapt existing structural coverage-guided testing techniques (i.e., based on low-level elements of the DNN such as neurons), in order to generate test cases that better exercise hidden features. We implement and evaluate the BN abstraction technique using our DeepConcolic tool available at https://github.com/TrustAI/DeepConcolic.
△ Less
Submitted 5 March, 2021;
originally announced March 2021.
-
Simple Stochastic Games with Almost-Sure Energy-Parity Objectives are in NP and coNP
Authors:
Richard Mayr,
Sven Schewe,
Patrick Totzke,
Dominik Wojtczak
Abstract:
We study stochastic games with energy-parity objectives, which combine quantitative rewards with a qualitative $ω$-regular condition: The maximizer aims to avoid running out of energy while simultaneously satisfying a parity condition. We show that the corresponding almost-sure problem, i.e., checking whether there exists a maximizer strategy that achieves the energy-parity objective with probabil…
▽ More
We study stochastic games with energy-parity objectives, which combine quantitative rewards with a qualitative $ω$-regular condition: The maximizer aims to avoid running out of energy while simultaneously satisfying a parity condition. We show that the corresponding almost-sure problem, i.e., checking whether there exists a maximizer strategy that achieves the energy-parity objective with probability $1$ when starting at a given energy level $k$, is decidable and in $NP \cap coNP$. The same holds for checking if such a $k$ exists and if a given $k$ is minimal.
△ Less
Submitted 18 January, 2021;
originally announced January 2021.
-
Satisfiability Modulo Theories and Chiral Heterotic String Vacua with Positive Cosmological Constant
Authors:
Alon E. Faraggi,
Benjamin Percival,
Sven Schewe,
Dominik Wojtczak
Abstract:
We apply Boolean Satisfiability (SAT) and Satisfiability Modulo Theories (SMT) solvers in the context of finding chiral heterotic string models with positive cosmological constant from $\mathbb{Z}_2\times \mathbb{Z}_2$ orbifolds. The power of using SAT/SMT solvers to sift large parameter spaces quickly to decide satisfiability, both to declare and prove unsatisfiability and to declare satisfiabili…
▽ More
We apply Boolean Satisfiability (SAT) and Satisfiability Modulo Theories (SMT) solvers in the context of finding chiral heterotic string models with positive cosmological constant from $\mathbb{Z}_2\times \mathbb{Z}_2$ orbifolds. The power of using SAT/SMT solvers to sift large parameter spaces quickly to decide satisfiability, both to declare and prove unsatisfiability and to declare satisfiability, are demonstrated in this setting. These models are partly chosen to be small enough to plot the performance against exhaustive search, which takes around 2 hours 20 minutes to comb through the parameter space. We show that making use of SMT based techniques with integer encoding is rather simple and effective, while a more careful Boolean SAT encoding provides a significant speed-up -- determining satisfiability or unsatisfiability has, in our experiments varied between 0.03 and 0.06 seconds, while determining all models (where models exist) took 19 seconds for a constraint system that allows for 2048 models and 8.4 seconds for a constraint system that admits 640 models. We thus gain several orders of magnitude in speed, and this advantage is set to grow with a growing parameter space. This holds the promise that the method scales well beyond the initial problem we have used it for in this paper.
△ Less
Submitted 8 January, 2021;
originally announced January 2021.
-
How does Weight Correlation Affect the Generalisation Ability of Deep Neural Networks
Authors:
Gaojie **,
** Yi,
Liang Zhang,
Lijun Zhang,
Sven Schewe,
Xiaowei Huang
Abstract:
This paper studies the novel concept of weight correlation in deep neural networks and discusses its impact on the networks' generalisation ability. For fully-connected layers, the weight correlation is defined as the average cosine similarity between weight vectors of neurons, and for convolutional layers, the weight correlation is defined as the cosine similarity between filter matrices. Theoret…
▽ More
This paper studies the novel concept of weight correlation in deep neural networks and discusses its impact on the networks' generalisation ability. For fully-connected layers, the weight correlation is defined as the average cosine similarity between weight vectors of neurons, and for convolutional layers, the weight correlation is defined as the cosine similarity between filter matrices. Theoretically, we show that, weight correlation can, and should, be incorporated into the PAC Bayesian framework for the generalisation of neural networks, and the resulting generalisation bound is monotonic with respect to the weight correlation. We formulate a new complexity measure, which lifts the PAC Bayes measure with weight correlation, and experimentally confirm that it is able to rank the generalisation errors of a set of networks more precisely than existing measures. More importantly, we develop a new regulariser for training, and provide extensive experiments that show that the generalisation error can be greatly reduced with our novel approach.
△ Less
Submitted 17 October, 2020; v1 submitted 12 October, 2020;
originally announced October 2020.
-
Minimising Good-for-Games automata is NP complete
Authors:
Sven Schewe
Abstract:
This paper discusses the hardness of finding minimal good-for-games (GFG) Buchi, Co-Buchi, and parity automata with state based acceptance. The problem appears to sit between finding small deterministic and finding small nondeterministic automata, where minimality is NP-complete and PSPACE-complete, respectively. However, recent work of Radi and Kupferman has shown that minimising Co-Buchi automat…
▽ More
This paper discusses the hardness of finding minimal good-for-games (GFG) Buchi, Co-Buchi, and parity automata with state based acceptance. The problem appears to sit between finding small deterministic and finding small nondeterministic automata, where minimality is NP-complete and PSPACE-complete, respectively. However, recent work of Radi and Kupferman has shown that minimising Co-Buchi automata with transition based acceptance is tractable, which suggests that the complexity of minimising GFG automata might be cheaper than minimising deterministic automata.
We show for the standard state based acceptance that the minimality of a GFG automaton is NP-complete for Buchi, Co-Buchi, and parity GFG automata. The proofs are a surprisingly straight forward generalisation of the proofs from deterministic Buchi automata: they use a similar reductions, and the same hard class of languages.
△ Less
Submitted 26 March, 2020;
originally announced March 2020.
-
Reward Sha** for Reinforcement Learning with Omega-Regular Objectives
Authors:
E. M. Hahn,
M. Perez,
S. Schewe,
F. Somenzi,
A. Trivedi,
D. Wojtczak
Abstract:
Recently, successful approaches have been made to exploit good-for-MDPs automata (Büchi automata with a restricted form of nondeterminism) for model free reinforcement learning, a class of automata that subsumes good for games automata and the most widespread class of limit deterministic automata. The foundation of using these Büchi automata is that the Büchi condition can, for good-for-MDP automa…
▽ More
Recently, successful approaches have been made to exploit good-for-MDPs automata (Büchi automata with a restricted form of nondeterminism) for model free reinforcement learning, a class of automata that subsumes good for games automata and the most widespread class of limit deterministic automata. The foundation of using these Büchi automata is that the Büchi condition can, for good-for-MDP automata, be translated to reachability.
The drawback of this translation is that the rewards are, on average, reaped very late, which requires long episodes during the learning process. We devise a new reward sha** approach that overcomes this issue. We show that the resulting model is equivalent to a discounted payoff objective with a biased discount that simplifies and improves on prior work in this direction.
△ Less
Submitted 16 January, 2020;
originally announced January 2020.
-
Good-for-MDPs Automata for Probabilistic Analysis and Reinforcement Learning
Authors:
Ernst Moritz Hahn,
Mateo Perez,
Fabio Somenzi,
Ashutosh Trivedi,
Sven Schewe,
Dominik Wojtczak
Abstract:
We characterize the class of nondeterministic $ω$-automata that can be used for the analysis of finite Markov decision processes (MDPs). We call these automata `good-for-MDPs' (GFM). We show that GFM automata are closed under classic simulation as well as under more powerful simulation relations that leverage properties of optimal control strategies for MDPs. This closure enables us to exploit sta…
▽ More
We characterize the class of nondeterministic $ω$-automata that can be used for the analysis of finite Markov decision processes (MDPs). We call these automata `good-for-MDPs' (GFM). We show that GFM automata are closed under classic simulation as well as under more powerful simulation relations that leverage properties of optimal control strategies for MDPs. This closure enables us to exploit state-space reduction techniques, such as those based on direct and delayed simulation, that guarantee simulation equivalence. We demonstrate the promise of GFM automata by defining a new class of automata with favorable properties - they are Büchi automata with low branching degree obtained through a simple construction - and show that going beyond limit-deterministic automata may significantly benefit reinforcement learning.
△ Less
Submitted 30 October, 2019; v1 submitted 11 September, 2019;
originally announced September 2019.
-
Improving the complexity of Parys' recursive algorithm
Authors:
Karoliina Lehtinen,
Sven Schewe,
Dominik Wojtczak
Abstract:
Parys has recently proposed a quasi-polynomial version of Zielonka's recursive algorithm for solving parity games. In this brief note we suggest a variation of his algorithm that improves the complexity to meet the state-of-the-art complexity of broadly $2^{O((\log n)(\log c))}$, while providing polynomial bounds when the number of colours is logarithmic.
Parys has recently proposed a quasi-polynomial version of Zielonka's recursive algorithm for solving parity games. In this brief note we suggest a variation of his algorithm that improves the complexity to meet the state-of-the-art complexity of broadly $2^{O((\log n)(\log c))}$, while providing polynomial bounds when the number of colours is logarithmic.
△ Less
Submitted 5 June, 2019; v1 submitted 26 April, 2019;
originally announced April 2019.
-
Omega-Regular Objectives in Model-Free Reinforcement Learning
Authors:
Ernst Moritz Hahn,
Mateo Perez,
Sven Schewe,
Fabio Somenzi,
Ashutosh Trivedi,
Dominik Wojtczak
Abstract:
We provide the first solution for model-free reinforcement learning of ω-regular objectives for Markov decision processes (MDPs). We present a constructive reduction from the almost-sure satisfaction of ω-regular objectives to an almost- sure reachability problem and extend this technique to learning how to control an unknown model so that the chance of satisfying the objective is maximized. A key…
▽ More
We provide the first solution for model-free reinforcement learning of ω-regular objectives for Markov decision processes (MDPs). We present a constructive reduction from the almost-sure satisfaction of ω-regular objectives to an almost- sure reachability problem and extend this technique to learning how to control an unknown model so that the chance of satisfying the objective is maximized. A key feature of our technique is the compilation of ω-regular properties into limit- deterministic Buechi automata instead of the traditional Rabin automata; this choice sidesteps difficulties that have marred previous proposals. Our approach allows us to apply model-free, off-the-shelf reinforcement learning algorithms to compute optimal strategies from the observations of the MDP. We present an experimental evaluation of our technique on benchmark learning problems.
△ Less
Submitted 26 September, 2018;
originally announced October 2018.
-
Maximum Rooted Connected Expansion
Authors:
Ioannis Lamprou,
Russell Martin,
Sven Schewe,
Ioannis Sigalas,
Vassilis Zissimopoulos
Abstract:
Prefetching constitutes a valuable tool toward efficient Web surfing. As a result, estimating the amount of resources that need to be preloaded during a surfer's browsing becomes an important task. In this regard, prefetching can be modeled as a two-player combinatorial game [Fomin et al., Theoretical Computer Science 2014], where a surfer and a marker alternately play on a given graph (representi…
▽ More
Prefetching constitutes a valuable tool toward efficient Web surfing. As a result, estimating the amount of resources that need to be preloaded during a surfer's browsing becomes an important task. In this regard, prefetching can be modeled as a two-player combinatorial game [Fomin et al., Theoretical Computer Science 2014], where a surfer and a marker alternately play on a given graph (representing the Web graph). During its turn, the marker chooses a set of $k$ nodes to mark (prefetch), whereas the surfer, represented as a token resting on graph nodes, moves to a neighboring node (Web resource). The surfer's objective is to reach an unmarked node before all nodes become marked and the marker wins. Intuitively, since the surfer is step-by-step traversing a subset of nodes in the Web graph, a satisfactory prefetching procedure would load in cache all resources lying in the neighborhood of this growing subset.
Motivated by the above, we consider the following problem to which we refer to as the Maximum Rooted Connected Expansion (MRCE) problem. Given a graph $G$ and a root node $v_0$, we wish to find a subset of vertices $S$ such that $S$ is connected, $S$ contains $v_0$ and the ratio $|N[S]|/|S|$ is maximized, where $N[S]$ denotes the closed neighborhood of $S$, that is, $N[S]$ contains all nodes in $S$ and all nodes with at least one neighbor in $S$.
We prove that the problem is NP-hard even when the input graph $G$ is restricted to be a split graph. On the positive side, we demonstrate a polynomial time approximation scheme for split graphs. Furthermore, we present a $\frac{1}{6}(1-\frac{1}{e})$-approximation algorithm for general graphs based on techniques for the Budgeted Connected Domination problem [Khuller et al., SODA 2014]. Finally, we provide a polynomial-time algorithm for the special case of interval graphs.
△ Less
Submitted 25 June, 2018;
originally announced June 2018.
-
Accelerated Model Checking of Parametric Markov Chains
Authors:
Paul Gainer,
Ernst Moritz Hahn,
Sven Schewe
Abstract:
Parametric Markov chains occur quite naturally in various applications: they can be used for a conservative analysis of probabilistic systems (no matter how the parameter is chosen, the system works to specification); they can be used to find optimal settings for a parameter; they can be used to visualise the influence of system parameters; and they can be used to make it easy to adjust the analys…
▽ More
Parametric Markov chains occur quite naturally in various applications: they can be used for a conservative analysis of probabilistic systems (no matter how the parameter is chosen, the system works to specification); they can be used to find optimal settings for a parameter; they can be used to visualise the influence of system parameters; and they can be used to make it easy to adjust the analysis for the case that parameters change. Unfortunately, these advancements come at a cost: parametric model checking is---or rather was---often slow. To make the analysis of parametric Markov models scale, we need three ingredients: clever algorithms, the right data structure, and good engineering. Clever algorithms are often the main (or sole) selling point; and we face the trouble that this paper focuses on -- the latter ingredients to efficient model checking. Consequently, our easiest claim to fame is in the speed-up we have often realised when comparing to the state of the art.
△ Less
Submitted 2 November, 2018; v1 submitted 15 May, 2018;
originally announced May 2018.
-
Parity Games with Weights
Authors:
Sven Schewe,
Alexander Weinert,
Martin Zimmermann
Abstract:
Quantitative extensions of parity games have recently attracted significant interest. These extensions include parity games with energy and payoff conditions as well as finitary parity games and their generalization to parity games with costs. Finitary parity games enjoy a special status among these extensions, as they offer a native combination of the qualitative and quantitative aspects in infin…
▽ More
Quantitative extensions of parity games have recently attracted significant interest. These extensions include parity games with energy and payoff conditions as well as finitary parity games and their generalization to parity games with costs. Finitary parity games enjoy a special status among these extensions, as they offer a native combination of the qualitative and quantitative aspects in infinite games: The quantitative aspect of finitary parity games is a quality measure for the qualitative aspect, as it measures the limit superior of the time it takes to answer an odd color by a larger even one. Finitary parity games have been extended to parity games with costs, where each transition is labeled with a nonnegative weight that reflects the costs incurred by taking it. We lift this restriction and consider parity games with costs with arbitrary integer weights.
We show that solving such games is in NP $\cap$ coNP, the signature complexity for games of this type. We also show that the protagonist has finite-state winning strategies, and provide tight pseudo-polynomial bounds for the memory he needs to win the game. Naturally, the antagonist may need infinite memory to win. Moreover, we present tight bounds on the quality of winning strategies for the protagonist.
Furthermore, we investigate the problem of determining, for a given threshold $b$, whether the protagonist has a strategy of quality at most $b$ and show this problem to be EXPTIME-complete. The protagonist inherits the necessity of exponential memory for implementing such strategies from the special case of finitary parity games.
△ Less
Submitted 22 August, 2019; v1 submitted 17 April, 2018;
originally announced April 2018.
-
Incremental Verification of Parametric and Reconfigurable Markov Chains
Authors:
Paul Gainer,
Ernst Moritz Hahn,
Sven Schewe
Abstract:
The analysis of parametrised systems is a growing field in verification, but the analysis of parametrised probabilistic systems is still in its infancy. This is partly because it is much harder: while there are beautiful cut-off results for non-stochastic systems that allow to focus only on small instances, there is little hope that such approaches extend to the quantitative analysis of probabilis…
▽ More
The analysis of parametrised systems is a growing field in verification, but the analysis of parametrised probabilistic systems is still in its infancy. This is partly because it is much harder: while there are beautiful cut-off results for non-stochastic systems that allow to focus only on small instances, there is little hope that such approaches extend to the quantitative analysis of probabilistic systems, as the probabilities depend on the size of a system. The unicorn would be an automatic transformation of a parametrised system into a formula, which allows to plot, say, the likelihood to reach a goal or the expected costs to do so, against the parameters of a system. While such analysis exists for narrow classes of systems, such as waiting queues, we aim both lower---stepwise exploring the parameter space---and higher---considering general systems.
The novelty is to heavily exploit the similarity between instances of parametrised systems. When the parameter grows, the system for the smaller parameter is, broadly speaking, present in the larger system. We use this observation to guide the elegant state-elimination method for parametric Markov chains in such a way, that the model transformations will start with those parts of the system that are stable under increasing the parameter. We argue that this can lead to a very cheap iterative way to analyse parametric systems, show how this approach extends to reconfigurable systems, and demonstrate on two benchmarks that this approach scales.
△ Less
Submitted 5 April, 2018;
originally announced April 2018.
-
CTL* synthesis via LTL synthesis
Authors:
Roderick Bloem,
Sven Schewe,
Ayrat Khalimov
Abstract:
We reduce synthesis for CTL* properties to synthesis for LTL. In the context of model checking this is impossible - CTL* is more expressive than LTL. Yet, in synthesis we have knowledge of the system structure and we can add new outputs. These outputs can be used to encode witnesses of the satisfaction of CTL* subformulas directly into the system. This way, we construct an LTL formula, over old an…
▽ More
We reduce synthesis for CTL* properties to synthesis for LTL. In the context of model checking this is impossible - CTL* is more expressive than LTL. Yet, in synthesis we have knowledge of the system structure and we can add new outputs. These outputs can be used to encode witnesses of the satisfaction of CTL* subformulas directly into the system. This way, we construct an LTL formula, over old and new outputs and original inputs, which is realisable if, and only if, the original CTL* formula is realisable. The CTL*-via-LTL synthesis approach preserves the problem complexity, although it might increase the minimal system size. We implemented the reduction, and evaluated the CTL*-via-LTL synthesiser on several examples.
△ Less
Submitted 28 November, 2017;
originally announced November 2017.
-
Optimal Control for Multi-Mode Systems with Discrete Costs
Authors:
Mahmoud A. A. Mousa,
Sven Schewe,
Dominik Wojtczak
Abstract:
This paper studies optimal time-bounded control in multi-mode systems with discrete costs. Multi-mode systems are an important subclass of linear hybrid systems, in which there are no guards on transitions and all invariants are global. Each state has a continuous cost attached to it, which is linear in the sojourn time, while a discrete cost is attached to each transition taken. We show that an o…
▽ More
This paper studies optimal time-bounded control in multi-mode systems with discrete costs. Multi-mode systems are an important subclass of linear hybrid systems, in which there are no guards on transitions and all invariants are global. Each state has a continuous cost attached to it, which is linear in the sojourn time, while a discrete cost is attached to each transition taken. We show that an optimal control for this model can be computed in NEXPTIME and approximated in PSPACE. We also show that the one-dimensional case is simpler: although the problem is NP-complete (and in LOGSPACE for an infinite time horizon), we develop an FPTAS for finding an approximate solution.
△ Less
Submitted 29 June, 2017;
originally announced June 2017.
-
An Ordered Approach to Solving Parity Games in Quasi Polynomial Time and Quasi Linear Space
Authors:
John Fearnley,
Sanjay Jain,
Sven Schewe,
Frank Stephan,
Dominik Wojtczak
Abstract:
Parity games play an important role in model checking and synthesis. In their paper, Calude et al. have shown that these games can be solved in quasi-polynomial time. We show that their algorithm can be implemented efficiently: we use their data structure as a progress measure, allowing for a backward implementation instead of a complete unravelling of the game. To achieve this, a number of change…
▽ More
Parity games play an important role in model checking and synthesis. In their paper, Calude et al. have shown that these games can be solved in quasi-polynomial time. We show that their algorithm can be implemented efficiently: we use their data structure as a progress measure, allowing for a backward implementation instead of a complete unravelling of the game. To achieve this, a number of changes have to be made to their techniques, where the main one is to add power to the antagonistic player that allows for determining her rational move without changing the outcome of the game. We provide a first implementation for a quasi-polynomial algorithm, test it on small examples, and provide a number of side results, including minor algorithmic improvements, a quasi bi-linear complexity in the number of states and edges for a fixed number of colours, and matching lower bounds for the algorithm of Calude et al.
△ Less
Submitted 29 January, 2018; v1 submitted 3 March, 2017;
originally announced March 2017.
-
MDPs with Energy-Parity Objectives
Authors:
Richard Mayr,
Sven Schewe,
Patrick Totzke,
Dominik Wojtczak
Abstract:
Energy-parity objectives combine $ω$-regular with quantitative objectives of reward MDPs. The controller needs to avoid to run out of energy while satisfying a parity objective.
We refute the common belief that, if an energy-parity objective holds almost-surely, then this can be realised by some finite memory strategy. We provide a surprisingly simple counterexample that only uses coBüchi condit…
▽ More
Energy-parity objectives combine $ω$-regular with quantitative objectives of reward MDPs. The controller needs to avoid to run out of energy while satisfying a parity objective.
We refute the common belief that, if an energy-parity objective holds almost-surely, then this can be realised by some finite memory strategy. We provide a surprisingly simple counterexample that only uses coBüchi conditions.
We introduce the new class of bounded (energy) storage objectives that, when combined with parity objectives, preserve the finite memory property. Based on these, we show that almost-sure and limit-sure energy-parity objectives, as well as almost-sure and limit-sure storage parity objectives, are in $\mathit{NP}\cap \mathit{coNP}$ and can be solved in pseudo-polynomial time for energy-parity MDPs.
△ Less
Submitted 18 April, 2017; v1 submitted 10 January, 2017;
originally announced January 2017.
-
Perpetually Dominating Large Grids
Authors:
Ioannis Lamprou,
Russell Martin,
Sven Schewe
Abstract:
In the m-\emph{Eternal Domination} game, a team of guard tokens initially occupies a dominating set on a graph $G$. An attacker then picks a vertex without a guard on it and attacks it. The guards defend against the attack: one of them has to move to the attacked vertex, while each remaining one can choose to move to one of his neighboring vertices. The new guards' placement must again be dominati…
▽ More
In the m-\emph{Eternal Domination} game, a team of guard tokens initially occupies a dominating set on a graph $G$. An attacker then picks a vertex without a guard on it and attacks it. The guards defend against the attack: one of them has to move to the attacked vertex, while each remaining one can choose to move to one of his neighboring vertices. The new guards' placement must again be dominating. This attack-defend procedure continues eternally. The guards win if they can eternally maintain a dominating set against any sequence of attacks, otherwise, the attacker wins.
The m-\emph{eternal domination number} for a graph $G$ is the minimum amount of guards such that they win against any attacker strategy in $G$ (all guards move model). We study rectangular grids and provide the first known general upper bound on the m-eternal domination number for these graphs. Our novel strategy implements a square rotation principle and eternally dominates $m \times n$ grids by using approximately $\frac{mn}{5}$ guards, which is asymptotically optimal even for ordinary domination.
△ Less
Submitted 14 October, 2018; v1 submitted 24 November, 2016;
originally announced November 2016.
-
Synthesising Strategy Improvement and Recursive Algorithms for Solving 2.5 Player Parity Games
Authors:
Ernst Moritz Hahn,
Sven Schewe,
Andrea Turrini,
Lijun Zhang
Abstract:
2.5 player parity games combine the challenges posed by 2.5 player reachability games and the qualitative analysis of parity games. These two types of problems are best approached with different types of algorithms: strategy improvement algorithms for 2.5 player reachability games and recursive algorithms for the qualitative analysis of parity games. We present a method that - in contrast to exist…
▽ More
2.5 player parity games combine the challenges posed by 2.5 player reachability games and the qualitative analysis of parity games. These two types of problems are best approached with different types of algorithms: strategy improvement algorithms for 2.5 player reachability games and recursive algorithms for the qualitative analysis of parity games. We present a method that - in contrast to existing techniques - tackles both aspects with the best suited approach and works exclusively on the 2.5 player game itself. The resulting technique is powerful enough to handle games with several million states.
△ Less
Submitted 5 July, 2016;
originally announced July 2016.
-
Fast Two-Robot Disk Evacuation with Wireless Communication
Authors:
Ioannis Lamprou,
Russell Martin,
Sven Schewe
Abstract:
In the fast evacuation problem, we study the path planning problem for two robots who want to minimize the worst-case evacuation time on the unit disk. The robots are initially placed at the center of the disk. In order to evacuate, they need to reach an unknown point, the exit, on the boundary of the disk. Once one of the robots finds the exit, it will instantaneously notify the other agent, who…
▽ More
In the fast evacuation problem, we study the path planning problem for two robots who want to minimize the worst-case evacuation time on the unit disk. The robots are initially placed at the center of the disk. In order to evacuate, they need to reach an unknown point, the exit, on the boundary of the disk. Once one of the robots finds the exit, it will instantaneously notify the other agent, who will make a beeline to it.
The problem has been studied for robots with the same speed~\cite{s1}. We study a more general case where one robot has speed $1$ and the other has speed $s \geq 1$. We provide optimal evacuation strategies in the case that $s \geq c_{2.75} \approx 2.75$ by showing matching upper and lower bounds on the worst-case evacuation time. For $1\leq s < c_{2.75}$, we show (non-matching) upper and lower bounds on the evacuation time with a ratio less than $1.22$. Moreover, we demonstrate that a generalization of the two-robot search strategy from~\cite{s1} is outperformed by our proposed strategies for any $s \geq c_{1.71} \approx 1.71$.
△ Less
Submitted 14 April, 2016;
originally announced April 2016.
-
Incentive Stackelberg Mean-payoff Games
Authors:
Anshul Gupta,
M. S. Krishna Deepak,
Bharath Kumar Padarthi,
Sven Schewe,
Ashutosh Trivedi
Abstract:
We introduce and study incentive equilibria for multi-player meanpayoff games. Incentive equilibria generalise well-studied solution concepts such as Nash equilibria and leader equilibria (also known as Stackelberg equilibria). Recall that a strategy profile is a Nash equilibrium if no player can improve his payoff by changing his strategy unilaterally. In the setting of incentive and leader equil…
▽ More
We introduce and study incentive equilibria for multi-player meanpayoff games. Incentive equilibria generalise well-studied solution concepts such as Nash equilibria and leader equilibria (also known as Stackelberg equilibria). Recall that a strategy profile is a Nash equilibrium if no player can improve his payoff by changing his strategy unilaterally. In the setting of incentive and leader equilibria, there is a distinguished player called the leader who can assign strategies to all other players, referred to as her followers. A strategy profile is a leader strategy profile if no player, except for the leader, can improve his payoff by changing his strategy unilaterally, and a leader equilibrium is a leader strategy profile with a maximal return for the leader. In the proposed case of incentive equilibria, the leader can additionally influence the behaviour of her followers by transferring parts of her payoff to her followers. The ability to incentivise her followers provides the leader with more freedom in selecting strategy profiles, and we show that this can indeed improve the payoff for the leader in such games. The key fundamental result of the paper is the existence of incentive equilibria in mean-payoff games. We further show that the decision problem related to constructing incentive equilibria is NP-complete. On a positive note, we show that, when the number of players is fixed, the complexity of the problem falls in the same class as two-player mean-payoff games. We also present an implementation of the proposed algorithms, and discuss experimental results that demonstrate the feasibility of the analysis of medium sized games.
△ Less
Submitted 31 October, 2015;
originally announced November 2015.
-
Symmetric Strategy Improvement
Authors:
Sven Schewe,
Ashutosh Trivedi,
Thomas Varghese
Abstract:
Symmetry is inherent in the definition of most of the two-player zero-sum games, including parity, mean-payoff, and discounted-payoff games. It is therefore quite surprising that no symmetric analysis techniques for these games exist. We develop a novel symmetric strategy improvement algorithm where, in each iteration, the strategies of both players are improved simultaneously. We show that symmet…
▽ More
Symmetry is inherent in the definition of most of the two-player zero-sum games, including parity, mean-payoff, and discounted-payoff games. It is therefore quite surprising that no symmetric analysis techniques for these games exist. We develop a novel symmetric strategy improvement algorithm where, in each iteration, the strategies of both players are improved simultaneously. We show that symmetric strategy improvement defies Friedmann's traps, which shook the belief in the potential of classic strategy improvement to be polynomial.
△ Less
Submitted 26 January, 2015;
originally announced January 2015.
-
Bounded-Rate Multi-Mode Systems Based Motion Planning
Authors:
Devendra Bhave,
Sagar Jha,
Shankara Narayanan Krishna,
Sven Schewe,
Ashutosh Trivedi
Abstract:
Bounded-rate multi-mode systems are hybrid systems that can switch among a finite set of modes. Its dynamics is specified by a finite number of real-valued variables with mode-dependent rates that can vary within given bounded sets. Given an arbitrary piecewise linear trajectory, we study the problem of following the trajectory with arbitrary precision, using motion primitives given as bounded-rat…
▽ More
Bounded-rate multi-mode systems are hybrid systems that can switch among a finite set of modes. Its dynamics is specified by a finite number of real-valued variables with mode-dependent rates that can vary within given bounded sets. Given an arbitrary piecewise linear trajectory, we study the problem of following the trajectory with arbitrary precision, using motion primitives given as bounded-rate multi-mode systems. We give an algorithm to solve the problem and show that the problem is co-NP complete. We further prove that the problem can be solved in polynomial time for multi-mode systems with fixed dimension. We study the problem with dwell-time requirement and show the decidability of the problem under certain positivity restriction on the rate vectors. Finally, we show that introducing structure to the multi-mode systems leads to undecidability, even when using only a single clock variable.
△ Less
Submitted 9 December, 2014;
originally announced December 2014.
-
Making the Best of Limited Memory in Multi-Player Discounted Sum Games
Authors:
Anshul Gupta,
Sven Schewe,
Dominik Wojtczak
Abstract:
In this paper, we establish the existence of optimal bounded memory strategy profiles in multi-player discounted sum games. We introduce a non-deterministic approach to compute optimal strategy profiles with bounded memory. Our approach can be used to obtain optimal rewards in a setting where a powerful player selects the strategies of all players for Nash and leader equilibria, where in leader e…
▽ More
In this paper, we establish the existence of optimal bounded memory strategy profiles in multi-player discounted sum games. We introduce a non-deterministic approach to compute optimal strategy profiles with bounded memory. Our approach can be used to obtain optimal rewards in a setting where a powerful player selects the strategies of all players for Nash and leader equilibria, where in leader equilibria the Nash condition is waived for the strategy of this powerful player. The resulting strategy profiles are optimal for this player among all strategy profiles that respect the given memory bound, and the related decision problem is NP-complete. We also provide simple examples, which show that having more memory will improve the optimal strategy profile, and that sufficient memory to obtain optimal strategy profiles cannot be inferred from the structure of the game.
△ Less
Submitted 23 September, 2015; v1 submitted 15 October, 2014;
originally announced October 2014.
-
Tight Bounds for Complementing Parity Automata
Authors:
Sven Schewe,
Thomas Varghese
Abstract:
We follow a connection between tight determinisation and complementation and establish a complementation procedure from parity automata to nondeterministic Büchi automata and prove it to be tight up to an $O(n)$ factor, where $n$ is the size of the nondeterministic parity automaton. This factor does not depend on the number of priorities.
We follow a connection between tight determinisation and complementation and establish a complementation procedure from parity automata to nondeterministic Büchi automata and prove it to be tight up to an $O(n)$ factor, where $n$ is the size of the nondeterministic parity automaton. This factor does not depend on the number of priorities.
△ Less
Submitted 4 June, 2014;
originally announced June 2014.
-
Determinising Parity Automata
Authors:
Sven Schewe,
Thomas Varghese
Abstract:
Parity word automata and their determinisation play an important role in automata and game theory. We discuss a determinisation procedure for nondeterministic parity automata through deterministic Rabin to deterministic parity automata. We prove that the intermediate determinisation to Rabin automata is optimal. We show that the resulting determinisation to parity automata is optimal up to a small…
▽ More
Parity word automata and their determinisation play an important role in automata and game theory. We discuss a determinisation procedure for nondeterministic parity automata through deterministic Rabin to deterministic parity automata. We prove that the intermediate determinisation to Rabin automata is optimal. We show that the resulting determinisation to parity automata is optimal up to a small constant. Moreover, the lower bound refers to the more liberal Streett acceptance. We thus show that determinisation to Streett would not lead to better bounds than determinisation to parity. As a side-result, this optimality extends to the determinisation of Büchi automata.
△ Less
Submitted 21 January, 2014;
originally announced January 2014.
-
Coverage Games for Testing Nondeterministic Systems
Authors:
Farn Wang,
Jung-Hsuan Wu,
Sven Schewe,
Chung-Hao Huang
Abstract:
Modern software systems may exhibit a nondeterministic behavior due to many unpredictable factors. In this work, we propose the node coverage game, a two player turn-based game played on a finite game graph, as a formalization of the problem to test such systems. Each node in the graph represents a {\em functional equivalence class} of the software under test (SUT). One player, the tester, wants t…
▽ More
Modern software systems may exhibit a nondeterministic behavior due to many unpredictable factors. In this work, we propose the node coverage game, a two player turn-based game played on a finite game graph, as a formalization of the problem to test such systems. Each node in the graph represents a {\em functional equivalence class} of the software under test (SUT). One player, the tester, wants to maximize the node coverage, measured by the number of nodes visited when exploring the game graphs, while his opponent, the SUT, wants to minimize it. An optimal test would maximize the cover, and it is an interesting problem to find the maximal number of nodes that the tester can guarantee to visit, irrespective of the responses of the SUT. We show that the decision problem of whether the guarantee is less than a given number is NP-complete. Then we present techniques for testing nondeterministic SUTs with existing test suites for deterministic models. Finally, we report our implementation and experiments.
△ Less
Submitted 20 December, 2013;
originally announced December 2013.