Search | arXiv e-print repository

doi 10.1007/978-3-031-46002-9_3

Shielded Reinforcement Learning for Hybrid Systems

Authors: Asger Horn Brorholt, Peter Gjøl Jensen, Kim Guldstrand Larsen, Florian Lorber, Christian Schilling

Abstract: Safe and optimal controller synthesis for switched-controlled hybrid systems, which combine differential equations and discrete changes of the system's state, is known to be intricately hard. Reinforcement learning has been leveraged to construct near-optimal controllers, but their behavior is not guaranteed to be safe, even when it is encouraged by reward engineering. One way of imposing safety on a learned controller is to use a shield, which is correct by design. However, obtaining a shield for non-linear and hybrid environments is itself intractable. In this paper, we propose the construction of a shield using the so-called barbaric method, where an approximate finite representation of an underlying partition-based two-player safety game is extracted via systematically picked samples of the true transition function. While hard safety guarantees are out of reach, we experimentally demonstrate strong statistical safety guarantees with a prototype implementation and UPPAAL STRATEGO. Furthermore, we study the impact of the synthesized shield when applied as either a pre-shield (applied before learning a controller) or a post-shield (only applied after learning a controller). We experimentally demonstrate superiority of the pre-shielding approach. We apply our technique on a range of case studies, including two industrial examples, and further study post-optimization of the post-shielding approach.

Submitted 28 August, 2023; originally announced August 2023.

Journal ref: AISoLA 2023
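
The sampling-based shield construction summarized in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the 1-D dynamics `step`, the bounds, the cell width, the two actions, and the number of samples per cell are all hypothetical choices made only for illustration. The sketch partitions the state space into cells, picks evenly spaced sample points in each cell, records which cells the samples reach under each action, and then removes cells from the safe set until a fixed point is reached.

```python
from itertools import product

# Hypothetical toy system (an assumption, not taken from the paper).
LOW, HIGH, CELL = 0.0, 4.0, 1.0   # safe region [LOW, HIGH), cell width
ACTIONS = (-1, +1)
OUT = None                        # marker for "left the safe region"

def step(x, a):
    """Assumed transition function: constant drift plus the action's push."""
    return x + 0.6 + 0.8 * a

def cell_of(x):
    """Index of the partition cell containing x, or OUT if x is out of bounds."""
    if not (LOW <= x < HIGH):
        return OUT
    return int((x - LOW) // CELL)

def sampled_successors(cell, a, k=4):
    """Cells reached from k evenly spaced sample points inside `cell`."""
    lo = LOW + cell * CELL
    return {cell_of(step(lo + (i + 0.5) * CELL / k, a)) for i in range(k)}

def synthesize_shield(k=4):
    n = int((HIGH - LOW) / CELL)
    succ = {(c, a): sampled_successors(c, a, k)
            for c, a in product(range(n), ACTIONS)}
    safe = set(range(n))
    changed = True
    while changed:                # shrink the safe set until a fixed point
        changed = False
        for c in list(safe):
            # A cell stays safe only if some action keeps every sample safe.
            if not any(succ[c, a] <= safe for a in ACTIONS):
                safe.discard(c)
                changed = True
    # The shield allows, per safe cell, the actions whose samples stay safe.
    return {c: {a for a in ACTIONS if succ[c, a] <= safe} for c in safe}

shield = synthesize_shield()
```

Because the successor sets come from finitely many samples rather than from a sound over-approximation of the dynamics, a shield obtained this way can miss unsafe transitions, which is consistent with the abstract's remark that hard safety guarantees are out of reach.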

arXiv:2301.08553

Optimality-preserving Reduction of Chemical Reaction Networks

Authors: Kim G. Larsen, Daniele Toller, Mirco Tribastone, Max Tschaikowski, Andrea Vandin

Abstract: Across many disciplines, chemical reaction networks (CRNs) are an established population model defined as a system of coupled nonlinear ordinary differential equations. In many applications, for example, in systems biology and epidemiology, CRN parameters such as the kinetic reaction rates can be used as control inputs to steer the system toward a given target. Unfortunately, the resulting optimal control problem is nonlinear and therefore computationally very challenging. We address this issue by introducing an optimality-preserving reduction algorithm for CRNs. The algorithm partitions the original state variables into a reduced set of macro-variables for which one can define a reduced optimal control problem from which one can exactly recover the solution of the original control problem. Notably, the reduction algorithm runs with polynomial time complexity in the size of the CRN. We use this result to reduce reachability and control problems of large-scale protein-interaction networks and vaccination models with hundreds of thousands of state variables.

Submitted 20 January, 2023; originally announced January 2023.

arXiv:2104.12509

Learning Safe and Optimal Control Strategies for Storm Water Detention Ponds

Authors: Martijn A. Goorden, Kim G. Larsen, Jesper E. Nielsen, Thomas D. Nielsen, Michael R. Rasmussen, Jiri Srba

Abstract: Storm water detention ponds are used to manage the discharge of rainfall runoff from urban areas to nearby streams. Their purpose is to reduce the hydraulic impact and sediment loads of the receiving waters. Detention ponds are currently designed based on static controls: the output flow of a pond is capped at a fixed value. This is not optimal with respect to the current infrastructure capacity and for some detention ponds it might even violate current regulations set by the European Water Framework Directive. We apply formal methods to synthesize (i.e., derive automatically) a safe and optimal active controller. We model the storm water detention pond, including the urban catchment area and the rain forecasts, as a hybrid Markov decision process. Subsequently, we use the tool Uppaal Stratego to synthesize a control strategy minimizing the cost related to pollution (optimality) while guaranteeing no emergency overflow of the detention pond (safety). Simulation results for an existing pond show that Uppaal Stratego can learn optimal strategies that prevent emergency overflows, where the current static control is not always able to prevent them. At the same time, our approach can improve sedimentation during low rain periods.

Submitted 26 April, 2021; originally announced April 2021.

Comments: Accepted for the IFAC Conference on Analysis and Design of Hybrid Systems 2021. Full paper version

arXiv:2007.10539

Verification and Parameter Synthesis for Real-Time Programs using Refinement of Trace Abstraction

Authors: Franck Cassez, Peter Gjøl Jensen, Kim Guldstrand Larsen

Abstract: We address the safety verification and synthesis problems for real-time systems. We introduce real-time programs that are made of instructions that can perform assignments to discrete and real-valued variables. They are general enough to capture interesting classes of timed systems such as timed automata, stopwatch automata, time(d) Petri nets and hybrid automata. We propose a semi-algorithm using refinement of trace abstractions to solve both the reachability verification problem and the parameter synthesis problem for real-time programs. All of the algorithms proposed have been implemented and we have conducted a series of experiments, comparing the performance of our new approach to state-of-the-art tools in classical reachability, robustness analysis and parameter synthesis for timed systems. We show that our new method provides solutions to problems which are unsolvable by the current state-of-the-art tools.

Submitted 20 July, 2020; originally announced July 2020.

arXiv:1906.10640

doi 10.1007/978-3-030-30281-8_9

SOS: Safe, Optimal and Small Strategies for Hybrid Markov Decision Processes

Authors: Pranav Ashok, Jan Křetínský, Kim Guldstrand Larsen, Adrien Le Coënt, Jakob Haahr Taankvist, Maximilian Weininger

Abstract: For hybrid Markov decision processes, UPPAAL Stratego can compute strategies that are safe for a given safety property and (in the limit) optimal for a given cost function. Unfortunately, these strategies cannot be exported easily since they are computed as a very long list. In this paper, we demonstrate methods to learn compact representations of the strategies in the form of decision trees. These decision trees are much smaller, more understandable, and can easily be exported as code that can be loaded into embedded systems. Despite the size compression and actual differences to the original strategy, we provide guarantees on both safety and optimality of the decision-tree strategy. On top of that, we show how to obtain yet smaller representations, which are still guaranteed safe, but achieve a desired trade-off between size and optimality.

Submitted 25 June, 2019; originally announced June 2019.

arXiv:1809.07177

Parameter Synthesis Problems for one parametric clock Timed Automata

Authors: Liyun Dai, Taolue Chen, Zhiming Liu, Bican Xia, Naijun Zhan, Kim G. Larsen

Abstract: In this paper, we study the parameter synthesis problem for a class of parametric timed automata. The problem asks to construct the set of valuations of the parameters in the parametric timed automaton, referred to as the feasible region, under which the resulting timed automaton satisfies certain properties. We show that the parameter synthesis problem of parametric timed automata with only one parametric clock (unlimited concretely constrained clock) and arbitrarily many parameters is solvable when all the expressions are linear expressions. Moreover, the synthesis problem is solvable when the constraints are parameter polynomial inequalities, not just simple constraints, and the parameter domain is the nonnegative real numbers.

Submitted 15 September, 2018; originally announced September 2018.

Comments: 20 pages, 1 figure. arXiv admin note: substantial text overlap with arXiv:1808.06792

arXiv:1805.00847

Optimal and Robust Controller Synthesis: using Energy Timed Automata with Uncertainty

Authors: Giovanni Bacci, Patricia Bouyer, Uli Fahrenberg, Kim G. Larsen, Nicolas Markey, Pierre-Alain Reynier

Abstract: In this paper, we propose a novel framework for the synthesis of robust and optimal energy-aware controllers. The framework is based on energy timed automata, allowing for easy expression of timing constraints and variable energy rates. We prove decidability of the energy-constrained infinite-run problem in settings with both certainty and uncertainty of the energy rates. We also consider the optimization problem of identifying the minimal upper bound that will permit the existence of energy-constrained infinite runs. Our algorithms are based on quantifier elimination for linear real arithmetic. Using Mathematica and Mjollnir, we illustrate our framework through a real industrial example of a hydraulic oil pump. Compared with previous approaches our method is completely automated and provides improved results.

Submitted 3 May, 2018; v1 submitted 2 May, 2018; originally announced May 2018.

Comments: long version

arXiv:1207.1276

Controllers with Minimal Observation Power (Application to Timed Systems)

Authors: Peter Bulychev, Franck Cassez, Alexandre David, Kim G. Larsen, Jean-Francois Raskin, Pierre-Alain Reynier

Abstract: We consider the problem of controller synthesis under imperfect information in a setting where there is a set of available observable predicates equipped with a cost function. The problem that we address is the computation of a subset of predicates sufficient for control and whose cost is minimal. Our solution avoids a full exploration of all possible subsets of predicates and reuses some information between different iterations. We apply our approach to timed systems. We have developed a tool prototype and analyze the performance of our optimization algorithm on two case studies.

Submitted 2 July, 2012; originally announced July 2012.

Comments: This is the full version of the ATVA'12 paper

arXiv:1207.0922

MDM: A Mode Diagram Modeling Framework for Periodic Control Systems

Authors: Zheng Wang, Geguang Pu, Shenchao Qin, Jianwen Li, Kim G. Larsen, Jan Madsen, Bin Gu, Jifeng He

Abstract: Periodic control systems used in spacecraft and automotive systems are usually period-driven and can be decomposed into different modes, with each mode representing a system state observed from outside. Such systems may also involve intensive computing in their modes. Despite the fact that such control systems are widely used in the above-mentioned safety-critical embedded domains, there is a lack of domain-specific formal modelling languages for such systems in the relevant industry. To address this problem, we propose a formal visual modeling framework called MDM as a concise and precise way to specify and analyze such systems. To capture the temporal properties of periodic control systems, we provide, along with MDM, a property specification language based on interval logic for the description of concrete temporal requirements the engineers are concerned with. The statistical model checking technique can then be used to verify the MDM models against desired properties. To demonstrate the viability of our approach, we have applied our modelling framework to some real life case studies from industry and helped detect two design defects for some spacecraft control systems.

Submitted 4 July, 2012; originally announced July 2012.

arXiv:1203.6025

A "Hybrid" Approach for Synthesizing Optimal Controllers of Hybrid Systems: A Case Study of the Oil Pump Industrial Example

Authors: Hengjun Zhao, Naijun Zhan, Deepak Kapur, Kim G. Larsen

Abstract: In this paper, we propose an approach to reduce the optimal controller synthesis problem of hybrid systems to quantifier elimination; furthermore, we also show how to combine quantifier elimination with numerical computation in order to make it more scalable but at the same time, keep arising errors due to discretization manageable and within bounds. A major advantage of our approach is not only that it avoids errors due to numerical computation, but it also gives a better optimal controller. In order to illustrate our approach, we use the real industrial example of an oil pump provided by the German company HYDAC within the European project Quasimodo as a case study throughout this paper, and show that our method improves (up to 7.5%) the results reported in [3] based on game theory and model checking.

Submitted 2 June, 2012; v1 submitted 26 March, 2012; originally announced March 2012.

Showing 1–10 of 10 results for author: Larsen, K G