Search | arXiv e-print repository

doi 10.1007/978-3-031-46002-9_3

Shielded Reinforcement Learning for Hybrid Systems

Authors: Asger Horn Brorholt, Peter Gjøl Jensen, Kim Guldstrand Larsen, Florian Lorber, Christian Schilling

Abstract: Safe and optimal controller synthesis for switched-controlled hybrid systems, which combine differential equations and discrete changes of the system's state, is known to be intricately hard. Reinforcement learning has been leveraged to construct near-optimal controllers, but their behavior is not guaranteed to be safe, even when it is encouraged by reward engineering. One way of imposing safety t… ▽ More Safe and optimal controller synthesis for switched-controlled hybrid systems, which combine differential equations and discrete changes of the system's state, is known to be intricately hard. Reinforcement learning has been leveraged to construct near-optimal controllers, but their behavior is not guaranteed to be safe, even when it is encouraged by reward engineering. One way of imposing safety to a learned controller is to use a shield, which is correct by design. However, obtaining a shield for non-linear and hybrid environments is itself intractable. In this paper, we propose the construction of a shield using the so-called barbaric method, where an approximate finite representation of an underlying partition-based two-player safety game is extracted via systematically picked samples of the true transition function. While hard safety guarantees are out of reach, we experimentally demonstrate strong statistical safety guarantees with a prototype implementation and UPPAAL STRATEGO. Furthermore, we study the impact of the synthesized shield when applied as either a pre-shield (applied before learning a controller) or a post-shield (only applied after learning a controller). We experimentally demonstrate superiority of the pre-shielding approach. We apply our technique on a range of case studies, including two industrial examples, and further study post-optimization of the post-shielding approach. △ Less

Submitted 28 August, 2023; originally announced August 2023.

Journal ref: AISoLA 2023

arXiv:2302.04529 [pdf, other]

Timed I/O Automata: It is never too late to complete your timed specification theory

Authors: Martijn A. Goorden, Kim G. Larsen, Axel Legay, Florian Lorber, Ulrik Nyman, Andrzej Wasowski

Abstract: A specification theory combines notions of specifications and implementations with a satisfaction relation, a refinement relation and a set of operators supporting stepwise design. We develop a complete specification framework for real-time systems using Timed I/O Automata as the specification formalism, with the semantics expressed in terms of Timed I/O Transition Systems. We provide constructs f… ▽ More A specification theory combines notions of specifications and implementations with a satisfaction relation, a refinement relation and a set of operators supporting stepwise design. We develop a complete specification framework for real-time systems using Timed I/O Automata as the specification formalism, with the semantics expressed in terms of Timed I/O Transition Systems. We provide constructs for refinement, consistency checking, logical and structural composition, and quotient of specifications -- all indispensable ingredients of a compositional design methodology. The theory is backed by rigorous proofs and is being implemented in the open-source tool ECDAR. △ Less

Submitted 13 July, 2023; v1 submitted 9 February, 2023; originally announced February 2023.

Comments: Version submitted for review

arXiv:2006.16688 [pdf, other]

It's Time to Play Safe: Shield Synthesis for Timed Systems

Authors: Roderick Bloem, Peter Gjøl Jensen, Bettina Könighofer, Kim Guldstrand Larsen, Florian Lorber, Alexander Palmisano

Abstract: Erroneous behaviour in safety critical real-time systems may inflict serious consequences. In this paper, we show how to synthesize timed shields from timed safety properties given as timed automata. A timed shield enforces the safety of a running system while interfering with the system as little as possible. We present timed post-shields and timed pre-shields. A timed pre-shield is placed before… ▽ More Erroneous behaviour in safety critical real-time systems may inflict serious consequences. In this paper, we show how to synthesize timed shields from timed safety properties given as timed automata. A timed shield enforces the safety of a running system while interfering with the system as little as possible. We present timed post-shields and timed pre-shields. A timed pre-shield is placed before the system and provides a set of safe outputs. This set restricts the choices of the system. A timed post-shield is implemented after the system. It monitors the system and corrects the system's output only if necessary. We further extend the timed post-shield construction to provide a guarantee on the recovery phase, i.e., the time between a specification violation and the point at which full control can be handed back to the system. In our experimental results, we use timed post-shields to ensure the safety in a reinforcement learning setting for controlling a platoon of cars, during the learning and execution phase, and study the effect. △ Less

Submitted 30 June, 2020; originally announced June 2020.

Comments: Submitted to RV2020

arXiv:1902.03776 [pdf, other]

COST Action IC 1402 ArVI: Runtime Verification Beyond Monitoring -- Activity Report of Working Group 1

Authors: Wolfgang Ahrendt, Cyrille Artho, Christian Colombo, Yliès Falcone, Srdan Krstic, Martin Leucker, Florian Lorber, Joao Lourenço, Leonardo Mariani, César Sánchez, Gerardo Schneider, Volker Stolz

Abstract: This report presents the activities of the first working group of the COST Action ArVI, Runtime Verification beyond Monitoring. The report aims to provide an overview of some of the major core aspects involved in Runtime Verification. Runtime Verification is the field of research dedicated to the analysis of system executions. It is often seen as a discipline that studies how a system run satisfie… ▽ More This report presents the activities of the first working group of the COST Action ArVI, Runtime Verification beyond Monitoring. The report aims to provide an overview of some of the major core aspects involved in Runtime Verification. Runtime Verification is the field of research dedicated to the analysis of system executions. It is often seen as a discipline that studies how a system run satisfies or violates correctness properties. The report exposes a taxonomy of Runtime Verification (RV) presenting the terminology involved with the main concepts of the field. The report also develops the concept of instrumentation, the various ways to instrument systems, and the fundamental role of instrumentation in designing an RV framework. We also discuss how RV interplays with other verification techniques such as model-checking, deductive verification, model learning, testing, and runtime assertion checking. Finally, we propose challenges in monitoring quantitative and statistical data beyond detecting property violation. △ Less

Submitted 11 February, 2019; originally announced February 2019.

arXiv:1809.03889 [pdf, other]

doi 10.4204/EPTCS.277.11

Effortless Fault Localisation: Conformance Testing of Real-Time Systems in Ecdar

Authors: Tobias R. Gundersen, Florian Lorber, Ulrik Nyman, Christian Ovesen

Abstract: Model checking of real-time systems has evolved throughout the years. Recently, the model checker Ecdar, using timed I/O automata, was used to perform compositional verification. However, in order to fully integrate model checking of real-time systems into industrial development, we need a productive and reliable way to test if such a system conforms to its corresponding model. Hence, we present a… ▽ More Model checking of real-time systems has evolved throughout the years. Recently, the model checker Ecdar, using timed I/O automata, was used to perform compositional verification. However, in order to fully integrate model checking of real-time systems into industrial development, we need a productive and reliable way to test if such a system conforms to its corresponding model. Hence, we present an extension of Ecdar that integrates conformance testing into a new IDE that now features modelling, verification, and testing. The new tool uses model-based mutation testing, requiring only the model and the system under test, to locate faults and to prove the absence of certain types of faults. It supports testing using either real-time or simulated time. It parallelises test-case generation and test execution to provide a significant speed-up. We also introduce new mutation operators that improve the ability to detect and locate faults. Finally, we conduct a case study with 140 faulty systems, where Ecdar detects all faults. △ Less

Submitted 9 September, 2018; originally announced September 2018.

Comments: In Proceedings GandALF 2018, arXiv:1809.02416

Journal ref: EPTCS 277, 2018, pp. 147-160

arXiv:1808.07744 [pdf, other]

Learning Timed Automata via Genetic Programming

Authors: Martin Tappler, Bernhard K. Aichernig, Kim Guldstrand Larsen, Florian Lorber

Abstract: Model learning has gained increasing interest in recent years. It derives behavioural models from test data of black-box systems. The main advantage offered by such techniques is that they enable model-based analysis without access to the internals of a system. Applications range from fully automated testing over model checking to system understanding. Current work focuses on learning variations o… ▽ More Model learning has gained increasing interest in recent years. It derives behavioural models from test data of black-box systems. The main advantage offered by such techniques is that they enable model-based analysis without access to the internals of a system. Applications range from fully automated testing over model checking to system understanding. Current work focuses on learning variations of finite state machines. However, most techniques consider discrete time. In this paper, we present a method for learning timed automata, finite state machines extended with real-valued clocks. The learning method generates a model consistent with a set of timed traces collected by testing. This generation is based on genetic programming, a search-based technique for automatic program creation. We evaluate our approach on 44 timed systems, comprising four systems from the literature and 40 randomly generated examples. △ Less

Submitted 15 February, 2019; v1 submitted 23 August, 2018; originally announced August 2018.

Comments: added missing link

arXiv:1508.03575 [pdf, other]

Bounded Determinization of Timed Automata with Silent Transitions

Authors: Florian Lorber, Amnon Rosenmann, Dejan Nickovic, Bernhard Aichernig

Abstract: Deterministic timed automata are strictly less expressive than their non-deterministic counterparts, which are again less expressive than those with silent transitions. As a consequence, timed automata are in general non-determinizable. This is unfortunate since deterministic automata play a major role in model-based testing, observability and implementability. However, by bounding the length of t… ▽ More Deterministic timed automata are strictly less expressive than their non-deterministic counterparts, which are again less expressive than those with silent transitions. As a consequence, timed automata are in general non-determinizable. This is unfortunate since deterministic automata play a major role in model-based testing, observability and implementability. However, by bounding the length of the traces in the automaton, effective determinization becomes possible. We propose a novel procedure for bounded determinization of timed automata. The procedure unfolds the automata to bounded trees, removes all silent transitions and determinizes via disjunction of guards. The proposed algorithms are optimized to the bounded setting and thus are more efficient and can handle a larger class of timed automata than the general algorithms. The approach is implemented in a prototype tool and evaluated on several examples. To our best knowledge, this is the first implementation of this type of procedure for timed automata. △ Less

Submitted 14 August, 2015; originally announced August 2015.

Comments: 25 pages

Showing 1–7 of 7 results for author: Lorber, F