Showing 1–12 of 12 results for author: Tappler, M

  1. arXiv:2306.17204  [pdf, other]

    cs.LG cs.FL

    Learning Environment Models with Continuous Stochastic Dynamics

    Authors: Martin Tappler, Edi Muškardin, Bernhard K. Aichernig, Bettina Könighofer

    Abstract: Solving control tasks in complex environments automatically through learning offers great potential. While contemporary techniques from deep reinforcement learning (DRL) provide effective solutions, their decision-making is not transparent. We aim to provide insights into the decisions faced by the agent by learning an automaton model of environmental behavior under the control of an agent. Howeve…

    Submitted 29 June, 2023; originally announced June 2023.

  2. arXiv:2306.16854  [pdf, other]

    cs.LG

    On the Relationship Between RNN Hidden State Vectors and Semantic Ground Truth

    Authors: Edi Muškardin, Martin Tappler, Ingo Pill, Bernhard K. Aichernig, Thomas Pock

    Abstract: We examine the assumption that the hidden-state vectors of recurrent neural networks (RNNs) tend to form clusters of semantically similar vectors, which we dub the clustering hypothesis. While this hypothesis has been assumed in the analysis of RNNs in recent years, its validity has not been studied thoroughly on modern neural network architectures. We examine the clustering hypothesis in the cont…

    Submitted 29 June, 2023; originally announced June 2023.

  3. arXiv:2212.01861  [pdf, other]

    cs.LG cs.LO

    Online Shielding for Reinforcement Learning

    Authors: Bettina Könighofer, Julian Rudolf, Alexander Palmisano, Martin Tappler, Roderick Bloem

    Abstract: Besides the recent impressive results on reinforcement learning (RL), safety is still one of the major research challenges in RL. RL is a machine-learning approach to determine near-optimal policies in Markov decision processes (MDPs). In this paper, we consider the setting where the safety-relevant fragment of the MDP together with a temporal logic safety specification is given and many safety vi…

    Submitted 4 December, 2022; originally announced December 2022.

    Comments: arXiv admin note: substantial text overlap with arXiv:2012.09539

  4. arXiv:2212.01838  [pdf, other]

    cs.LG cs.LO

    Automata Learning meets Shielding

    Authors: Martin Tappler, Stefan Pranger, Bettina Könighofer, Edi Muškardin, Roderick Bloem, Kim Larsen

    Abstract: Safety is still one of the major research challenges in reinforcement learning (RL). In this paper, we address the problem of how to avoid safety violations of RL agents during exploration in probabilistic and partially unknown environments. Our approach combines automata learning for Markov Decision Processes (MDPs) and shield synthesis in an iterative approach. Initially, the MDP representing th…

    Submitted 4 December, 2022; originally announced December 2022.

  5. arXiv:2206.11708  [pdf, other]

    cs.LG cs.AI

    Reinforcement Learning under Partial Observability Guided by Learned Environment Models

    Authors: Edi Muškardin, Martin Tappler, Bernhard K. Aichernig, Ingo Pill

    Abstract: In practical applications, we can rarely assume full observability of a system's environment, despite such knowledge being important for determining a reactive control system's precise interaction with its environment. Therefore, we propose an approach for reinforcement learning (RL) in partially observable environments. While assuming that the environment behaves like a partially observable Marko…

    Submitted 23 June, 2022; originally announced June 2022.

  6. arXiv:2205.04887  [pdf, other]

    cs.LG cs.AI cs.SE

    Search-Based Testing of Reinforcement Learning

    Authors: Martin Tappler, Filip Cano Córdoba, Bernhard K. Aichernig, Bettina Könighofer

    Abstract: Evaluation of deep reinforcement learning (RL) is inherently challenging. Especially the opaqueness of learned policies and the stochastic nature of both agents and environments make testing the behavior of deep RL agents difficult. We present a search-based testing framework that enables a wide range of novel analysis capabilities for evaluating the safety and performance of deep RL agents. For s…

    Submitted 14 May, 2022; v1 submitted 7 May, 2022; originally announced May 2022.

    Comments: 11 pages, 15 figures, Accepted at IJCAI-ECAI 2022 (Main Track)

  7. arXiv:2012.09539  [pdf, other]

    cs.LO

    Online Shielding for Stochastic Systems

    Authors: Bettina Könighofer, Julian Rudolf, Alexander Palmisano, Martin Tappler, Roderick Bloem

    Abstract: In this paper, we propose a method to develop trustworthy reinforcement learning systems. To ensure safety especially during exploration, we automatically synthesize a correct-by-construction runtime enforcer, called a shield, that blocks all actions that are unsafe with respect to a temporal logic specification from the agent. Our main contribution is a new synthesis algorithm for computing the s…

    Submitted 17 December, 2020; originally announced December 2020.

    Comments: 18 Pages, 6 Figures, under submission

  8. arXiv:2010.03842  [pdf, other]

    cs.LO

    Adaptive Shielding under Uncertainty

    Authors: Stefan Pranger, Bettina Könighofer, Martin Tappler, Martin Deixelberger, Nils Jansen, Roderick Bloem

    Abstract: This paper targets control problems that exhibit specific safety and performance requirements. In particular, the aim is to ensure that an agent, operating under uncertainty, will at runtime strictly adhere to such requirements. Previous works create so-called shields that correct an existing controller for the agent if it is about to take unbearable safety risks. However, so far, shields do not c…

    Submitted 8 October, 2020; originally announced October 2020.

    Comments: 8 pages, 6 figures, 1 table

  9. arXiv:1907.04708  [pdf, other]

    cs.LG stat.ML

    Learning a Behavior Model of Hybrid Systems Through Combining Model-Based Testing and Machine Learning (Full Version)

    Authors: Bernhard K. Aichernig, Roderick Bloem, Masoud Ebrahimi, Martin Horn, Franz Pernkopf, Wolfgang Roth, Astrid Rupp, Martin Tappler, Markus Tranninger

    Abstract: Models play an essential role in the design process of cyber-physical systems. They form the basis for simulation and analysis and help in identifying design problems as early as possible. However, the construction of models that comprise physical and digital behavior is challenging. Therefore, there is considerable interest in learning such hybrid behavior by means of machine learning which requi…

    Submitted 10 July, 2019; originally announced July 2019.

    Comments: This is an extended version of the conference paper "Learning a Behavior Model of Hybrid Systems Through Combining Model-Based Testing and Machine Learning" accepted for presentation at IFIP-ICTSS 2019, the 31st International Conference on Testing Software and Systems in Paris, France

  10. arXiv:1906.12239  [pdf, ps, other]

    cs.LG stat.ML

    L*-Based Learning of Markov Decision Processes (Extended Version)

    Authors: Martin Tappler, Bernhard K. Aichernig, Giovanni Bacci, Maria Eichlseder, Kim G. Larsen

    Abstract: Automata learning techniques automatically generate system models from test observations. These techniques usually fall into two categories: passive and active. Passive learning uses a predetermined data set, e.g., system logs. In contrast, active learning actively queries the system under learning, which is considered more efficient. An influential active learning technique is Angluin's L* algo…

    Submitted 28 June, 2019; originally announced June 2019.

    Comments: An extended version of a conference paper accepted for presentation at FM 2019, the 23rd International Symposium on Formal Methods

  11. Model-Based Testing IoT Communication via Active Automata Learning

    Authors: Martin Tappler, Bernhard K. Aichernig, Roderick Bloem

    Abstract: This paper presents a learning-based approach to detecting failures in reactive systems. The technique is based on inferring models of multiple implementations of a common specification which are pair-wise cross-checked for equivalence. Any counterexample to equivalence is flagged as suspicious and has to be analysed manually. Hence, it is possible to find possible failures in a semi-automatic way…

    Submitted 15 April, 2019; originally announced April 2019.

  12. arXiv:1808.07744  [pdf, other]

    cs.SE

    Learning Timed Automata via Genetic Programming

    Authors: Martin Tappler, Bernhard K. Aichernig, Kim Guldstrand Larsen, Florian Lorber

    Abstract: Model learning has gained increasing interest in recent years. It derives behavioural models from test data of black-box systems. The main advantage offered by such techniques is that they enable model-based analysis without access to the internals of a system. Applications range from fully automated testing over model checking to system understanding. Current work focuses on learning variations o…

    Submitted 15 February, 2019; v1 submitted 23 August, 2018; originally announced August 2018.

    Comments: added missing link