Showing 1–2 of 2 results for author: Shahrooei, Z
-
Optimal Transport-Assisted Risk-Sensitive Q-Learning
Authors:
Zahra Shahrooei,
Ali Baheri
Abstract:
The primary goal of reinforcement learning is to develop decision-making policies that prioritize optimal performance without considering risk or safety. In contrast, safe reinforcement learning aims to mitigate or avoid unsafe states. This paper presents a risk-sensitive Q-learning algorithm that leverages optimal transport theory to enhance the agent safety. By integrating optimal transport into…
▽ More
The primary goal of reinforcement learning is to develop decision-making policies that prioritize optimal performance without considering risk or safety. In contrast, safe reinforcement learning aims to mitigate or avoid unsafe states. This paper presents a risk-sensitive Q-learning algorithm that leverages optimal transport theory to enhance the agent safety. By integrating optimal transport into the Q-learning framework, our approach seeks to optimize the policy's expected return while minimizing the Wasserstein distance between the policy's stationary distribution and a predefined risk distribution, which encapsulates safety preferences from domain experts. We validate the proposed algorithm in a Gridworld environment. The results indicate that our method significantly reduces the frequency of visits to risky states and achieves faster convergence to a stable policy compared to the traditional Q-learning algorithm.
△ Less
Submitted 17 June, 2024;
originally announced June 2024.
-
Falsification of Learning-Based Controllers through Multi-Fidelity Bayesian Optimization
Authors:
Zahra Shahrooei,
Mykel J. Kochenderfer,
Ali Baheri
Abstract:
Simulation-based falsification is a practical testing method to increase confidence that the system will meet safety requirements. Because full-fidelity simulations can be computationally demanding, we investigate the use of simulators with different levels of fidelity. As a first step, we express the overall safety specification in terms of environmental parameters and structure this safety speci…
▽ More
Simulation-based falsification is a practical testing method to increase confidence that the system will meet safety requirements. Because full-fidelity simulations can be computationally demanding, we investigate the use of simulators with different levels of fidelity. As a first step, we express the overall safety specification in terms of environmental parameters and structure this safety specification as an optimization problem. We propose a multi-fidelity falsification framework using Bayesian optimization, which is able to determine at which level of fidelity we should conduct a safety evaluation in addition to finding possible instances from the environment that cause the system to fail. This method allows us to automatically switch between inexpensive, inaccurate information from a low-fidelity simulator and expensive, accurate information from a high-fidelity simulator in a cost-effective way. Our experiments on various environments in simulation demonstrate that multi-fidelity Bayesian optimization has falsification performance comparable to single-fidelity Bayesian optimization but with much lower cost.
△ Less
Submitted 28 April, 2023; v1 submitted 28 December, 2022;
originally announced December 2022.