-
Physics-informed RL for Maximal Safety Probability Estimation
Authors:
Hikaru Hoshino,
Yorie Nakahira
Abstract:
Accurate risk quantification and reachability analysis are crucial for safe control and learning, but sampling from rare events, risky states, or long-term trajectories can be prohibitively costly. Motivated by this, we study how to estimate the long-term safety probability of maximally safe actions without sufficient coverage of samples from risky states and long-term trajectories. The use of maximal safety probability in control and learning is expected to avoid conservative behaviors due to over-approximation of risk. Here, we first show that long-term safety probability, which is multiplicative in time, can be converted into additive costs and be solved using standard reinforcement learning methods. We then derive this probability as solutions of partial differential equations (PDEs) and propose a Physics-Informed Reinforcement Learning (PIRL) algorithm. The proposed method can learn using sparse rewards because the physics constraints help propagate risk information through neighbors. This suggests that, for the purpose of extracting more information for efficient learning, physics constraints can serve as an alternative to reward shaping. The proposed method can also estimate long-term risk using short-term samples and deduce the risk of unsampled states. This feature is in stark contrast with unconstrained deep RL, which demands sufficient data coverage. These merits of the proposed method are demonstrated in numerical simulations.
Submitted 24 March, 2024;
originally announced March 2024.
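The abstract above notes that a safety probability that is multiplicative in time can be recast as an additive-return problem. The following is a minimal sketch of that idea on a small finite Markov decision process, not the authors' construction: it assumes a hypothetical transition tensor `P` of shape (actions, states, states) and a boolean `safe` mask, treats unsafe states as absorbing, and pays a single terminal reward of 1 for ending in the safe set. All function names are illustrative.

```python
import itertools
import numpy as np

def max_safety_probability(P, safe, horizon):
    """Maximal probability of staying in the safe set for `horizon` steps.

    P: array (A, S, S) of transition matrices, one per action (assumed).
    safe: boolean array (S,) marking safe states (assumed).
    Returns an array (S,) indexed by the initial state.
    """
    safe = np.asarray(safe, dtype=float)
    V = safe.copy()  # terminal reward: 1 if the final state is safe
    for _ in range(horizon):
        # Bellman backup with zero running reward; multiplying by `safe`
        # makes unsafe states absorbing with value 0.
        V = safe * np.max(P @ V, axis=0)
    return V

def safety_probability_by_paths(P1, safe, x0, horizon):
    """Same quantity for a single action, from the multiplicative definition:
    expectation of the product of safe-set indicators along each path."""
    n_states = P1.shape[0]
    total = 0.0
    for path in itertools.product(range(n_states), repeat=horizon):
        states = (x0,) + path
        prob = np.prod([P1[s, t] for s, t in zip(states[:-1], states[1:])])
        indicator = np.prod([float(safe[s]) for s in states])
        total += prob * indicator
    return total
```

For a single action the two functions should agree, which gives a quick sanity check that the additive recursion reproduces the product-of-indicators definition.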
-
Iterative Linear Quadratic Regulator With Variational Equation-Based Discretization
Authors:
Katsuya Shigematsu,
Hikaru Hoshino,
Eiko Furutani
Abstract:
This paper discusses discretization methods for implementing nonlinear model predictive controllers using Iterative Linear Quadratic Regulator (ILQR). Finite-difference approximations are mostly used to derive a discrete-time state equation from the original continuous-time model. However, the timestep of the discretization is sometimes restricted to be small to suppress the approximation error. In this paper, we propose to use the variational equation for deriving linearizations of the discretized system required in ILQR algorithms, which allows accurate computation regardless of the timestep. Numerical simulations of the swing-up control of an inverted pendulum demonstrate the effectiveness of this method. By relaxing the stringent requirement on the size of the timestep, the use of the variational equation can improve control performance by increasing the number of ILQR iterations possible at each timestep in real-time computation.
Submitted 18 February, 2024;
originally announced February 2024.
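To illustrate what a variational-equation-based linearization looks like, here is a minimal sketch under stated assumptions, not the paper's implementation. It uses a hypothetical damped pendulum model with made-up parameters, a zero-order-hold input, and an RK4 integrator. The state sensitivity `Phi` follows dPhi/dt = (df/dx) Phi with Phi(0) = I, and the input sensitivity `Gamma` follows dGamma/dt = (df/dx) Gamma + df/du with Gamma(0) = 0; at the end of the step they play the role of the discrete-time matrices A_d and B_d used by ILQR.

```python
import numpy as np

def f(x, u):
    # Hypothetical pendulum dynamics (parameters chosen for illustration).
    theta, omega = x
    return np.array([omega, -9.81 * np.sin(theta) - 0.1 * omega + u])

def jac_x(x, u):
    theta, _ = x
    return np.array([[0.0, 1.0], [-9.81 * np.cos(theta), -0.1]])

def jac_u(x, u):
    return np.array([[0.0], [1.0]])

def augmented_rhs(x, Phi, Gamma, u):
    # State equation together with the two variational equations.
    A = jac_x(x, u)
    return f(x, u), A @ Phi, A @ Gamma + jac_u(x, u)

def rk4_step(state, u, h):
    def shifted(k, scale):
        return tuple(s + scale * ki for s, ki in zip(state, k))
    k1 = augmented_rhs(*state, u)
    k2 = augmented_rhs(*shifted(k1, 0.5 * h), u)
    k3 = augmented_rhs(*shifted(k2, 0.5 * h), u)
    k4 = augmented_rhs(*shifted(k3, h), u)
    return tuple(s + h / 6.0 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def step_with_sensitivities(x0, u, dt, substeps=10):
    """Return (x_next, A_d, B_d) for one control interval of length dt."""
    state = (np.asarray(x0, dtype=float), np.eye(2), np.zeros((2, 1)))
    for _ in range(substeps):
        state = rk4_step(state, u, dt / substeps)
    return state
```

Because the sensitivities are integrated along the trajectory, `dt` can be chosen from the control requirements rather than from the accuracy of a one-step finite-difference linearization such as A_d ≈ I + dt · df/dx.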
-
Screening Curve Method for Economic Analysis of Household Solar Energy Self-Consumption
Authors:
Hikaru Hoshino,
Yosuke Irie,
Eiko Furutani
Abstract:
The profitability of solar energy self-consumption in households, the so-called photovoltaic (PV) self-consumption, is expected to boost the deployment of PV and battery storage systems. This paper develops a novel method for economic analysis of PV self-consumption using battery storage based on an extension of the Screening Curve Method (SCM). The SCM enables quick and intuitive estimation of the least-cost generation mix for a target load curve and has been used for generation planning for bulk power systems. In this paper, we generalize the existing SCM framework to take into account the intermittent nature of renewable energy sources and apply it to the problem of optimal sizing of PV and battery storage systems for a household. Numerical studies are provided to verify the estimation accuracy of the proposed SCM and to illustrate its effectiveness in a sensitivity analysis, owing to its ability to show intuitive plots of cost curves that help researchers and policy-makers understand the reasons behind the optimization results.
Submitted 12 January, 2024; v1 submitted 31 July, 2023;
originally announced August 2023.
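For readers unfamiliar with the classical SCM that this paper extends, the following is a minimal sketch of the textbook version only; it does not include the paper's generalization to intermittent sources or storage. Each technology is assumed to have an annualized fixed cost per kW and a variable cost per kWh, so serving a 1 kW slice of load for `d` hours costs `fixed + variable * d`. Each slice of the load duration curve is assigned to the cheapest technology. The function name and all cost figures are hypothetical.

```python
import numpy as np

def screening_curve_mix(load, fixed, variable):
    """Least-cost capacity (kW) per technology for an hourly load series.

    load: hourly demand in kW (assumed input).
    fixed: fixed cost per kW for each technology (assumed units).
    variable: variable cost per kWh for each technology (assumed units).
    """
    fixed = np.asarray(fixed, dtype=float)
    variable = np.asarray(variable, dtype=float)
    load_desc = np.sort(np.asarray(load, dtype=float))[::-1]  # load duration curve
    n = len(load_desc)
    # Thickness (kW) of each horizontal slice of the load duration curve.
    widths = load_desc - np.append(load_desc[1:], 0.0)
    # The k-th slice from the top is needed for k hours.
    durations = np.arange(1, n + 1)
    # Screening curves: cost of serving 1 kW for each duration, per technology.
    cost = fixed[:, None] + variable[:, None] * durations[None, :]
    cheapest = np.argmin(cost, axis=0)
    return np.bincount(cheapest, weights=widths, minlength=len(fixed))

# Hypothetical example: a baseload unit (index 0) and a peaking unit (index 1).
capacity = screening_curve_mix(np.arange(1, 25), fixed=[98.0, 20.0], variable=[1.0, 5.0])
```

In this toy case the two cost lines cross at 19.5 hours, so slices needed for fewer hours go to the peaking unit and the rest to the baseload unit.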
-
Model Predictive Control of Smart Districts Participating in Frequency Regulation Market: A Case Study of Using Heating Network Storage
Authors:
Hikaru Hoshino,
T. John Koo,
Yun-Chung Chu,
Yoshihiko Susuki
Abstract:
Flexibility provided by Combined Heat and Power (CHP) units in district heating networks is an important means to cope with increasing penetration of intermittent renewable energy resources, and various methods have been proposed to exploit thermal storage tanks installed in these networks. This paper studies a novel problem motivated by an example of district heating and cooling networks in Japan, where high-temperature steam is used as the heating medium. In steam-based networks, storage tanks are usually absent, and there is a strong need to utilize the thermal inertia of the pipeline network as storage. However, this type of use of a heating network directly affects the operating condition of the network, and assuring safety and supply quality at the use side is an open problem. To address this, we formulate a novel control problem to utilize CHP units in the frequency regulation market while satisfying physical constraints on a steam network described by a nonlinear model capturing the dynamics of heat flows and heat accumulation in the network. Furthermore, a Model Predictive Control (MPC) framework is proposed to solve this problem. By consistently combining several nonlinear control techniques, a computationally efficient MPC controller is obtained and shown to work in real-time.
Submitted 11 May, 2023;
originally announced May 2023.
-
Simultaneous Modeling of In Vivo and In Vitro Effects of Nondepolarizing Neuromuscular Blocking Drugs
Authors:
Hikaru Hoshino,
Eiko Furutani
Abstract:
Nondepolarizing neuromuscular blocking drugs (NDNBs) are clinically used to produce muscle relaxation during general anesthesia. This paper explores a suitable model structure to simultaneously describe in vivo and in vitro effects of three clinically used NDNBs: cisatracurium, vecuronium, and rocuronium. In particular, it is discussed how to reconcile an apparent discrepancy that rocuronium is less potent at inducing muscle relaxation in vivo than predicted from in vitro experiments. We develop a framework for estimating model parameters from published in vivo and in vitro data, and thereby compare the descriptive abilities of several candidate models. It is found that modeling the dynamic effect of activation of acetylcholine receptors (AChRs) is essential for describing in vivo experimental results, and a cyclic gating scheme of AChRs is suggested to be appropriate. Furthermore, it is shown that the above discrepancy in experimental results can be resolved by considering two facts: the in vivo concentration of ACh is low enough that only a fraction of AChRs is activated, whereas more than 95% of AChRs are activated during in vitro experiments, and the site-selectivity is smaller for rocuronium than for cisatracurium and vecuronium.
Submitted 12 January, 2024; v1 submitted 11 May, 2023;
originally announced May 2023.