Search | arXiv e-print repository

Safe Exploration Method for Reinforcement Learning under Existence of Disturbance

Authors: Yoshihiro Okawa, Tomotake Sasaki, Hitoshi Yanami, Toru Namerikawa

Abstract: Recent rapid developments in reinforcement learning algorithms have been giving us novel possibilities in many fields. However, due to their exploring property, we have to take the risk into consideration when we apply those algorithms to safety-critical problems especially in real environments. In this study, we deal with a safe exploration problem in reinforcement learning under the existence of disturbance. We define the safety during learning as satisfaction of the constraint conditions explicitly defined in terms of the state and propose a safe exploration method that uses partial prior knowledge of a controlled object and disturbance. The proposed method assures the satisfaction of the explicit state constraints with a pre-specified probability even if the controlled object is exposed to a stochastic disturbance following a normal distribution. As theoretical results, we introduce sufficient conditions to construct conservative inputs not containing an exploring aspect used in the proposed method and prove that the safety in the above explained sense is guaranteed with the proposed method. Furthermore, we illustrate the validity and effectiveness of the proposed method through numerical simulations of an inverted pendulum and a four-bar parallel link robot manipulator.

Submitted 20 March, 2023; v1 submitted 30 September, 2022; originally announced September 2022.

Comments: Accepted by the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECMLPKDD) 2022. The Version of Record is available at https://doi.org/10.1007/978-3-031-26412-2_9
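
To make the idea of satisfying a state constraint "with a pre-specified probability" under normally distributed disturbance concrete, here is a minimal sketch. It is not the authors' method: it assumes a hypothetical scalar linear system x_next = a*x + b*u + w with w ~ N(0, sigma^2), b > 0, and a single upper bound x_max. The function names (`max_safe_input`, `clip_exploration`) and all parameters are illustrative assumptions.

```python
from statistics import NormalDist


def max_safe_input(x, a, b, sigma, x_max, eta):
    """Largest u with P(a*x + b*u + w <= x_max) >= eta, where w ~ N(0, sigma^2).

    Assumes a hypothetical scalar linear model with b > 0 and 0 < eta < 1.
    """
    if b <= 0:
        raise ValueError("this sketch assumes b > 0")
    # Safety margin: the eta-quantile of the disturbance.
    margin = sigma * NormalDist().inv_cdf(eta)
    return (x_max - margin - a * x) / b


def clip_exploration(u_explore, x, a, b, sigma, x_max, eta):
    """Keep an exploratory input only if it respects the chance constraint."""
    return min(u_explore, max_safe_input(x, a, b, sigma, x_max, eta))


# Usage: a large exploratory input is clipped to the largest safe value.
u = clip_exploration(u_explore=2.0, x=0.5, a=1.0, b=1.0,
                     sigma=0.1, x_max=1.0, eta=0.95)
# Probability that the next state stays below x_max under the clipped input.
p_safe = NormalDist(mu=1.0 * 0.5 + 1.0 * u, sigma=0.1).cdf(1.0)
```

The margin grows with both the disturbance standard deviation and the required probability, so a stricter safety level leaves less room for exploration.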

arXiv:2002.10515 [pdf, other]

Improving Rate of Convergence via Gain Adaptation in Multi-Agent Distributed ADMM Framework

Authors: Towfiq Rahman, Zhihua Qu, Toru Namerikawa

Abstract: In this paper, the alternating direction method of multipliers (ADMM) is investigated for distributed optimization problems in a networked multi-agent system. In particular, a new adaptive-gain ADMM algorithm is derived in a closed form and under the standard convex property in order to greatly speed up convergence of ADMM-based distributed optimization. Using Lyapunov direct approach, the proposed solution embeds control gains into weighted network matrix among the agents and uses those weights as adaptive penalty gains in the augmented Lagrangian. It is shown that the proposed closed loop gain adaptation scheme significantly improves the convergence time of underlying ADMM optimization. Convergence analysis is provided and simulation results are included to demonstrate the effectiveness of the proposed scheme.

Submitted 24 February, 2020; originally announced February 2020.
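
For readers unfamiliar with ADMM in a multi-agent setting, the following minimal sketch shows the standard global-consensus form with a fixed penalty gain. It does not implement the adaptive-gain scheme the abstract describes. It assumes each agent i holds a hypothetical private cost f_i(x) = 0.5*(x - a_i)^2, so the consensus optimum is the mean of the a_i; the function name `consensus_admm` and the parameter `rho` are illustrative.

```python
def consensus_admm(a, rho=1.0, iters=100):
    """Scaled-form consensus ADMM for f_i(x) = 0.5*(x - a_i)^2.

    `rho` is a fixed penalty gain (an assumption of this sketch; the
    listed paper adapts its gains instead).
    """
    n = len(a)
    u = [0.0] * n  # scaled dual variables, one per agent
    z = 0.0        # shared consensus variable
    for _ in range(iters):
        # Local step: each agent minimizes its cost plus the penalty term.
        x = [(a_i + rho * (z - u_i)) / (1.0 + rho) for a_i, u_i in zip(a, u)]
        # Consensus step: average the agents' estimates plus duals.
        z = sum(x_i + u_i for x_i, u_i in zip(x, u)) / n
        # Dual step: accumulate each agent's disagreement with consensus.
        u = [u_i + x_i - z for u_i, x_i in zip(u, x)]
    return z


z_star = consensus_admm([1.0, 2.0, 6.0])
```

In this toy problem the penalty gain directly sets how fast the consensus variable contracts toward the optimum, which is the kind of sensitivity that motivates adapting the gain.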

arXiv:1907.08289 [pdf, other]

doi 10.1109/TPWRS.2020.3009630

Data-Driven Wide-Area Control Design of Power System Using the Passivity Shortage Framework

Authors: Ying Xu, Zhihua Qu, Roland Harvey, Toru Namerikawa

Abstract: A novel wide-area control design is presented to mitigate inter-area power frequency oscillations. A large-scale power system is decomposed into a network of passivity-short subsystems whose nonlinear interconnections have a state-dependent affine form, and by utilizing the passivity shortage framework a two-level design procedure is developed. At the lower level, any generator control can be viewed as one that makes the generator passivity-short and $L_2$ stable, and the stability impact of the lower-level control on the overall system can be characterized in terms of two parameters. While the system is nonlinear, the impact parameters can be optimized by solving a data-driven matrix inequality (DMI), and the high-level wide-area control is then designed by solving another Lyapunov matrix inequality in terms of the design parameters. The proposed methodology makes the design modular, and the resulting control is adaptive with respect to operating conditions of the power system. A test system is used to illustrate the proposed design, including DMI and the wide-area control, and simulation results demonstrate effectiveness in damping out inter-area oscillations.

Submitted 18 July, 2019; originally announced July 2019.

Comments: 12 pages, 9 figures

arXiv:1310.4962 [pdf, ps, other]

A New Dynamic Pricing Model based on Convex Hull Pricing

Authors: Naoki Ito, Akiko Takeda, Toru Namerikawa

Abstract: This paper presents a new dynamic pricing model (a.k.a. real-time pricing) that reflects startup costs of generators. Dynamic pricing, which is a method to control demand by pricing electricity at hourly (or more often) intervals, has been studied by many researchers. They assume that the cost functions of suppliers are convex, although they may be nonconvex because of the startup costs of generators in practice. We provide a dynamic pricing model that takes into account such cost functions within the settings of unit commitment problems (UCPs). Our model gives convex hull price (CHP), which has not been used in the context of dynamic pricing, though it is known that the CHP minimizes the uplift payment which is disadvantageous to suppliers for a given demand. In addition, we apply an iterative algorithm based on the subgradient method to solve our model. Numerical experiments show the efficiency of our model on reducing uplift payments. The prices determined by our algorithm give sufficiently small uplift payments in a realistic computational time.

Submitted 4 March, 2014; v1 submitted 18 October, 2013; originally announced October 2013.
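
As a rough illustration of how a subgradient method can find a price when generators have startup costs, the sketch below solves a single-period toy problem. It is not the paper's model or algorithm. It relies on the common characterization of a convex hull price as a maximizer of the Lagrangian dual obtained by relaxing the demand-balance constraint. The generator data, step-size rule, and function names are all hypothetical.

```python
def dispatch(price, gens):
    """Each generator's profit-maximizing output at the given price.

    `gens` is a list of (marginal_cost, startup_cost, capacity) tuples.
    A generator runs at full capacity only if doing so covers its
    startup cost; otherwise it stays off.
    """
    outputs = []
    for cost, startup, cap in gens:
        profit = (price - cost) * cap - startup
        outputs.append(cap if profit > 0 else 0.0)
    return outputs


def dual_subgradient_price(gens, demand, iters=2000, step0=0.5):
    """Maximize the Lagrangian dual over the price by subgradient ascent."""
    price = 0.0
    for k in range(1, iters + 1):
        supply = sum(dispatch(price, gens))
        # (demand - supply) is a subgradient of the dual at this price;
        # the diminishing step step0 / k is an illustrative choice.
        price = max(0.0, price + (step0 / k) * (demand - supply))
    return price


# Two hypothetical generators: (marginal cost, startup cost, capacity).
gens = [(10.0, 100.0, 50.0), (20.0, 50.0, 50.0)]
price = dual_subgradient_price(gens, demand=70.0)
```

In this toy instance the second generator breaks even at a price of 21 (marginal cost 20 plus startup cost 50 spread over capacity 50), and the iterates settle near that value because demand can only be met once the second generator is willing to run.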

Showing 1–4 of 4 results for author: Namerikawa, T