Showing 1–39 of 39 results for author: Li, S E

  1. arXiv:2405.09317  [pdf, other]

    eess.SY

    Controllability Test for Nonlinear Datatic Systems

    Authors: Yujie Yang, Letian Tao, Likun Wang, Shengbo Eben Li

    Abstract: Controllability is a fundamental property of control systems, serving as the prerequisite for controller design. While controllability test is well established in modelic (i.e., model-driven) control systems, extending it to datatic (i.e., data-driven) control systems is still a challenging task due to the absence of system models. In this study, we propose a general controllability test method fo…

    Submitted 15 May, 2024; originally announced May 2024.

  2. arXiv:2404.10064  [pdf, other]

    eess.SY

    The Feasibility of Constrained Reinforcement Learning Algorithms: A Tutorial Study

    Authors: Yujie Yang, Zhilong Zheng, Shengbo Eben Li, Masayoshi Tomizuka, Changliu Liu

    Abstract: Satisfying safety constraints is a priority concern when solving optimal control problems (OCPs). Due to the existence of infeasibility phenomenon, where a constraint-satisfying solution cannot be found, it is necessary to identify a feasible region before implementing a policy. Existing feasibility theories built for model predictive control (MPC) only consider the feasibility of optimal policy.…

    Submitted 15 April, 2024; originally announced April 2024.

  3. arXiv:2404.00481  [pdf, other]

    stat.ML cs.LG eess.SY

    Convolutional Bayesian Filtering

    Authors: Wenhan Cao, Shiqi Liu, Chang Liu, Zeyu He, Stephen S. -T. Yau, Shengbo Eben Li

    Abstract: Bayesian filtering serves as the mainstream framework of state estimation in dynamic systems. Its standard version utilizes total probability rule and Bayes' law alternatively, where how to define and compute conditional probability is critical to state distribution inference. Previously, the conditional probability is assumed to be exactly known, which represents a measure of the occurrence proba…

    Submitted 30 March, 2024; originally announced April 2024.

  4. arXiv:2403.01768  [pdf, other]

    eess.SY cs.AI

    Canonical Form of Datatic Description in Control Systems

    Authors: Guojian Zhan, Ziang Zheng, Shengbo Eben Li

    Abstract: The design of feedback controllers is undergoing a paradigm shift from modelic (i.e., model-driven) control to datatic (i.e., data-driven) control. Canonical form of state space model is an important concept in modelic control systems, exemplified by Jordan form, controllable form and observable form, whose purpose is to facilitate system analysis and controller synthesis. In the realm of datatic…

    Submitted 4 March, 2024; originally announced March 2024.

  5. arXiv:2401.16793  [pdf, other]

    eess.SY

    On the Stability of Datatic Control Systems

    Authors: Yujie Yang, Zhilong Zheng, Shengbo Eben Li

    Abstract: The development of feedback controllers is undergoing a paradigm shift from modelic (model-driven) control to datatic (data-driven) control. Stability, as a fundamental property in control, is less well studied in datatic control paradigm. The difficulty is that traditional stability criteria rely on explicit system models, which are not available in those systems with datati…

    Submitted 30 January, 2024; originally announced January 2024.

  6. arXiv:2310.19022  [pdf, other]

    math.OC cs.LG eess.SY

    Optimization Landscape of Policy Gradient Methods for Discrete-time Static Output Feedback

    Authors: Jingliang Duan, Jie Li, Xuyang Chen, Kai Zhao, Shengbo Eben Li, Lin Zhao

    Abstract: In recent times, significant advancements have been made in delving into the optimization landscape of policy gradient methods for achieving optimal control in linear time-invariant (LTI) systems. Compared with state-feedback control, output-feedback control is more prevalent since the underlying state of the system may not be fully observed in many practical settings. This paper analyzes the opti…

    Submitted 29 October, 2023; originally announced October 2023.

    Journal ref: IEEE Transactions on Cybernetics, 2023

  7. arXiv:2310.05858  [pdf, other]

    cs.LG eess.SY

    DSAC-T: Distributional Soft Actor-Critic with Three Refinements

    Authors: Jingliang Duan, Wenxuan Wang, Liming Xiao, Jiaxin Gao, Shengbo Eben Li

    Abstract: Reinforcement learning (RL) has proven to be highly effective in tackling complex decision-making and control tasks. However, prevalent model-free RL methods often face severe performance degradation due to the well-known overestimation issue. In response to this problem, we recently introduced an off-policy RL algorithm, called distributional soft actor-critic (DSAC or DSAC-v1), which can effecti…

    Submitted 28 December, 2023; v1 submitted 9 October, 2023; originally announced October 2023.

  8. arXiv:2309.09734  [pdf, other]

    eess.SY

    Learning Optimal Robust Control of Connected Vehicles in Mixed Traffic Flow

    Authors: Jie Li, Jiawei Wang, Shengbo Eben Li, Keqiang Li

    Abstract: Connected and automated vehicles (CAVs) technologies promise to attenuate undesired traffic disturbances. However, in mixed traffic where human-driven vehicles (HDVs) also exist, the nonlinear human-driving behavior has brought critical challenges for effective CAV control. This paper employs the policy iteration method to learn the optimal robust controller for nonlinear mixed traffic systems. Pr…

    Submitted 18 September, 2023; originally announced September 2023.

  9. arXiv:2304.08845  [pdf, other]

    cs.LG eess.SY

    Feasible Policy Iteration

    Authors: Yujie Yang, Zhilong Zheng, Shengbo Eben Li, Jingliang Duan, Jingjing Liu, Xianyuan Zhan, Ya-Qin Zhang

    Abstract: Safe reinforcement learning (RL) aims to find the optimal policy and its feasible region in a constrained optimal control problem (OCP). Ensuring feasibility and optimality simultaneously has been a major challenge. Existing methods either attempt to solve OCPs directly with constrained optimization algorithms, leading to unstable training processes and unsatisfactory feasibility, or restrict poli…

    Submitted 28 January, 2024; v1 submitted 18 April, 2023; originally announced April 2023.

  10. arXiv:2210.07553  [pdf, other]

    cs.RO cs.LG eess.SY

    Safe Model-Based Reinforcement Learning with an Uncertainty-Aware Reachability Certificate

    Authors: Dongjie Yu, Wenjun Zou, Yujie Yang, Haitong Ma, Shengbo Eben Li, Jingliang Duan, Jianyu Chen

    Abstract: Safe reinforcement learning (RL) that solves constraint-satisfactory policies provides a promising way to the broader safety-critical applications of RL in real-world problems such as robotics. Among all safe RL approaches, model-based methods reduce training time violations further due to their high sample efficiency. However, lacking safety robustness against the model uncertainties remains an i…

    Submitted 14 October, 2022; originally announced October 2022.

    Comments: 12 pages, 6 figures

  11. arXiv:2210.02166  [pdf, other]

    eess.SY

    Robust Bayesian Inference for Moving Horizon Estimation

    Authors: Wenhan Cao, Chang Liu, Zhiqian Lan, Shengbo Eben Li, Wei Pan, Angelo Alessandri

    Abstract: The accuracy of moving horizon estimation (MHE) suffers significantly in the presence of measurement outliers. Existing methods address this issue by treating measurements leading to large MHE cost function values as outliers, which are subsequently discarded. This strategy, achieved through solving combinatorial optimization problems, is confined to linear systems to guarantee computational tract…

    Submitted 2 October, 2023; v1 submitted 5 October, 2022; originally announced October 2022.

    Comments: 17 pages

  12. arXiv:2209.04854  [pdf, other]

    eess.SY cs.LG

    Performance-Driven Controller Tuning via Derivative-Free Reinforcement Learning

    Authors: Yuheng Lei, Jianyu Chen, Shengbo Eben Li, Sifa Zheng

    Abstract: Choosing an appropriate parameter set for the designed controller is critical for the final performance but usually requires a tedious and careful tuning process, which implies a strong need for automatic tuning methods. However, among existing methods, derivative-free ones suffer from poor scalability or low efficiency, while gradient-based ones are often unavailable due to possibly non-different…

    Submitted 11 September, 2022; originally announced September 2022.

    Comments: Accepted by the 61st IEEE Conference on Decision and Control (CDC), 2022. Copyright @IEEE

  13. arXiv:2204.04403  [pdf, other]

    cs.RO eess.SY

    Improve Generalization of Driving Policy at Signalized Intersections with Adversarial Learning

    Authors: Yangang Ren, Guojian Zhan, Liye Tang, Shengbo Eben Li, Jianhua Jiang, Jingliang Duan

    Abstract: Intersections are quite challenging among various driving scenes wherein the interaction of signal lights and distinct traffic actors poses great difficulty to learn a wise and robust driving policy. Current research rarely considers the diversity of intersections and stochastic behaviors of traffic participants. For practical applications, the randomness usually leads to some devastating events,…

    Submitted 9 April, 2022; originally announced April 2022.

  14. arXiv:2204.02857  [pdf, other]

    eess.SY

    Primal-dual Estimator Learning: an Offline Constrained Moving Horizon Estimation Method with Feasibility and Near-optimality Guarantees

    Authors: Wenhan Cao, Jingliang Duan, Shengbo Eben Li, Chen Chen, Chang Liu, Yu Wang

    Abstract: This paper proposes a primal-dual framework to learn a stable estimator for linear constrained estimation problems leveraging the moving horizon approach. To avoid the online computational burden in most existing methods, we learn a parameterized function offline to approximate the primal estimate. Meanwhile, a dual estimator is trained to check the suboptimality of the primal estimator during exe…

    Submitted 6 April, 2022; originally announced April 2022.

  15. arXiv:2201.12518  [pdf, other]

    cs.LG cs.AI eess.SY

    Zeroth-Order Actor-Critic

    Authors: Yuheng Lei, Jianyu Chen, Shengbo Eben Li, Sifa Zheng

    Abstract: The recent advanced evolution-based zeroth-order optimization methods and the policy gradient-based first-order methods are two promising alternatives to solve reinforcement learning (RL) problems with complementary advantages. The former methods work with arbitrary policies, drive state-dependent and temporally-extended exploration, possess robustness-seeking property, but suffer from high sample…

    Submitted 11 June, 2022; v1 submitted 29 January, 2022; originally announced January 2022.

  16. arXiv:2111.12953  [pdf, other]

    cs.LG cs.AI cs.RO eess.SY

    Learn Zero-Constraint-Violation Policy in Model-Free Constrained Reinforcement Learning

    Authors: Haitong Ma, Changliu Liu, Shengbo Eben Li, Sifa Zheng, Wenchao Sun, Jianyu Chen

    Abstract: In the trial-and-error mechanism of reinforcement learning (RL), a notorious contradiction arises when we expect to learn a safe policy: how to learn a safe policy without enough data and prior model about the dangerous region? Existing methods mostly use the posterior penalty for dangerous actions, which means that the agent is not penalized until experiencing danger. This fact causes that the ag…

    Submitted 25 November, 2021; originally announced November 2021.

  17. Optimization Landscape of Gradient Descent for Discrete-time Static Output Feedback

    Authors: Jingliang Duan, Jie Li, Shengbo Eben Li, Lin Zhao

    Abstract: In this paper, we analyze the optimization landscape of gradient descent methods for static output feedback (SOF) control of discrete-time linear time-invariant systems with quadratic cost. The SOF setting can be quite common, for example, when there are unmodeled hidden states in the underlying process. We first establish several important properties of the SOF cost function, including coercivity…

    Submitted 10 March, 2022; v1 submitted 27 September, 2021; originally announced September 2021.

    Journal ref: 2022 American Control Conference (ACC)

  18. arXiv:2109.05540  [pdf, other]

    cs.RO eess.SY

    Encoding Distributional Soft Actor-Critic for Autonomous Driving in Multi-lane Scenarios

    Authors: Jingliang Duan, Yangang Ren, Fawang Zhang, Yang Guan, Dongjie Yu, Shengbo Eben Li, Bo Cheng, Lin Zhao

    Abstract: In this paper, we propose a new reinforcement learning (RL) algorithm, called encoding distributional soft actor-critic (E-DSAC), for decision-making in autonomous driving. Unlike existing RL-based decision-making methods, E-DSAC is suitable for situations where the number of surrounding vehicles is variable and eliminates the requirement for manually pre-designed sorting rules, resulting in highe…

    Submitted 12 September, 2021; originally announced September 2021.

  19. Integrated Decision and Control at Multi-Lane Intersections with Mixed Traffic Flow

    Authors: Jianhua Jiang, Yangang Ren, Yang Guan, Shengbo Eben Li, Yuming Yin, ** **

    Abstract: Autonomous driving at intersections is one of the most complicated and accident-prone traffic scenarios, especially with mixed traffic participants such as vehicles, bicycles and pedestrians. The driving policy should make safe decisions to handle the dynamic traffic conditions and meet the requirements of on-board computation. However, most current research focuses on simplified intersec…

    Submitted 30 August, 2021; originally announced August 2021.

    Comments: 8 pages, 10 figures, 11 equations and 14 conferences

  20. arXiv:2108.11623  [pdf, other]

    cs.LG cs.RO eess.SY

    Model-based Chance-Constrained Reinforcement Learning via Separated Proportional-Integral Lagrangian

    Authors: Baiyu Peng, Jingliang Duan, Jianyu Chen, Shengbo Eben Li, Genjin Xie, Congsheng Zhang, Yang Guan, Yao Mu, Enxin Sun

    Abstract: Safety is essential for reinforcement learning (RL) applied in the real world. Adding chance constraints (or probabilistic constraints) is a suitable way to enhance RL safety under uncertainty. Existing chance-constrained RL methods like the penalty methods and the Lagrangian methods either exhibit periodic oscillations or learn an over-conservative or unsafe policy. In this paper, we address thes…

    Submitted 26 August, 2021; originally announced August 2021.

  21. Fixed-Dimensional and Permutation Invariant State Representation of Autonomous Driving

    Authors: Jingliang Duan, Dongjie Yu, Shengbo Eben Li, Wenxuan Wang, Yangang Ren, Ziyu Lin, Bo Cheng

    Abstract: In this paper, we propose a new state representation method, called encoding sum and concatenation (ESC), for the state representation of decision-making in autonomous driving. Unlike existing state representation methods, ESC is applicable to a variable number of surrounding vehicles and eliminates the need for manually pre-designed sorting rules, leading to higher representation ability and gene…

    Submitted 4 March, 2022; v1 submitted 24 May, 2021; originally announced May 2021.

    Journal ref: IEEE Transactions on Intelligent Transportation Systems, 2021

  22. arXiv:2103.05505  [pdf]

    eess.SY cs.LG

    Approximate Optimal Filter for Linear Gaussian Time-invariant Systems

    Authors: Kaiming Tang, Shengbo Eben Li, Yuming Yin, Yang Guan, Jingliang Duan, Wenhan Cao, Jie Li

    Abstract: State estimation is critical to control systems, especially when the states cannot be directly measured. This paper presents an approximate optimal filter, which enables to use policy iteration technique to obtain the steady-state gain in linear Gaussian time-invariant systems. This design transforms the optimal filtering problem with minimum mean square error into an optimal control problem, call…

    Submitted 9 March, 2021; originally announced March 2021.

  23. arXiv:2102.13304  [pdf]

    eess.SY

    Feasibility Enhancement of Constrained Receding Horizon Control Using Generalized Control Barrier Function

    Authors: Haitong Ma, Xiangteng Zhang, Shengbo Eben Li, Ziyu Lin, Yao Lyu, Sifa Zheng

    Abstract: Receding horizon control (RHC) is a popular procedure to deal with optimal control problems. Due to the existence of state constraints, optimization-based RHC often suffers the notorious issue of infeasibility, which strongly shrinks the region of controllable state. This paper proposes a generalized control barrier function (CBF) to enlarge the feasible region of constrained RHC with only a one-s…

    Submitted 26 February, 2021; originally announced February 2021.

  24. arXiv:2102.11736  [pdf, other]

    eess.SY cs.AI

    Recurrent Model Predictive Control

    Authors: Zhengyu Liu, Jingliang Duan, Wenxuan Wang, Shengbo Eben Li, Yuming Yin, Ziyu Lin, Qi Sun, Bo Cheng

    Abstract: This paper proposes an off-line algorithm, called Recurrent Model Predictive Control (RMPC), to solve general nonlinear finite-horizon optimal control problems. Unlike traditional Model Predictive Control (MPC) algorithms, it can make full use of the current computing resources and adaptively select the longest model prediction horizon. Our algorithm employs a recurrent function to approximate the…

    Submitted 23 February, 2021; originally announced February 2021.

    Comments: arXiv admin note: substantial text overlap with arXiv:2102.10289

  25. Recurrent Model Predictive Control: Learning an Explicit Recurrent Controller for Nonlinear Systems

    Authors: Zhengyu Liu, Jingliang Duan, Wenxuan Wang, Shengbo Eben Li, Yuming Yin, Ziyu Lin, Bo Cheng

    Abstract: This paper proposes an offline control algorithm, called Recurrent Model Predictive Control (RMPC), to solve large-scale nonlinear finite-horizon optimal control problems. It can be regarded as an explicit solver of traditional Model Predictive Control (MPC) algorithms, which can adaptively select appropriate model prediction horizon according to current computing resources, so as to improve the p…

    Submitted 8 April, 2022; v1 submitted 20 February, 2021; originally announced February 2021.

    Journal ref: IEEE Transactions on Industrial Electronics, 2022

  26. arXiv:2102.08539  [pdf, other]

    cs.LG cs.AI eess.SY

    Separated Proportional-Integral Lagrangian for Chance Constrained Reinforcement Learning

    Authors: Baiyu Peng, Yao Mu, Jingliang Duan, Yang Guan, Shengbo Eben Li, Jianyu Chen

    Abstract: Safety is essential for reinforcement learning (RL) applied in real-world tasks like autonomous driving. Chance constraints which guarantee the satisfaction of state constraints at a high probability are suitable to represent the requirements in real-world environment with uncertainty. Existing chance constrained RL methods like the penalty method and the Lagrangian method either exhibit periodic…

    Submitted 16 February, 2021; originally announced February 2021.

  27. arXiv:2102.08072  [pdf, other]

    cs.LG cs.AI cs.RO eess.SY

    Steadily Learn to Drive with Virtual Memory

    Authors: Yuhang Zhang, Yao Mu, Yujie Yang, Yang Guan, Shengbo Eben Li, Qi Sun, Jianyu Chen

    Abstract: Reinforcement learning has shown great potential in developing high-level autonomous driving. However, for high-dimensional tasks, current RL methods suffer from low data efficiency and oscillation in the training process. This paper proposes an algorithm called Learn to drive with Virtual Memory (LVM) to overcome these problems. LVM compresses the high-dimensional information into compact latent…

    Submitted 16 February, 2021; originally announced February 2021.

    Comments: Submitted to the 32nd IEEE Intelligent Vehicles Symposium

  28. arXiv:2012.10716  [pdf, other]

    cs.LG cs.AI eess.SY

    Model-Based Actor-Critic with Chance Constraint for Stochastic System

    Authors: Baiyu Peng, Yao Mu, Yang Guan, Shengbo Eben Li, Yuming Yin, Jianyu Chen

    Abstract: Safety is essential for reinforcement learning (RL) applied in real-world situations. Chance constraints are suitable to represent the safety requirements in stochastic systems. Previous chance-constrained RL methods usually have a low convergence rate, or only learn a conservative policy. In this paper, we propose a model-based chance constrained actor-critic (CCAC) algorithm which can efficientl…

    Submitted 16 March, 2021; v1 submitted 19 December, 2020; originally announced December 2020.

  29. arXiv:2011.09612  [pdf, other]

    eess.SY

    Numerically Stable Dynamic Bicycle Model for Discrete-time Control

    Authors: Qiang Ge, Shengbo Eben Li, Qi Sun, Sifa Zheng

    Abstract: Dynamic/kinematic model is of great significance in decision and control of intelligent vehicles. However, due to the singularity of dynamic models at low speed, kinematic models have been the only choice under many driving scenarios. This paper presents a discrete dynamic bicycle model feasible at any low speed utilizing the concept of backward Euler method. We further give a sufficient condition…

    Submitted 18 November, 2020; originally announced November 2020.

    Comments: 6 pages, 7 figures, conference

  30. arXiv:2008.13081  [pdf, other]

    eess.SY

    Centralized Coordination of Connected Vehicles at Intersections using Graphical Mixed Integer Optimization

    Authors: Qiang Ge, Qi Sun, Zhen Wang, Shengbo Eben Li, Ziqing Gu, Sifa Zheng

    Abstract: This paper proposes a centralized multi-vehicle coordination scheme serving unsignalized intersections. The whole process consists of three stages: a) target velocity optimization: formulate the collision-free vehicle coordination as a Mixed Integer Linear Programming (MILP) problem, with each incoming lane representing an independent variable; b) dynamic vehicle selection: build a directed graph…

    Submitted 29 August, 2020; originally announced August 2020.

    Comments: 6 pages, 9 figures, conference

  31. arXiv:2008.00674  [pdf, other]

    eess.SY math.OC

    Reinforcement Solver for H-infinity Filter with Bounded Noise

    Authors: Jie Li, Shengbo Eben Li, Kaiming Tang, Yao Lv, Wenhan Cao

    Abstract: H-infinity filter has been widely applied in engineering field, but coping with bounded noise is still an open problem and difficult to solve. This paper considers the H-infinity filtering problem for linear system with bounded process and measurement noise. The problem is first formulated as a zero-sum game where the dynamic of estimation error is non-affine with respect to filter gain and measu…

    Submitted 3 August, 2020; originally announced August 2020.

  32. arXiv:2007.06810  [pdf]

    eess.SY cs.GT cs.LG

    Ternary Policy Iteration Algorithm for Nonlinear Robust Control

    Authors: Jie Li, Shengbo Eben Li, Yang Guan, Jingliang Duan, Wenyu Li, Yuming Yin

    Abstract: The uncertainties in plant dynamics remain a challenge for nonlinear control problems. This paper develops a ternary policy iteration (TPI) algorithm for solving nonlinear robust control problems with bounded uncertainties. The controller and uncertainty of the system are considered as game players, and the robust control problem is formulated as a two-player zero-sum differential game. In order t…

    Submitted 14 July, 2020; originally announced July 2020.

  33. arXiv:2007.02070  [pdf, other]

    eess.SY

    Continuous-time finite-horizon ADP for automated vehicle controller design with high efficiency

    Authors: Ziyu Lin, Jingliang Duan, Shengbo Eben Li, Haitong Ma, Yuming Yin

    Abstract: The design of an automated vehicle controller can be generally formulated into an optimal control problem. This paper proposes a continuous-time finite-horizon approximate dynamic programming (ADP) method, which can synthesize off-line near-optimal control policy with analytical vehicle dynamics. Lying on the general Policy Iteration framework, it employs value and policy neural networks to approxima…

    Submitted 4 July, 2020; originally announced July 2020.

    Comments: 7 pages,conference

  34. arXiv:2003.00848  [pdf, other]

    eess.SY cs.LG cs.RO stat.ML

    Mixed Reinforcement Learning with Additive Stochastic Uncertainty

    Authors: Yao Mu, Shengbo Eben Li, Chang Liu, Qi Sun, Bingbing Nie, Bo Cheng, Baiyu Peng

    Abstract: Reinforcement learning (RL) methods often rely on massive exploration data to search optimal policies, and suffer from poor sampling efficiency. This paper presents a mixed reinforcement learning (mixed RL) algorithm by simultaneously using dual representations of environmental dynamics to search the optimal policy with the purpose of improving both learning accuracy and training speed. The dual r…

    Submitted 28 February, 2020; originally announced March 2020.

  35. Hierarchical Reinforcement Learning for Self-Driving Decision-Making without Reliance on Labeled Driving Data

    Authors: Jingliang Duan, Shengbo Eben Li, Yang Guan, Qi Sun, Bo Cheng

    Abstract: Decision making for self-driving cars is usually tackled by manually encoding rules from drivers' behaviors or imitating drivers' manipulation using supervised learning techniques. Both of them rely on mass driving data to cover all possible driving scenarios. This paper presents a hierarchical reinforcement learning method for decision making of self-driving cars, which does not depend on a large…

    Submitted 27 January, 2020; originally announced January 2020.

    Journal ref: IET Intelligent Transport Systems, 2020, 14(5): 297-305

  36. arXiv:2001.02811  [pdf, other]

    cs.LG cs.AI eess.SY

    Distributional Soft Actor-Critic: Off-Policy Reinforcement Learning for Addressing Value Estimation Errors

    Authors: Jingliang Duan, Yang Guan, Shengbo Eben Li, Yangang Ren, Bo Cheng

    Abstract: In reinforcement learning (RL), function approximation errors are known to easily lead to the Q-value overestimations, thus greatly reducing policy performance. This paper presents a distributional soft actor-critic (DSAC) algorithm, which is an off-policy RL method for continuous control setting, to improve the policy performance by mitigating Q-value overestimations. We first discover in theory…

    Submitted 11 June, 2021; v1 submitted 8 January, 2020; originally announced January 2020.

    Journal ref: IEEE Transactions on Neural Networks and Learning Systems, 2021

  37. arXiv:1911.11397  [pdf, other]

    eess.SY cs.LG math.OC

    Adaptive dynamic programming for nonaffine nonlinear optimal control problem with state constraints

    Authors: Jingliang Duan, Zhengyu Liu, Shengbo Eben Li, Qi Sun, Zhenzhong Jia, Bo Cheng

    Abstract: This paper presents a constrained adaptive dynamic programming (CADP) algorithm to solve general nonlinear nonaffine optimal control problems with known dynamics. Unlike previous ADP algorithms, it can directly deal with problems with state constraints. Firstly, a constrained generalized policy iteration (CGPI) framework is developed to handle state constraints by transforming the traditional poli…

    Submitted 8 April, 2022; v1 submitted 26 November, 2019; originally announced November 2019.

    Journal ref: Neurocomputing 484 (2022) 128-141

  38. Relaxed Actor-Critic with Convergence Guarantees for Continuous-Time Optimal Control of Nonlinear Systems

    Authors: Jingliang Duan, Jie Li, Qiang Ge, Shengbo Eben Li, Monimoy Bujarbaruah, Fei Ma, Dezhao Zhang

    Abstract: This paper presents the Relaxed Continuous-Time Actor-critic (RCTAC) algorithm, a method for finding the nearly optimal policy for nonlinear continuous-time (CT) systems with known dynamics and infinite horizon, such as the path-tracking control of vehicles. RCTAC has several advantages over existing adaptive dynamic programming algorithms for CT systems. It does not require the "admissibility" o…

    Submitted 30 March, 2023; v1 submitted 11 September, 2019; originally announced September 2019.

    Journal ref: IEEE Transactions on Intelligent Vehicles, 2023 (Early Access)

  39. arXiv:1807.11874  [pdf, ps, other]

    eess.SY

    Parallel Optimal Control for Cooperative Automation of Large-scale Connected Vehicles via ADMM

    Authors: Zhitao Wang, Yang Zheng, Shengbo Eben Li, Keyou You, Keqiang Li

    Abstract: This paper proposes a parallel optimization algorithm for cooperative automation of large-scale connected vehicles. The task of cooperative automation is formulated as a centralized optimization problem taking the whole decision space of all vehicles into account. Considering the uncertainty of the environment, the problem is solved in a receding horizon fashion. Then, we employ the alternating di…

    Submitted 31 July, 2018; originally announced July 2018.