An Actor Critic Method for Free Terminal Time Optimal Control

Authors: Evan Burton, Tenavi Nakamura-Zimmerer, Qi Gong, Wei Kang

Abstract: Optimal control problems with free terminal time present many challenges including nonsmooth and discontinuous control laws, irregular value functions, many local optima, and the curse of dimensionality. To overcome these issues, we propose an adaptation of the model-based actor-critic paradigm from the field of Reinforcement Learning via an exponential transformation to learn an approximate feedback control and value function pair. We demonstrate the algorithm's effectiveness on prototypical examples featuring each of the main pathological issues present in problems of this type.

Submitted 4 August, 2022; v1 submitted 29 July, 2022; originally announced August 2022.

arXiv:2205.00394 [pdf]

doi 10.1109/OJCSYS.2022.3205863

Neural Network Optimal Feedback Control with Guaranteed Local Stability

Authors: Tenavi Nakamura-Zimmerer, Qi Gong, Wei Kang

Abstract: Recent research shows that supervised learning can be an effective tool for designing near-optimal feedback controllers for high-dimensional nonlinear dynamic systems. But the behavior of neural network controllers is still not well understood. In particular, some neural networks with high test accuracy can fail to even locally stabilize the dynamic system. To address this challenge we propose several novel neural network architectures, which we show guarantee local asymptotic stability while retaining the approximation capacity to learn the optimal feedback policy semi-globally. The proposed architectures are compared against standard neural network feedback controllers through numerical simulations of two high-dimensional nonlinear optimal control problems: stabilization of an unstable Burgers-type partial differential equation, and altitude and course tracking for an unmanned aerial vehicle. The simulations demonstrate that standard neural networks can fail to stabilize the dynamics even when trained well, while the proposed architectures are always at least locally stabilizing and can achieve near-optimal performance.

Submitted 6 October, 2022; v1 submitted 1 May, 2022; originally announced May 2022.

Comments: arXiv admin note: text overlap with arXiv:2109.07466

Journal ref: IEEE Open Journal of Control Systems, 1 (2022) 210-222

arXiv:2109.07466 [pdf, ps, other]

doi 10.23919/ACC53348.2022.9867619

Neural network optimal feedback control with enhanced closed loop stability

Authors: Tenavi Nakamura-Zimmerer, Qi Gong, Wei Kang

Abstract: Recent research has shown that supervised learning can be an effective tool for designing optimal feedback controllers for high-dimensional nonlinear dynamic systems. But the behavior of these neural network (NN) controllers is still not well understood. In this paper we use numerical simulations to demonstrate that typical test accuracy metrics do not effectively capture the ability of an NN controller to stabilize a system. In particular, some NNs with high test accuracy can fail to stabilize the dynamics. To address this we propose two NN architectures which locally approximate a linear quadratic regulator (LQR). Numerical simulations confirm our intuition that the proposed architectures reliably produce stabilizing feedback controllers without sacrificing optimality. In addition, we introduce a preliminary theoretical result describing some stability properties of such NN-controlled systems.

Submitted 17 November, 2021; v1 submitted 15 September, 2021; originally announced September 2021.

Report number: American Control Conference (2022) 2373-2378

arXiv:2009.05686 [pdf]

doi 10.1109/LCSYS.2020.3034415

QRnet: optimal regulator design with LQR-augmented neural networks

Authors: Tenavi Nakamura-Zimmerer, Qi Gong, Wei Kang

Abstract: In this paper we propose a new computational method for designing optimal regulators for high-dimensional nonlinear systems. The proposed approach leverages physics-informed machine learning to solve high-dimensional Hamilton-Jacobi-Bellman equations arising in optimal feedback control. Concretely, we augment linear quadratic regulators with neural networks to handle nonlinearities. We train the augmented models on data generated without discretizing the state space, enabling application to high-dimensional problems. We use the proposed method to design a candidate optimal regulator for an unstable Burgers' equation, and through this example, demonstrate improved robustness and accuracy compared to existing neural network formulations.

Submitted 16 November, 2020; v1 submitted 11 September, 2020; originally announced September 2020.

Comments: Added IEEE accepted manuscript with copyright notice

Journal ref: IEEE Control Systems Letters 5 (2021) 1303-1308

arXiv:1912.00492 [pdf, ps, other]

doi 10.1016/j.physd.2021.132955

Algorithms of Data Development For Deep Learning and Feedback Design

Authors: Wei Kang, Qi Gong, Tenavi Nakamura-Zimmerer

Abstract: Recent research reveals that deep learning is an effective way of solving high dimensional Hamilton-Jacobi-Bellman equations. The resulting feedback control law in the form of a neural network is computationally efficient for real-time applications of optimal control. A critical part of this design method is to generate data for training the neural network and validating its accuracy. In this paper, we provide a survey of existing algorithms that can be used to generate data. All the algorithms surveyed in this paper are causality-free, i.e., the solution at a point is computed without using the value of the function at any other points. At the end of the paper, an illustrative example of optimal feedback design using deep learning is given.

Submitted 28 January, 2020; v1 submitted 1 December, 2019; originally announced December 2019.

Comments: 15 pages, 1 figure

Journal ref: Physica D: Nonlinear Phenomena 425 (2021) 132955

arXiv:1911.09311 [pdf, ps, other]

Density Propagation with Characteristics-based Deep Learning

Authors: Tenavi Nakamura-Zimmerer, Daniele Venturi, Qi Gong, Wei Kang

Abstract: Uncertainty propagation in nonlinear dynamic systems remains an outstanding problem in scientific computing and control. Numerous approaches have been developed, but are limited in their capability to tackle problems with more than a few uncertain variables or require large amounts of simulation data. In this paper, we propose a data-driven method for approximating joint probability density functions (PDFs) of nonlinear dynamic systems with initial condition and parameter uncertainty. Our approach leverages on the power of deep learning to deal with high-dimensional inputs, but we overcome the need for huge quantities of training data by encoding PDF evolution equations directly into the optimization problem. We demonstrate the potential of the proposed method by applying it to evaluate the robustness of a feedback controller for a six-dimensional rigid body with parameter uncertainty.

Submitted 21 November, 2019; originally announced November 2019.

Comments: This work has been submitted to IFAC for possible publication

arXiv:1907.05317 [pdf, ps, other]

doi 10.1137/19M1288802

Adaptive Deep Learning for High-Dimensional Hamilton-Jacobi-Bellman Equations

Authors: Tenavi Nakamura-Zimmerer, Qi Gong, Wei Kang

Abstract: Computing optimal feedback controls for nonlinear systems generally requires solving Hamilton-Jacobi-Bellman (HJB) equations, which are notoriously difficult when the state dimension is large. Existing strategies for high-dimensional problems often rely on specific, restrictive problem structures, or are valid only locally around some nominal trajectory. In this paper, we propose a data-driven method to approximate semi-global solutions to HJB equations for general high-dimensional nonlinear systems and compute candidate optimal feedback controls in real-time. To accomplish this, we model solutions to HJB equations with neural networks (NNs) trained on data generated without discretizing the state space. Training is made more effective and data-efficient by leveraging the known physics of the problem and using the partially-trained NN to aid in adaptive data generation. We demonstrate the effectiveness of our method by learning solutions to HJB equations corresponding to the attitude control of a six-dimensional nonlinear rigid body, and nonlinear systems of dimension up to 30 arising from the stabilization of a Burgers'-type partial differential equation. The trained NNs are then used for real-time feedback control of these systems.

Submitted 8 February, 2021; v1 submitted 11 July, 2019; originally announced July 2019.

Comments: Added section on validation error computation. Updated convergence test formula and associated results

Journal ref: SIAM Journal on Scientific Computing 43 (2021) A1221-A1247

Showing 1–7 of 7 results for author: Nakamura-Zimmerer, T