-
Machine learning based state observer for discrete time systems evolving on Lie groups
Authors:
Soham Shanbhag,
Dong Eui Chang
Abstract:
In this paper, a machine learning based observer for systems evolving on manifolds is designed such that the state of the observer is restricted to the Lie group on which the system evolves. Conventional techniques involving machine learning based observers on systems evolving on Lie groups involve designing charts for the Lie group, training a machine learning based observer for each chart, and switching between the trained models based on the state of the system. We propose a novel deep learning based technique whose predictions are restricted to a measure 0 subset of Euclidean space without using charts. Using this network, we design an observer ensuring that the state of the observer is restricted to the Lie group, and predicting the state using only one trained algorithm. The deep learning network predicts an "error term" on the Lie algebra of the Lie group, uses the map from the Lie algebra to the group, and uses the group action and the present state to estimate the state at the next epoch. This model, being purely data driven, does not require a model of the system. The proposed algorithm provides a novel framework for constraining the output of machine learning networks to a measure 0 subset of a Euclidean space without chart specific training and without requiring switching. We show the validity of this method using Monte Carlo simulations of the rigid body rotation and translation system.
Submitted 20 January, 2024;
originally announced January 2024.
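The update described in this abstract (predict an error term in the Lie algebra, map it to the group, then act on the present state) can be sketched for the rotation group SO(3). This is a minimal illustration and not the authors' implementation: the names `hat`, `exp_so3` and `observer_step` are hypothetical, the trained network is replaced by a hand-picked vector `xi`, and right multiplication by the exponential is an assumed convention.

```python
import math

def hat(w):
    """Map a 3-vector to the corresponding skew-symmetric matrix in so(3)."""
    x, y, z = w
    return [[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]]

def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def exp_so3(w):
    """Rodrigues' formula: exp(K) = I + (sin t / t) K + ((1 - cos t) / t^2) K^2."""
    theta = math.sqrt(sum(c * c for c in w))
    K = hat(w)
    K2 = matmul(K, K)
    if theta < 1e-8:
        a, b = 1.0, 0.5  # limits of the two coefficients as theta -> 0
    else:
        a = math.sin(theta) / theta
        b = (1.0 - math.cos(theta)) / theta ** 2
    return [[(1.0 if i == j else 0.0) + a * K[i][j] + b * K2[i][j]
             for j in range(3)] for i in range(3)]

def observer_step(R, xi):
    """Next estimate: present state composed with exp of the predicted error term.

    xi stands in for the network output (an element of the Lie algebra, here R^3).
    Because exp_so3(xi) is a rotation, the result stays on SO(3) for any xi.
    """
    return matmul(R, exp_so3(xi))
```

Whatever vector the predictor outputs, the estimate remains a rotation matrix, which is the point of predicting in the Lie algebra rather than directly in the ambient matrix space.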
-
Unscented Kalman filter with stable embedding for simple, accurate and computationally efficient state estimation of systems on manifolds in Euclidean space
Authors:
Jae-Hyeon Park,
Dong Eui Chang
Abstract:
This paper proposes a simple, accurate and computationally efficient method to apply the ordinary unscented Kalman filter developed in Euclidean space to systems whose dynamics evolve on manifolds. We use the mathematical theory called stable embedding to make a variant of unscented Kalman filter that keeps state estimates in close proximity to the manifold while exhibiting excellent estimation performance. We confirm the performance of our devised filter by applying it to the satellite system model and comparing the performance with other unscented Kalman filters devised specifically for systems on manifolds. Our devised filter has a low estimation error, keeps the state estimates in close proximity to the manifold as expected, and consumes a minor amount of computation time. Also, our devised filter is simple and easy to use because it directly employs the off-the-shelf standard unscented Kalman filter devised in Euclidean space without any particular manifold-structure-preserving discretization method or coordinate transformation.
Submitted 30 November, 2022; v1 submitted 22 August, 2022;
originally announced August 2022.
-
Feedback Gradient Descent: Efficient and Stable Optimization with Orthogonality for DNNs
Authors:
Fanchen Bu,
Dong Eui Chang
Abstract:
The optimization with orthogonality has been shown useful in training deep neural networks (DNNs). To impose orthogonality on DNNs, both computational efficiency and stability are important. However, existing methods utilizing Riemannian optimization or hard constraints can only ensure stability while those using soft constraints can only improve efficiency. In this paper, we propose a novel method, named Feedback Gradient Descent (FGD), to our knowledge, the first work showing high efficiency and stability simultaneously. FGD induces orthogonality based on the simple yet indispensable Euler discretization of a continuous-time dynamical system on the tangent bundle of the Stiefel manifold. In particular, inspired by a numerical integration method on manifolds called Feedback Integrators, we propose to instantiate it on the tangent bundle of the Stiefel manifold for the first time. In the extensive image classification experiments, FGD comprehensively outperforms the existing state-of-the-art methods in terms of accuracy, efficiency, and stability.
Submitted 11 May, 2022;
originally announced May 2022.
-
Robust Navigation for Racing Drones based on Imitation Learning and Modularization
Authors:
Tianqi Wang,
Dong Eui Chang
Abstract:
This paper presents a vision-based modularized drone racing navigation system that uses a customized convolutional neural network (CNN) for the perception module to produce high-level navigation commands and then leverages a state-of-the-art planner and controller to generate low-level control commands, thus exploiting the advantages of both data-based and model-based approaches. Unlike the state-of-the-art method which only takes the current camera image as the CNN input, we further add the latest three drone states as part of the inputs. Our method outperforms the state-of-the-art method in various track layouts and offers two switchable navigation behaviors with a single trained network. The CNN-based perception module is trained to imitate an expert policy that automatically generates ground truth navigation commands based on the pre-computed global trajectories. Owing to the extensive randomization and our modified dataset aggregation (DAgger) policy during data collection, our navigation system, which is purely trained in simulation with synthetic textures, successfully operates in environments with randomly-chosen photorealistic textures without further fine-tuning.
Submitted 26 May, 2021;
originally announced May 2021.
-
The Adaptive Dynamic Programming Toolbox
Authors:
Xiaowei Xing,
Dong Eui Chang
Abstract:
The paper develops the Adaptive Dynamic Programming Toolbox (ADPT), which solves optimal control problems for continuous-time nonlinear systems. Based on the adaptive dynamic programming technique, the ADPT computes optimal feedback controls from the system dynamics in the model-based working mode, or from measurements of trajectories of the system in the model-free working mode without the requirement of knowledge of the system model. Multiple options are provided such that the ADPT can accommodate various customized circumstances. Compared to other popular software toolboxes for optimal control, the ADPT enjoys its computational precision and speed, which is illustrated with its applications to a satellite attitude control problem.
Submitted 29 December, 2020;
originally announced December 2020.
-
Double Prioritized State Recycled Experience Replay
Authors:
Fanchen Bu,
Dong Eui Chang
Abstract:
Experience replay enables online reinforcement learning agents to store and reuse the previous experiences of interacting with the environment. In the original method, the experiences are sampled and replayed uniformly at random. A prior work called prioritized experience replay was developed where experiences are prioritized, so as to replay experiences seeming to be more important more frequently. In this paper, we develop a method called double-prioritized state-recycled (DPSR) experience replay, prioritizing the experiences in both training stage and storing stage, as well as replacing the experiences in the memory with state recycling to make the best of experiences that seem to have low priorities temporarily. We used this method in Deep Q-Networks (DQN), and achieved a state-of-the-art result, outperforming the original method and prioritized experience replay on many Atari games.
Submitted 21 September, 2020; v1 submitted 8 July, 2020;
originally announced July 2020.
-
Deep Reinforcement Learning Based Robot Arm Manipulation with Efficient Training Data through Simulation
Authors:
Xiaowei Xing,
Dong Eui Chang
Abstract:
Deep reinforcement learning trains neural networks using experiences sampled from the replay buffer, which is commonly updated at each time step. In this paper, we propose a method to update the replay buffer adaptively and selectively to train a robot arm to accomplish a suction task in simulation. The response time of the agent is thoroughly taken into account. The state transitions that remain stuck at the boundary of constraint are not stored. The policy trained with our method works better than the one with the common replay buffer update method. The result is demonstrated both by simulation and by experiment with a real robot arm.
Submitted 5 September, 2019; v1 submitted 16 July, 2019;
originally announced July 2019.
-
Improved Reinforcement Learning through Imitation Learning Pretraining Towards Image-based Autonomous Driving
Authors:
Tianqi Wang,
Dong Eui Chang
Abstract:
We present a training pipeline for the autonomous driving task given the current camera image and vehicle speed as the input to produce the throttle, brake, and steering control output. The simulator Airsim's convenient weather and lighting API provides sufficient diversity during training, which can be very helpful to increase the trained policy's robustness. In order not to limit the policy's possible performance, we use a continuous and deterministic control policy setting. We utilize ResNet-34 as our actor and critic networks with some slight changes in the fully connected layers. Considering humans' mastery of this task and its high complexity, we first use imitation learning to mimic the given human policy and then carry the trained policy and its weights over to the reinforcement learning phase, for which we use DDPG. This combination shows a considerable performance boost compared to both pure imitation learning and pure DDPG for the autonomous driving task.
Submitted 16 July, 2019;
originally announced July 2019.
-
A Dual Memory Structure for Efficient Use of Replay Memory in Deep Reinforcement Learning
Authors:
Wonshick Ko,
Dong Eui Chang
Abstract:
In this paper, we propose a dual memory structure for reinforcement learning algorithms with replay memory. The dual memory consists of a main memory that stores various data and a cache memory that manages the data and trains the reinforcement learning agent efficiently. Experimental results show that the dual memory structure achieves higher training and test scores than the conventional single memory structure in three selected environments of OpenAI Gym. This implies that the dual memory structure enables better and more efficient training than the single memory structure.
Submitted 15 July, 2019;
originally announced July 2019.
-
Robotic Navigation using Entropy-Based Exploration
Authors:
Muhammad Usama,
Dong Eui Chang
Abstract:
Robotic navigation concerns the task in which a robot should be able to find a safe and feasible path and traverse between two points in a complex environment. We approach the problem of robotic navigation using reinforcement learning and use deep $Q$-networks to train agents to solve the task of robotic navigation. We compare the Entropy-Based Exploration (EBE) with the widely used $ε$-greedy exploration strategy by training agents using both of them in simulation. The trained agents are then tested on different versions of the environment to test the generalization ability of the learned policies. We also implement the learned policies on a real robot in a complex real environment without any fine tuning and compare the effectiveness of the above-mentioned exploration strategies in the real world setting. A video showing experiments on the TurtleBot3 platform is available at https://youtu.be/NHT-EiN_4n8.
Submitted 17 June, 2019;
originally announced June 2019.
-
Learning-Driven Exploration for Reinforcement Learning
Authors:
Muhammad Usama,
Dong Eui Chang
Abstract:
Effective and intelligent exploration has been an unresolved problem for reinforcement learning. Most contemporary reinforcement learning relies on simple heuristic strategies such as $ε$-greedy exploration or adding Gaussian noise to actions. These heuristics, however, are unable to intelligently distinguish the well explored and the unexplored regions of state space, which can lead to inefficient use of training time. We introduce entropy-based exploration (EBE) that enables an agent to explore efficiently the unexplored regions of state space. EBE quantifies the agent's learning in a state using merely state-dependent action values and adaptively explores the state space, i.e. more exploration for the unexplored region of the state space. We perform experiments on a diverse set of environments and demonstrate that EBE enables efficient exploration that ultimately results in faster learning without having to tune any hyperparameter.
The code to reproduce the experiments is given at https://github.com/Usama1002/EBE-Exploration and the supplementary video is given at https://youtu.be/nJggIjjzKic.
Submitted 16 October, 2020; v1 submitted 17 June, 2019;
originally announced June 2019.
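The idea of driving exploration from state-dependent action values alone can be illustrated with a small sketch. The abstract does not give the exact formula, so everything here is an assumption made for illustration: the action values are turned into a distribution with a softmax, its entropy is normalized to [0, 1], and that number is used as the probability of taking a random action. The function names `normalized_entropy` and `select_action` are hypothetical.

```python
import math
import random

def normalized_entropy(q_values):
    """Entropy of softmax(q_values), scaled to [0, 1]; needs at least 2 actions."""
    m = max(q_values)  # subtract the max for numerical stability
    exps = [math.exp(q - m) for q in q_values]
    total = sum(exps)
    probs = [e / total for e in exps]
    h = -sum(p * math.log(p) for p in probs if p > 0.0)
    return h / math.log(len(q_values))

def select_action(q_values, rng=random):
    """Explore with probability equal to the normalized entropy, else act greedily.

    Near-equal action values (little learned in this state) give entropy close
    to 1 and mostly random actions; well-separated values give mostly greedy ones.
    """
    if rng.random() < normalized_entropy(q_values):
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])
```

Unlike $ε$-greedy, this rule has no exploration schedule to tune: the amount of exploration in a state follows from the action values in that state.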
-
Enhancement of Energy-Based Swing-Up Controller via Entropy Search
Authors:
Chang Sik Lee,
Dong Eui Chang
Abstract:
An energy based approach for stabilizing a mechanical system has offered a simple yet powerful control scheme. However, since it does not impose such strong constraints on parameter space of the controller, finding appropriate parameter values for an optimal controller is known to be hard. This paper intends to generate an optimal energy-based controller for swinging up a rotary inverted pendulum, also known as the Furuta pendulum, by applying the Bayesian optimization called Entropy Search. Simulations and experiments show that the optimal controller has an improved performance compared to a nominal controller for various initial conditions.
Submitted 3 April, 2019; v1 submitted 2 April, 2019;
originally announced April 2019.
-
Interaction-aware Kalman Neural Networks for Trajectory Prediction
Authors:
Ce Ju,
Zheng Wang,
Cheng Long,
Xiaoyu Zhang,
Dong Eui Chang
Abstract:
Forecasting the motion of surrounding obstacles (vehicles, bicycles, pedestrians, etc.) benefits the on-road motion planning for intelligent and autonomous vehicles. Complex scenes always yield great challenges in modeling the patterns of surrounding traffic. For example, one main challenge comes from the intractable interaction effects in a complex traffic system. In this paper, we propose a multi-layer architecture, Interaction-aware Kalman Neural Networks (IaKNN), which involves an interaction layer for resolving high-dimensional traffic environmental observations as interaction-aware accelerations, a motion layer for transforming the accelerations to interaction-aware trajectories, and a filter layer for estimating future trajectories with a Kalman filter network. Attributed to the multiple traffic data sources, our end-to-end trainable approach technically fuses dynamic and interaction-aware trajectories, boosting the prediction performance. Experiments on the NGSIM dataset demonstrate that IaKNN outperforms the state-of-the-art methods in terms of effectiveness for traffic trajectory prediction.
Submitted 25 January, 2021; v1 submitted 28 February, 2019;
originally announced February 2019.
-
Towards Robust Neural Networks with Lipschitz Continuity
Authors:
Muhammad Usama,
Dong Eui Chang
Abstract:
Deep neural networks have shown remarkable performance across a wide range of vision-based tasks, particularly due to the availability of large-scale datasets for training and better architectures. However, data seen in the real world are often affected by distortions that are not accounted for by the training datasets. In this paper, we address the challenge of robustness and stability of neural networks and propose a general training method that can be used to make the existing neural network architectures more robust and stable to input visual perturbations while using only available datasets for training. The proposed training method is convenient to use as it does not require data augmentation or changes in the network architecture. We provide theoretical proof as well as empirical evidence for the efficiency of the proposed training method by performing experiments with existing neural network architectures, and demonstrate that the same architecture, when trained with the proposed training method, performs better than when trained with the conventional training approach in the presence of noisy datasets.
Submitted 21 November, 2018;
originally announced November 2018.
-
On Controller Design for Systems on Manifolds in Euclidean Space
Authors:
Dong Eui Chang
Abstract:
A new method is developed to design controllers in Euclidean space for systems defined on manifolds. The idea is to embed the state-space manifold $M$ of a given control system into some Euclidean space $\mathbb R^n$, extend the system from $M$ to the ambient space $\mathbb R^n$, and modify it outside $M$ to add transversal stability to $M$ in the final dynamics in $\mathbb R^n$. Controllers are designed for the final system in the ambient space $\mathbb R^n$. Then, their restriction to $M$ produces controllers for the original system on $M$. This method has the merit that only one single global Cartesian coordinate system in the ambient space $\mathbb R^n$ is used for controller synthesis, and any controller design method in $\mathbb R^n$, such as the linearization method, can be globally applied for the controller synthesis. The proposed method is successfully applied to the tracking problem for the following two benchmark systems: the fully actuated rigid body system and the quadcopter drone system.
Submitted 10 July, 2018;
originally announced July 2018.
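The embedding idea in this abstract (extend the dynamics from $M$ to $\mathbb R^n$, then modify them off $M$ so that $M$ becomes attractive) can be sketched on a toy example. The choices below are assumptions for illustration and are not taken from the paper: $M$ is the unit circle in $\mathbb R^2$, the on-manifold dynamics are a constant rotation, and transversal stability is added by subtracting $\alpha \nabla V$ with $V(x) = (|x|^2 - 1)^2 / 4$, which vanishes exactly on $M$. The names `extended_field` and `euler_integrate` are hypothetical.

```python
import math

def extended_field(x, omega=1.0, alpha=5.0):
    """Rotation at rate omega plus a correction that pulls x toward the unit circle."""
    x1, x2 = x
    v = x1 * x1 + x2 * x2 - 1.0  # zero exactly on the unit circle
    # grad V = v * x for V = (|x|^2 - 1)^2 / 4, so the correction is zero on M
    return (-omega * x2 - alpha * v * x1, omega * x1 - alpha * v * x2)

def euler_integrate(x0, dt=0.01, steps=1000, **field_kwargs):
    """Plain forward Euler in the ambient space, with no projection or charts."""
    x = x0
    for _ in range(steps):
        dx = extended_field(x, **field_kwargs)
        x = (x[0] + dt * dx[0], x[1] + dt * dx[1])
    return x

# Start off the manifold: the state is drawn back to the circle.
x_end = euler_integrate((1.5, 0.0))
# With alpha=0 the same integrator slowly drifts away from the circle instead.
x_drift = euler_integrate((1.0, 0.0), alpha=0.0)
```

On the circle the added term is zero, so the extended system restricts to the original one; off the circle it decays the distance to $M$, which is why an ordinary Euclidean integrator or controller can be used without coordinate charts.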
-
A Novel Representation of Neural Networks
Authors:
Anthony Caterini,
Dong Eui Chang
Abstract:
Deep Neural Networks (DNNs) have become very popular for prediction in many areas. Their strength is in representation with a high number of parameters that are commonly learned via gradient descent or similar optimization methods. However, the representation is non-standardized, and the gradient calculation methods are often performed using component-based approaches that break parameters down into scalar units, instead of considering the parameters as whole entities. In this work, these problems are addressed. Standard notation is used to represent DNNs in a compact framework. Gradients of DNN loss functions are calculated directly over the inner product space on which the parameters are defined. This framework is general and is applied to two common network types: the Multilayer Perceptron and the Deep Autoencoder.
Submitted 7 October, 2016; v1 submitted 5 October, 2016;
originally announced October 2016.
-
A Geometric Framework for Convolutional Neural Networks
Authors:
Anthony L. Caterini,
Dong Eui Chang
Abstract:
In this paper, a geometric framework for neural networks is proposed. This framework uses the inner product space structure underlying the parameter set to perform gradient descent not in a component-based form, but in a coordinate-free manner. Convolutional neural networks are described in this framework in a compact form, with the gradients of standard --- and higher-order --- loss functions calculated for each layer of the network. This approach can be applied to other network structures and provides a basis on which to create new networks.
Submitted 5 October, 2016; v1 submitted 15 August, 2016;
originally announced August 2016.
-
Lyapunov-based Low-thrust Optimal Orbit Transfer: An approach in Cartesian coordinates
Authors:
Hantian Zhang,
Dong Eui Chang,
Qingjie Cao
Abstract:
This paper presents a simple approach to low-thrust optimal-fuel and optimal-time transfer problems between two elliptic orbits using the Cartesian coordinates system. In this case, an orbit is described by its specific angular momentum and Laplace vectors with a free injection point. Trajectory optimization with the pseudospectral method and nonlinear programming are supported by the initial guess generated from the Chang-Chichka-Marsden Lyapunov-based transfer controller. This approach successfully solves several low-thrust optimal problems. Numerical results show that the Lyapunov-based initial guess overcomes the difficulty in optimization caused by the strong oscillation of variables in the Cartesian coordinates system. Furthermore, a comparison of the results shows that obtaining the optimal transfer solution through the polynomial approximation by utilizing Cartesian coordinates is easier than using orbital elements, which normally produce strongly nonlinear equations of motion. In this paper, the Earth's oblateness and shadow effect are not taken into account.
Submitted 15 October, 2013;
originally announced October 2013.