-
A Decentralized Policy Gradient Approach to Multi-task Reinforcement Learning
Authors:
Sihan Zeng,
Aqeel Anwar,
Thinh Doan,
Arijit Raychowdhury,
Justin Romberg
Abstract:
We develop a mathematical framework for solving multi-task reinforcement learning (MTRL) problems based on a type of policy gradient method. The goal in MTRL is to learn a common policy that operates effectively in different environments; these environments have similar (or overlap**) state spaces, but have different rewards and dynamics. We highlight two fundamental challenges in MTRL that are…
▽ More
We develop a mathematical framework for solving multi-task reinforcement learning (MTRL) problems based on a type of policy gradient method. The goal in MTRL is to learn a common policy that operates effectively in different environments; these environments have similar (or overlap**) state spaces, but have different rewards and dynamics. We highlight two fundamental challenges in MTRL that are not present in its single task counterpart, and illustrate them with simple examples. We then develop a decentralized entropy-regularized policy gradient method for solving the MTRL problem, and study its finite-time convergence rate. We demonstrate the effectiveness of the proposed method using a series of numerical experiments. These experiments range from small-scale "GridWorld" problems that readily demonstrate the trade-offs involved in multi-task learning to large-scale problems, where common policies are learned to navigate an airborne drone in multiple (simulated) environments.
△ Less
Submitted 27 May, 2021; v1 submitted 7 June, 2020;
originally announced June 2020.
-
Autonomous Navigation via Deep Reinforcement Learning for Resource Constraint Edge Nodes using Transfer Learning
Authors:
Aqeel Anwar,
Arijit Raychowdhury
Abstract:
Smart and agile drones are fast becoming ubiquitous at the edge of the cloud. The usage of these drones are constrained by their limited power and compute capability. In this paper, we present a Transfer Learning (TL) based approach to reduce on-board computation required to train a deep neural network for autonomous navigation via Deep Reinforcement Learning for a target algorithmic performance.…
▽ More
Smart and agile drones are fast becoming ubiquitous at the edge of the cloud. The usage of these drones are constrained by their limited power and compute capability. In this paper, we present a Transfer Learning (TL) based approach to reduce on-board computation required to train a deep neural network for autonomous navigation via Deep Reinforcement Learning for a target algorithmic performance. A library of 3D realistic meta-environments is manually designed using Unreal Gaming Engine and the network is trained end-to-end. These trained meta-weights are then used as initializers to the network in a test environment and fine-tuned for the last few fully connected layers. Variation in drone dynamics and environmental characteristics is carried out to show robustness of the approach. Using NVIDIA GPU profiler it was shown that the energy consumption and training latency is reduced by 3.7x and 1.8x respectively without significant degradation in the performance in terms of average distance traveled before crash i.e. Mean Safe Flight (MSF). The approach is also tested on a real environment using DJI Tello drone and similar results were reported.
△ Less
Submitted 12 October, 2019;
originally announced October 2019.
-
Direct Feedback Alignment with Sparse Connections for Local Learning
Authors:
Brian Crafton,
Abhinav Parihar,
Evan Gebhardt,
Arijit Raychowdhury
Abstract:
Recent advances in deep neural networks (DNNs) owe their success to training algorithms that use backpropagation and gradient-descent. Backpropagation, while highly effective on von Neumann architectures, becomes inefficient when scaling to large networks. Commonly referred to as the weight transport problem, each neuron's dependence on the weights and errors located deeper in the network require…
▽ More
Recent advances in deep neural networks (DNNs) owe their success to training algorithms that use backpropagation and gradient-descent. Backpropagation, while highly effective on von Neumann architectures, becomes inefficient when scaling to large networks. Commonly referred to as the weight transport problem, each neuron's dependence on the weights and errors located deeper in the network require exhaustive data movement which presents a key problem in enhancing the performance and energy-efficiency of machine-learning hardware. In this work, we propose a bio-plausible alternative to backpropagation drawing from advances in feedback alignment algorithms in which the error computation at a single synapse reduces to the product of three scalar values. Using a sparse feedback matrix, we show that a neuron needs only a fraction of the information previously used by the feedback alignment algorithms. Consequently, memory and compute can be partitioned and distributed whichever way produces the most efficient forward pass so long as a single error can be delivered to each neuron. Our results show orders of magnitude improvement in data movement and $2\times$ improvement in multiply-and-accumulate operations over backpropagation. Like previous work, we observe that any variant of feedback alignment suffers significant losses in classification accuracy on deep convolutional neural networks. By transferring trained convolutional layers and training the fully connected layers using direct feedback alignment, we demonstrate that direct feedback alignment can obtain results competitive with backpropagation. Furthermore, we observe that using an extremely sparse feedback matrix, rather than a dense one, results in a small accuracy drop while yielding hardware advantages. All the code and results are available under https://github.com/bcrafton/ssdfa.
△ Less
Submitted 9 May, 2019; v1 submitted 30 January, 2019;
originally announced March 2019.
-
Appearance-based Gesture recognition in the compressed domain
Authors:
Shaojie Xu,
Anvesha Amaravati,
Justin Romberg,
Arijit Raychowdhury
Abstract:
We propose a novel appearance-based gesture recognition algorithm using compressed domain signal processing techniques. Gesture features are extracted directly from the compressed measurements, which are the block averages and the coded linear combinations of the image sensor's pixel values. We also improve both the computational efficiency and the memory requirement of the previous DTW-based K-NN…
▽ More
We propose a novel appearance-based gesture recognition algorithm using compressed domain signal processing techniques. Gesture features are extracted directly from the compressed measurements, which are the block averages and the coded linear combinations of the image sensor's pixel values. We also improve both the computational efficiency and the memory requirement of the previous DTW-based K-NN gesture classifiers. Both simulation testing and hardware implementation strongly support the proposed algorithm.
△ Less
Submitted 19 February, 2019;
originally announced March 2019.
-
NAVREN-RL: Learning to fly in real environment via end-to-end deep reinforcement learning using monocular images
Authors:
Malik Aqeel Anwar,
Arijit Raychowdhury
Abstract:
We present NAVREN-RL, an approach to NAVigate an unmanned aerial vehicle in an indoor Real ENvironment via end-to-end reinforcement learning RL. A suitable reward function is designed kee** in mind the cost and weight constraints for micro drone with minimum number of sensing modalities. Collection of small number of expert data and knowledge based data aggregation is integrated into the RL proc…
▽ More
We present NAVREN-RL, an approach to NAVigate an unmanned aerial vehicle in an indoor Real ENvironment via end-to-end reinforcement learning RL. A suitable reward function is designed kee** in mind the cost and weight constraints for micro drone with minimum number of sensing modalities. Collection of small number of expert data and knowledge based data aggregation is integrated into the RL process to aid convergence. Experimentation is carried out on a Parrot AR drone in different indoor arenas and the results are compared with other baseline technologies. We demonstrate how the drone successfully avoids obstacles and navigates across different arenas.
△ Less
Submitted 22 July, 2018;
originally announced July 2018.