Search | arXiv e-print repository

Continuous-Time Reinforcement Learning: New Design Algorithms with Theoretical Insights and Performance Guarantees

Abstract: Continuous-time nonlinear optimal control problems hold great promise in real-world applications. After decades of development, reinforcement learning (RL) has achieved some of the greatest successes as a general nonlinear control design method. However, a recent comprehensive analysis of state-of-the-art continuous-time RL (CT-RL) methods, namely, adaptive dynamic programming (ADP)-based CT-RL al… ▽ More Continuous-time nonlinear optimal control problems hold great promise in real-world applications. After decades of development, reinforcement learning (RL) has achieved some of the greatest successes as a general nonlinear control design method. However, a recent comprehensive analysis of state-of-the-art continuous-time RL (CT-RL) methods, namely, adaptive dynamic programming (ADP)-based CT-RL algorithms, reveals they face significant design challenges due to their complexity, numerical conditioning, and dimensional scaling issues. Despite advanced theoretical results, existing ADP CT-RL synthesis methods are inadequate in solving even small, academic problems. The goal of this work is thus to introduce a suite of new CT-RL algorithms for control of affine nonlinear systems. Our design approach relies on two important factors. First, our methods are applicable to physical systems that can be partitioned into smaller subproblems. This constructive consideration results in reduced dimensionality and greatly improved intuitiveness of design. Second, we introduce a new excitation framework to improve persistence of excitation (PE) and numerical conditioning performance via classical input/output insights. Such a design-centric approach is the first of its kind in the ADP CT-RL community. In this paper, we progressively introduce a suite of (decentralized) excitable integral reinforcement learning (EIRL) algorithms. We provide convergence and closed-loop stability guarantees, and we demonstrate these guarantees on a significant application problem of controlling an unstable, nonminimum phase hypersonic vehicle (HSV). △ Less

Submitted 17 July, 2023; originally announced July 2023.

arXiv:2306.16935 [pdf, other]

A Low-Power Hardware-Friendly Optimisation Algorithm With Absolute Numerical Stability and Convergence Guarantees

Authors: Anis Hamadouche, Yun Wu, Andrew M. Wallace, Joao F. C. Mota

Abstract: We propose Dual-Feedback Generalized Proximal Gradient Descent (DFGPGD) as a new, hardware-friendly, operator splitting algorithm. We then establish convergence guarantees under approximate computational errors and we derive theoretical criteria for the numerical stability of DFGPGD based on absolute stability of dynamical systems. We also propose a new generalized proximal ADMM that can be used t… ▽ More We propose Dual-Feedback Generalized Proximal Gradient Descent (DFGPGD) as a new, hardware-friendly, operator splitting algorithm. We then establish convergence guarantees under approximate computational errors and we derive theoretical criteria for the numerical stability of DFGPGD based on absolute stability of dynamical systems. We also propose a new generalized proximal ADMM that can be used to instantiate most of existing proximal-based composite optimization solvers. We implement DFGPGD and ADMM on FPGA ZCU106 board and compare them in light of FPGA's timing as well as resource utilization and power efficiency. We also perform a full-stack, application-to-hardware, comparison between approximate versions of DFGPGD and ADMM based on dynamic power/error rate trade-off, which is a new hardware-application combined metric. △ Less

Submitted 29 June, 2023; originally announced June 2023.

MSC Class: 65G50; 90C25 ACM Class: B.6.1; B.6.2; B.6.3; B.2.4; C.5.0

arXiv:2203.02204 [pdf, other]

Sharper Bounds for Proximal Gradient Algorithms with Errors

Authors: Anis Hamadouche, Yun Wu, Andrew M. Wallace, Joao F. C. Mota

Abstract: We analyse the convergence of the proximal gradient algorithm for convex composite problems in the presence of gradient and proximal computational inaccuracies. We derive new tighter deterministic and probabilistic bounds that we use to verify a simulated (MPC) and a synthetic (LASSO) optimization problems solved on a reduced-precision machine in combination with an inaccurate proximal operator. W… ▽ More We analyse the convergence of the proximal gradient algorithm for convex composite problems in the presence of gradient and proximal computational inaccuracies. We derive new tighter deterministic and probabilistic bounds that we use to verify a simulated (MPC) and a synthetic (LASSO) optimization problems solved on a reduced-precision machine in combination with an inaccurate proximal operator. We also show how the probabilistic bounds are more robust for algorithm verification and more accurate for application performance guarantees. Under some statistical assumptions, we also prove that some cumulative error terms follow a martingale property. And conforming to observations, e.g., in \cite{schmidt2011convergence}, we also show how the acceleration of the algorithm amplifies the gradient and proximal computational errors. △ Less

Submitted 4 March, 2022; originally announced March 2022.

arXiv:2104.05347 [pdf, other]

Radar SLAM: A Robust SLAM System for All Weather Conditions

Authors: Ziyang Hong, Yvan Petillot, Andrew Wallace, Sen Wang

Abstract: A Simultaneous Localization and Map** (SLAM) system must be robust to support long-term mobile vehicle and robot applications. However, camera and LiDAR based SLAM systems can be fragile when facing challenging illumination or weather conditions which degrade their imagery and point cloud data. Radar, whose operating electromagnetic spectrum is less affected by environmental changes, is promisin… ▽ More A Simultaneous Localization and Map** (SLAM) system must be robust to support long-term mobile vehicle and robot applications. However, camera and LiDAR based SLAM systems can be fragile when facing challenging illumination or weather conditions which degrade their imagery and point cloud data. Radar, whose operating electromagnetic spectrum is less affected by environmental changes, is promising although its distinct sensing geometry and noise characteristics bring open challenges when being exploited for SLAM. % However, there are still open challenges since most existing visual and LiDAR SLAM systems do not operate in bad weathers. This paper studies the use of a Frequency Modulated Continuous Wave radar for SLAM in large-scale outdoor environments. We propose a full radar SLAM system, including a novel radar motion tracking algorithm that leverages radar geometry for reliable feature tracking. It also optimally compensates motion distortion and estimates pose by joint optimization. Its loop closure component is designed to be simple yet efficient for radar imagery by capturing and exploiting structural information of the surrounding environment. % while a scheme to reject ambiguous loop closure candidates is also designed specifically for radar. Extensive experiments on three public radar datasets, ranging from city streets and residential areas to countryside and highways, show competitive accuracy and reliability performance of the proposed radar SLAM system compared to the state-of-the-art LiDAR, vision and radar methods. The results show that our system is technically viable in achieving reliable SLAM in extreme weather conditions, e.g. heavy snow and dense fog, demonstrating the promising potential of using radar for all-weather localization and map**. △ Less

Submitted 12 April, 2021; originally announced April 2021.

Comments: Under review

arXiv:2101.11589 [pdf, other]

doi 10.1088/1748-0221/16/07/P07041

A Convolutional Neural Network based Cascade Reconstruction for the IceCube Neutrino Observatory

Authors: R. Abbasi, M. Ackermann, J. Adams, J. A. Aguilar, M. Ahlers, M. Ahrens, C. Alispach, A. A. Alves Jr., N. M. Amin, R. An, K. Andeen, T. Anderson, I. Ansseau, G. Anton, C. Argüelles, S. Axani, X. Bai, A. Balagopal V., A. Barbano, S. W. Barwick, B. Bastian, V. Basu, V. Baum, S. Baur, R. Bay , et al. (343 additional authors not shown)

Abstract: Continued improvements on existing reconstruction methods are vital to the success of high-energy physics experiments, such as the IceCube Neutrino Observatory. In IceCube, further challenges arise as the detector is situated at the geographic South Pole where computational resources are limited. However, to perform real-time analyses and to issue alerts to telescopes around the world, powerful an… ▽ More Continued improvements on existing reconstruction methods are vital to the success of high-energy physics experiments, such as the IceCube Neutrino Observatory. In IceCube, further challenges arise as the detector is situated at the geographic South Pole where computational resources are limited. However, to perform real-time analyses and to issue alerts to telescopes around the world, powerful and fast reconstruction methods are desired. Deep neural networks can be extremely powerful, and their usage is computationally inexpensive once the networks are trained. These characteristics make a deep learning-based approach an excellent candidate for the application in IceCube. A reconstruction method based on convolutional architectures and hexagonally shaped kernels is presented. The presented method is robust towards systematic uncertainties in the simulation and has been tested on experimental data. In comparison to standard reconstruction methods in IceCube, it can improve upon the reconstruction accuracy, while reducing the time necessary to run the reconstruction by two to three orders of magnitude. △ Less

Submitted 26 July, 2021; v1 submitted 27 January, 2021; originally announced January 2021.

Comments: 39 pages, 15 figures, submitted to Journal of Instrumentation; added references

Journal ref: JINST 16 (2021) P07041

arXiv:2010.09076 [pdf, other]

RADIATE: A Radar Dataset for Automotive Perception in Bad Weather

Authors: Marcel Sheeny, Emanuele De Pellegrin, Saptarshi Mukherjee, Alireza Ahrabian, Sen Wang, Andrew Wallace

Abstract: Datasets for autonomous cars are essential for the development and benchmarking of perception systems. However, most existing datasets are captured with camera and LiDAR sensors in good weather conditions. In this paper, we present the RAdar Dataset In Adverse weaThEr (RADIATE), aiming to facilitate research on object detection, tracking and scene understanding using radar sensing for safe autonom… ▽ More Datasets for autonomous cars are essential for the development and benchmarking of perception systems. However, most existing datasets are captured with camera and LiDAR sensors in good weather conditions. In this paper, we present the RAdar Dataset In Adverse weaThEr (RADIATE), aiming to facilitate research on object detection, tracking and scene understanding using radar sensing for safe autonomous driving. RADIATE includes 3 hours of annotated radar images with more than 200K labelled road actors in total, on average about 4.6 instances per radar image. It covers 8 different categories of actors in a variety of weather conditions (e.g., sun, night, rain, fog and snow) and driving scenarios (e.g., parked, urban, motorway and suburban), representing different levels of challenge. To the best of our knowledge, this is the first public radar dataset which provides high-resolution radar images on public roads with a large amount of road actors labelled. The data collected in adverse weather, e.g., fog and snowfall, is unique. Some baseline results of radar based object detection and recognition are given to show that the use of radar data is promising for automotive applications in bad weather, where vision and LiDAR can fail. RADIATE also has stereo images, 32-channel LiDAR and GPS data, directed at other applications such as sensor fusion, localisation and map**. The public dataset can be accessed at http://pro.hw.ac.uk/radiate/. △ Less

Submitted 5 April, 2021; v1 submitted 18 October, 2020; originally announced October 2020.

Comments: Accepted at IEEE International Conference on Robotics and Automation 2021 (ICRA 2021)

arXiv:1912.03157 [pdf, other]

300 GHz Radar Object Recognition based on Deep Neural Networks and Transfer Learning

Authors: Marcel Sheeny, Andrew Wallace, Sen Wang

Abstract: For high resolution scene map** and object recognition, optical technologies such as cameras and LiDAR are the sensors of choice. However, for robust future vehicle autonomy and driver assistance in adverse weather conditions, improvements in automotive radar technology, and the development of algorithms and machine learning for robust map** and recognition are essential. In this paper, we des… ▽ More For high resolution scene map** and object recognition, optical technologies such as cameras and LiDAR are the sensors of choice. However, for robust future vehicle autonomy and driver assistance in adverse weather conditions, improvements in automotive radar technology, and the development of algorithms and machine learning for robust map** and recognition are essential. In this paper, we describe a methodology based on deep neural networks to recognise objects in 300GHz radar images, investigating robustness to changes in range, orientation and different receivers in a laboratory environment. As the training data is limited, we have also investigated the effects of transfer learning. As a necessary first step before road trials, we have also considered detection and classification in multiple object scenes. △ Less

Submitted 6 December, 2019; originally announced December 2019.

Comments: This paper is a preprint of a paper submitted to IET Radar, Sonar and Navigation. If accepted, the copy of record will be available at the IET Digital Library

arXiv:1910.14377 [pdf, other]

Image-Guided Depth Upsampling via Hessian and TV Priors

Authors: Alireza Ahrabian, Joao F. C. Mota, Andrew M. Wallace

Abstract: We propose a method that combines sparse depth (LiDAR) measurements with an intensity image and to produce a dense high-resolution depth image. As there are few, but accurate, depth measurements from the scene, our method infers the remaining depth values by incorporating information from the intensity image, namely the magnitudes and directions of the identified edges, and by assuming that the sc… ▽ More We propose a method that combines sparse depth (LiDAR) measurements with an intensity image and to produce a dense high-resolution depth image. As there are few, but accurate, depth measurements from the scene, our method infers the remaining depth values by incorporating information from the intensity image, namely the magnitudes and directions of the identified edges, and by assuming that the scene is composed mostly of flat surfaces. Such inference is achieved by solving a convex optimisation problem with properly weighted regularisers that are based on the `1-norm (specifically, on total variation). We solve the resulting problem with a computationally efficient ADMM-based algorithm. Using the SYNTHIA and KITTI datasets, our experiments show that the proposed method achieves a depth reconstruction performance comparable to or better than other model-based methods. △ Less

Submitted 31 October, 2019; originally announced October 2019.

arXiv:1804.02576 [pdf, other]

POL-LWIR Vehicle Detection: Convolutional Neural Networks Meet Polarised Infrared Sensors

Authors: Marcel Sheeny, Andrew Wallace, Mehryar Emambakhsh, Sen Wang, Barry Connor

Abstract: For vehicle autonomy, driver assistance and situational awareness, it is necessary to operate at day and night, and in all weather conditions. In particular, long wave infrared (LWIR) sensors that receive predominantly emitted radiation have the capability to operate at night as well as during the day. In this work, we employ a polarised LWIR (POL-LWIR) camera to acquire data from a mobile vehicle… ▽ More For vehicle autonomy, driver assistance and situational awareness, it is necessary to operate at day and night, and in all weather conditions. In particular, long wave infrared (LWIR) sensors that receive predominantly emitted radiation have the capability to operate at night as well as during the day. In this work, we employ a polarised LWIR (POL-LWIR) camera to acquire data from a mobile vehicle, to compare and contrast four different convolutional neural network (CNN) configurations to detect other vehicles in video sequences. We evaluate two distinct and promising approaches, two-stage detection (Faster-RCNN) and one-stage detection (SSD), in four different configurations. We also employ two different image decompositions: the first based on the polarisation ellipse and the second on the Stokes parameters themselves. To evaluate our approach, the experimental trials were quantified by mean average precision (mAP) and processing time, showing a clear trade-off between the two factors. For example, the best mAP result of 80.94% was achieved using Faster-RCNN, but at a frame rate of 6.4 fps. In contrast, MobileNet SSD achieved only 64.51% mAP, but at 53.4 fps. △ Less

Submitted 7 April, 2018; originally announced April 2018.

Comments: Computer Vision and Pattern Recognition Workshop 2018

arXiv:1706.00672 [pdf, other]

Development of a N-type GM-PHD Filter for Multiple Target, Multiple Type Visual Tracking

Authors: Nathanael L. Baisa, Andrew Wallace

Abstract: We propose a new framework that extends the standard Probability Hypothesis Density (PHD) filter for multiple targets having $N\geq2$ different types based on Random Finite Set theory, taking into account not only background clutter, but also confusions among detections of different target types, which are in general different in character from background clutter. Under Gaussianity and linearity a… ▽ More We propose a new framework that extends the standard Probability Hypothesis Density (PHD) filter for multiple targets having $N\geq2$ different types based on Random Finite Set theory, taking into account not only background clutter, but also confusions among detections of different target types, which are in general different in character from background clutter. Under Gaussianity and linearity assumptions, our framework extends the existing Gaussian mixture (GM) implementation of the standard PHD filter to create a N-type GM-PHD filter. The methodology is applied to real video sequences by integrating object detectors' information into this filter for two scenarios. For both cases, Munkres's variant of the Hungarian assignment algorithm is used to associate tracked target identities between frames. This approach is evaluated and compared to both raw detection and independent GM-PHD filters using the Optimal Sub-pattern Assignment metric and discrimination rate. This shows the improved performance of our strategy on real video sequences. △ Less

Submitted 3 February, 2019; v1 submitted 31 May, 2017; originally announced June 2017.

Comments: arXiv admin note: text overlap with arXiv:1705.04757

arXiv:1705.11175 [pdf, other]

Long-term Correlation Tracking using Multi-layer Hybrid Features in Sparse and Dense Environments

Authors: Nathanael L. Baisa, Deepayan Bhowmik, Andrew Wallace

Abstract: Tracking a target of interest in both sparse and crowded environments is a challenging problem, not yet successfully addressed in the literature. In this paper, we propose a new long-term visual tracking algorithm, learning discriminative correlation filters and using an online classifier, to track a target of interest in both sparse and crowded video sequences. First, we learn a translation corre… ▽ More Tracking a target of interest in both sparse and crowded environments is a challenging problem, not yet successfully addressed in the literature. In this paper, we propose a new long-term visual tracking algorithm, learning discriminative correlation filters and using an online classifier, to track a target of interest in both sparse and crowded video sequences. First, we learn a translation correlation filter using a multi-layer hybrid of convolutional neural networks (CNN) and traditional hand-crafted features. We combine advantages of both the lower convolutional layer which retains more spatial details for precise localization and the higher convolutional layer which encodes semantic information for handling appearance variations, and then integrate these with histogram of oriented gradients (HOG) and color-naming traditional features. Second, we include a re-detection module for overcoming tracking failures due to long-term occlusions by training an incremental (online) SVM on the most confident frames using hand-engineered features. This re-detection module is activated only when the correlation response of the object is below some pre-defined threshold. This generates high score detection proposals which are temporally filtered using a Gaussian mixture probability hypothesis density (GM-PHD) filter to find the detection proposal with the maximum weight as the target state estimate by removing the other detection proposals as clutter. Finally, we learn a scale correlation filter for estimating the scale of a target by constructing a target pyramid around the estimated or re-detected position using the HOG features. We carry out extensive experiments on both sparse and dense data sets which show that our method significantly outperforms state-of-the-art methods. △ Less

Submitted 3 February, 2019; v1 submitted 31 May, 2017; originally announced May 2017.

arXiv:1602.05264 [pdf, ps, other]

Anomaly Detection in Clutter using Spectrally Enhanced Ladar

Authors: Puneet S Chhabra, Andrew M Wallace, James R Hopgood

Abstract: Discrete return (DR) Laser Detection and Ranging (Ladar) systems provide a series of echoes that reflect from objects in a scene. These can be first, last or multi-echo returns. In contrast, Full-Waveform (FW)-Ladar systems measure the intensity of light reflected from objects continuously over a period of time. In a camouflaged scenario, e.g., objects hidden behind dense foliage, a FW-Ladar penet… ▽ More Discrete return (DR) Laser Detection and Ranging (Ladar) systems provide a series of echoes that reflect from objects in a scene. These can be first, last or multi-echo returns. In contrast, Full-Waveform (FW)-Ladar systems measure the intensity of light reflected from objects continuously over a period of time. In a camouflaged scenario, e.g., objects hidden behind dense foliage, a FW-Ladar penetrates such foliage and returns a sequence of echoes including buried faint echoes. The aim of this paper is to learn local-patterns of co-occurring echoes characterised by their measured spectra. A deviation from such patterns defines an abnormal event in a forest/tree depth profile. As far as the authors know, neither DR or FW-Ladar, along with several spectral measurements, has not been applied to anomaly detection. This work presents an algorithm that allows detection of spectral and temporal anomalies in FW-Multi Spectral Ladar (FW-MSL) data samples. An anomaly is defined as a full waveform temporal and spectral signature that does not conform to a prior expectation, represented using a learnt subspace (dictionary) and set of coefficients that capture co-occurring local-patterns using an overlap** temporal window. A modified optimization scheme is proposed for subspace learning based on stochastic approximations. The objective function is augmented with a discriminative term that represents the subspace's separability properties and supports anomaly characterisation. The algorithm detects several man-made objects and anomalous spectra hidden in a dense clutter of vegetation and also allows tree species classification. △ Less

Submitted 16 February, 2016; originally announced February 2016.

arXiv:1508.07136 [pdf]

doi 10.1017/S0956796807006296

RIPL: An Efficient Image Processing DSL for FPGAs

Authors: Robert Stewart, Deepayan Bhowmik, Greg Michaelson, Andrew Wallace

Abstract: Field programmable gate arrays (FPGAs) can accelerate image processing by exploiting fine-grained parallelism opportunities in image operations. FPGA language designs are often subsets or extensions of existing languages, though these typically lack suitable hardware computation models so compiling them to FPGAs leads to inefficient designs. Moreover, these languages lack image processing domain s… ▽ More Field programmable gate arrays (FPGAs) can accelerate image processing by exploiting fine-grained parallelism opportunities in image operations. FPGA language designs are often subsets or extensions of existing languages, though these typically lack suitable hardware computation models so compiling them to FPGAs leads to inefficient designs. Moreover, these languages lack image processing domain specificity. Our solution is RIPL, an image processing domain specific language (DSL) for FPGAs. It has algorithmic skeletons to express image processing, and these are exploited to generate deep pipelines of highly concurrent and memory-efficient image processing components. △ Less

Submitted 28 August, 2015; originally announced August 2015.

Comments: Presented at Second International Workshop on FPGAs for Software Programmers (FSP 2015) (arXiv:1508.06320)

Report number: FSP/2015/16

Journal ref: J. Funct. Prog. 17 (2007) 428-429

arXiv:1502.04154 [pdf, ps, other]

Extracting Hidden Groups and their Structure from Streaming Interaction Data

Authors: Mark K. Goldberg, Mykola Hayvanovych, Malik Magdon-Ismail, William A. Wallace

Abstract: When actors in a social network interact, it usually means they have some general goal towards which they are collaborating. This could be a research collaboration in a company or a foursome planning a golf game. We call such groups \emph{planning groups}. In many social contexts, it might be possible to observe the \emph{dyadic interactions} between actors, even if the actors do not explicitly de… ▽ More When actors in a social network interact, it usually means they have some general goal towards which they are collaborating. This could be a research collaboration in a company or a foursome planning a golf game. We call such groups \emph{planning groups}. In many social contexts, it might be possible to observe the \emph{dyadic interactions} between actors, even if the actors do not explicitly declare what groups they belong too. When groups are not explicitly declared, we call them \emph{hidden groups}. Our particular focus is hidden planning groups. By virtue of their need to further their goal, the actors within such groups must interact in a manner which differentiates their communications from random background communications. In such a case, one can infer (from these interactions) the composition and structure of the hidden planning groups. We formulate the problem of hidden group discovery from streaming interaction data, and we propose efficient algorithms for identifying the hidden group structures by isolating the hidden group's non-random, planning-related, communications from the random background communications. We validate our algorithms on real data (the Enron email corpus and Blog communication data). Analysis of the results reveals that our algorithms extract meaningful hidden group structures. △ Less

Submitted 13 February, 2015; originally announced February 2015.

Showing 1–14 of 14 results for author: Wallace, A