Extended Kalman Filtering for Recursive Online Discrete-Time Inverse Optimal Control

Abstract: We formulate the discrete-time inverse optimal control problem of inferring unknown parameters in the objective function of an optimal control problem from measurements of optimal states and controls as a nonlinear filtering problem. This formulation enables us to propose a novel extended Kalman filter (EKF) for solving inverse optimal control problems in a computationally efficient recursive online manner that requires only a single pass through the measurement data. Importantly, we show that the Jacobians required to implement our EKF can be computed efficiently by exploiting recent Pontryagin differentiable programming results, and that our consideration of an EKF enables the development of first-of-their-kind theoretical error guarantees for online inverse optimal control with noisy incomplete measurements. Our proposed EKF is shown to be significantly faster than an alternative unscented Kalman filter-based approach.

Submitted 16 March, 2024; originally announced March 2024.

Comments: 7 pages, 2 figures, accepted for presentation at 2024 American Control Conference

arXiv:2302.10411

Regret Analysis of Online LQR Control via Trajectory Prediction and Tracking: Extended Version

Authors: Yitian Chen, Timothy L. Molloy, Tyler Summers, Iman Shames

Abstract: In this paper, we propose and analyze a new method for online linear quadratic regulator (LQR) control with a priori unknown time-varying cost matrices. The cost matrices are revealed sequentially with the potential for future values to be previewed over a short window. Our novel method involves using the available cost matrices to predict the optimal trajectory, and a tracking controller to drive the system towards it. We adopted the notion of dynamic regret to measure the performance of this proposed online LQR control method, with our main result being that the (dynamic) regret of our method is upper bounded by a constant. Moreover, the regret upper bound decays exponentially with the preview window length, and is extendable to systems with disturbances. We show in simulations that our proposed method offers improved performance compared to other previously proposed online LQR methods.

Submitted 20 February, 2023; originally announced February 2023.

Comments: Submitted to L4DC2023

MSC Class: 49N10; 49M05

arXiv:2211.01706

Minimum-Time Escape from a Circular Region for a Dubins Car

Authors: Timothy L. Molloy, Iman Shames

Abstract: We investigate the problem of finding paths that enable a robot modeled as a Dubins car (i.e., a constant-speed finite-turn-rate unicycle) to escape from a circular region of space in minimum time. This minimum-time escape problem arises in marine, aerial, and ground robotics in situations where a safety region has been violated and must be exited before a potential negative consequence occurs (e.g., a collision). Using the tools of nonlinear optimal control theory, we show that a surprisingly simple closed-form feedback control law solves this minimum-time escape problem, and that the minimum-time paths have an elegant geometric interpretation.

Submitted 3 November, 2022; originally announced November 2022.

Comments: 7 pages, 5 figures, accepted for 12th IFAC Symposium on Nonlinear Control Systems (NOLCOS)

arXiv:2112.12255

doi 10.1109/TAC.2023.3264177

Entropy-Regularized Partially Observed Markov Decision Processes

Authors: Timothy L. Molloy, Girish N. Nair

Abstract: We investigate partially observed Markov decision processes (POMDPs) with cost functions regularized by entropy terms describing state, observation, and control uncertainty. Standard POMDP techniques are shown to offer bounded-error solutions to these entropy-regularized POMDPs, with exact solutions possible when the regularization involves the joint entropy of the state, observation, and control trajectories. Our joint-entropy result is particularly surprising since it constitutes a novel, tractable formulation of active state estimation.

Submitted 3 February, 2023; v1 submitted 22 December, 2021; originally announced December 2021.

Comments: 20 pages, 2 figures, submitted

arXiv:2108.10227

doi 10.1109/TAC.2023.3250159

Smoother Entropy for Active State Trajectory Estimation and Obfuscation in POMDPs

Authors: Timothy L. Molloy, Girish N. Nair

Abstract: We study the problem of controlling a partially observed Markov decision process (POMDP) to either aid or hinder the estimation of its state trajectory. We encode the estimation objectives via the smoother entropy, which is the conditional entropy of the state trajectory given measurements and controls. Consideration of the smoother entropy contrasts with previous approaches that instead resort to marginal (or instantaneous) state entropies due to tractability concerns. By establishing novel expressions for the smoother entropy in terms of the POMDP belief state, we show that both the problems of minimising and maximising the smoother entropy in POMDPs can surprisingly be reformulated as belief-state Markov decision processes with concave cost and value functions. The significance of these reformulations is that they render the smoother entropy a tractable optimisation objective, with structural properties amenable to the use of standard POMDP solution techniques for both active estimation and obfuscation. Simulations illustrate that optimisation of the smoother entropy leads to superior trajectory estimation and obfuscation compared to alternative approaches.

Submitted 12 February, 2023; v1 submitted 18 August, 2021; originally announced August 2021.

Comments: 41 pages, 3 figures, accepted for publication in IEEE Transactions on Automatic Control

arXiv:2104.01545

Active Trajectory Estimation for Partially Observed Markov Decision Processes via Conditional Entropy

Authors: Timothy L. Molloy, Girish N. Nair

Abstract: In this paper, we consider the problem of controlling a partially observed Markov decision process (POMDP) in order to actively estimate its state trajectory over a fixed horizon with minimal uncertainty. We pose a novel active smoothing problem in which the objective is to directly minimise the smoother entropy, that is, the conditional entropy of the (joint) state trajectory distribution of concern in fixed-interval Bayesian smoothing. Our formulation contrasts with prior active approaches that minimise the sum of conditional entropies of the (marginal) state estimates provided by Bayesian filters. By establishing a novel form of the smoother entropy in terms of the POMDP belief (or information) state, we show that our active smoothing problem can be reformulated as a (fully observed) Markov decision process with a value function that is concave in the belief state. The concavity of the value function is of particular importance since it enables the approximate solution of our active smoothing problem using piecewise-linear function approximations in conjunction with standard POMDP solvers. We illustrate the approximate solution of our active smoothing problem in simulation and compare its performance to alternative approaches based on minimising marginal state estimate uncertainties.

Submitted 4 April, 2021; originally announced April 2021.

Comments: 7 pages, 3 figures, accepted for presentation at 2021 European Control Conference

arXiv:2103.12881

Smoothing-Averse Control: Covertness and Privacy from Smoothers

Authors: Timothy L. Molloy, Girish N. Nair

Abstract: In this paper we investigate the problem of controlling a partially observed stochastic dynamical system such that its state is difficult to infer using a (fixed-interval) Bayesian smoother. This problem arises naturally in applications in which it is desirable to keep the entire state trajectory of a system concealed. We pose our smoothing-averse control problem as the problem of maximising the (joint) entropy of smoother state estimates (i.e., the joint conditional entropy of the state trajectory given the history of measurements and controls). We show that the entropy of Bayesian smoother estimates for general nonlinear state-space models can be expressed as the sum of entropies of marginal state estimates given by Bayesian filters. This novel additive form allows us to reformulate the smoothing-averse control problem as a fully observed stochastic optimal control problem in terms of the usual concept of the information (or belief) state, and solve the resulting problem via dynamic programming. We illustrate the applicability of smoothing-averse control to privacy in cloud-based control and covert robotic navigation.

Submitted 23 March, 2021; originally announced March 2021.

Comments: 8 pages, 2 figures, accepted for presentation at 2021 American Control Conference

arXiv:2009.00150

Exactly Optimal Bayesian Quickest Change Detection for Hidden Markov Models

Authors: Jason J. Ford, Jasmin James, Timothy L. Molloy

Abstract: This paper considers the quickest detection problem for hidden Markov models (HMMs) in a Bayesian setting. We construct an augmented HMM representation of the problem that allows the application of a dynamic programming approach to prove that Shiryaev's rule is an (exact) optimal solution. This augmented representation highlights the problem's fundamental information structure and suggests possible relaxations to more exotic change event priors not appearing in the literature. Finally, this augmented representation also allows us to present an efficient computational method for implementing the optimal solution.

Submitted 15 March, 2023; v1 submitted 31 August, 2020; originally announced September 2020.

arXiv:2005.06153

Online Inverse Optimal Control for Control-Constrained Discrete-Time Systems on Finite and Infinite Horizons

Authors: Timothy L. Molloy, Jason J. Ford, Tristan Perez

Abstract: In this paper, we consider the problem of computing parameters of an objective function for a discrete-time optimal control problem from state and control trajectories with active control constraints. We propose a novel method of inverse optimal control that has a computationally efficient online form in which pairs of states and controls from given state and control trajectories are processed sequentially without being stored or processed in batches. We establish conditions guaranteeing the uniqueness of the objective-function parameters computed by our proposed method from trajectories with active control constraints. We illustrate our proposed method in simulation.

Submitted 13 May, 2020; originally announced May 2020.

Comments: 10 pages, 4 figures, Accepted for publication in Automatica

arXiv:2004.09748

doi 10.1109/TAC.2020.2985975

Misspecified and Asymptotically Minimax Robust Quickest Change Diagnosis

Authors: Timothy L. Molloy

Abstract: The problem of quickly diagnosing an unknown change in a stochastic process is studied. We establish novel bounds on the performance of misspecified diagnosis algorithms designed for changes that differ from those of the process, and pose and solve a new robust quickest change diagnosis problem in the asymptotic regime of few false alarms and false isolations. Simulations suggest that our asymptotically robust solution offers a computationally efficient alternative to generalised likelihood ratio algorithms.

Submitted 21 April, 2020; originally announced April 2020.

Comments: 19 pages, 2 figures, Accepted for publication in IEEE Transactions on Automatic Control

arXiv:2004.09744

An Optimal Bearing-Only-Information Strategy for Unmanned Aircraft Collision Avoidance

Authors: Timothy L. Molloy, Tristan Perez, Brendan P. Williams

Abstract: This paper presents a novel collision avoidance strategy for unmanned aircraft detect and avoid that requires only information about the relative bearing angle between an aircraft and hazard. It is shown that this bearing-only strategy can be conceived as the solution to a novel differential game formulation of collision avoidance, and has several intuitive properties including maximising the instantaneous range acceleration in situations where the hazard is stationary or has a finite turn rate. The performance of the bearing-only strategy is illustrated in simulations based on test cases drawn from draft minimum operating performance standards for unmanned aircraft detect and avoid systems.

Submitted 21 April, 2020; originally announced April 2020.

Comments: 35 pages, 12 figures, Accepted for publication in the Journal of Guidance, Control and Dynamics

arXiv:1903.03283

On the Informativeness of Measurements in Shiryaev's Bayesian Quickest Change Detection

Authors: Jason J. Ford, Jasmin James, Timothy L. Molloy

Abstract: This paper provides the first description of a weak practical super-martingale phenomenon that can emerge in the test statistic in Shiryaev's Bayesian quickest change detection (QCD) problem. We establish that this super-martingale phenomenon can emerge under a condition on the relative entropy between pre and post change densities when the measurements are insufficiently informative to overcome the change time's geometric prior. We illustrate this super-martingale phenomenon in a simple Bayesian QCD problem which highlights the unsuitability of Shiryaev's test statistic for detecting subtle change events.

Submitted 16 September, 2019; v1 submitted 7 March, 2019; originally announced March 2019.

arXiv:1903.03270

A Novel Technique for Rejecting Non-Aircraft Artefacts in Above Horizon Vision-Based Aircraft Detection

Authors: Jasmin James, Jason J. Ford, Timothy L. Molloy

Abstract: Unmanned aerial vehicle (UAV) operations are steadily expanding into many important applications. A key technology for better enabling their commercial use is an onboard sense and avoid (SAA) technology which can detect potential mid-air collision threats in the same manner expected from a human pilot. Ideally, aircraft should be detected as early as possible whilst maintaining a low false alarm rate, however, textured clouds and other unstructured terrain make this trade-off a challenge. In this paper we present a new technique for the modelling and detection of aircraft above the horizon that is able to penalise non-aircraft artefacts (such as textured clouds and other unstructured terrain). We evaluate the performance of our proposed system on flight data of a Cessna 172 on a near collision course encounter with a ScanEagle UAV data collection aircraft. By penalising non-aircraft artefacts we are able to demonstrate, for a zero false alarm rate, a mean detection range of 2445m corresponding to an improvement in detection ranges by 9.8% (218m).

Submitted 10 February, 2020; v1 submitted 7 March, 2019; originally announced March 2019.

arXiv:1804.09846

Quickest Detection of Intermittent Signals With Application to Vision Based Aircraft Detection

Authors: Jasmin James, Jason J. Ford, Timothy L. Molloy

Abstract: In this paper we consider the problem of quickly detecting changes in an intermittent signal that can (repeatedly) switch between a normal and an anomalous state. We pose this intermittent signal detection problem as an optimal stopping problem and establish a quickest intermittent signal detection (ISD) rule with a threshold structure. We develop bounds to characterise the performance of our ISD rule and establish a new filter for estimating its detection delays. Finally, we examine the performance of our ISD rule in both a simulation study and an important vision based aircraft detection application where the ISD rule demonstrates improvements in detection range and false alarm rates relative to the current state of the art aircraft detection techniques.

Submitted 25 April, 2018; originally announced April 2018.

Showing 1–14 of 14 results for author: Molloy, T L