-
Risk-Aware Robotics: Tail Risk Measures in Planning, Control, and Verification
Authors:
Prithvi Akella,
Anushri Dixit,
Mohamadreza Ahmadi,
Lars Lindemann,
Margaret P. Chapman,
George J. Pappas,
Aaron D. Ames,
Joel W. Burdick
Abstract:
The need for a systematic approach to risk assessment has increased in recent years due to the ubiquity of autonomous systems that alter our day-to-day experiences and their need for safety, e.g., for self-driving vehicles, mobile service robots, and bipedal robots. These systems are expected to function safely in unpredictable environments and interact seamlessly with humans, whose behavior is notably challenging to forecast. We present a survey of risk-aware methodologies for autonomous systems. We adopt a contemporary risk-aware approach to mitigate rare and detrimental outcomes by advocating the use of tail risk measures, a concept borrowed from financial literature. This survey will introduce these measures and explain their relevance in the context of robotic systems for planning, control, and verification applications.
Submitted 27 March, 2024;
originally announced March 2024.
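As a rough illustration of the tail risk measures this survey advocates, the sketch below estimates Value-at-Risk (VaR) and Conditional Value-at-Risk (CVaR) from loss samples. It is a minimal example under stated assumptions, not code from the survey: the function name `empirical_var_cvar`, the sample data, and the choice of the Rockafellar-Uryasev form for CVaR are all illustrative.

```python
import math


def empirical_var_cvar(losses, alpha):
    """Empirical VaR and CVaR of loss samples at level alpha in (0, 1).

    Hypothetical helper for illustration; larger values mean worse outcomes.
    """
    if not losses:
        raise ValueError("need at least one sample")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    xs = sorted(losses)
    n = len(xs)
    # VaR: smallest sample whose empirical CDF value is at least alpha.
    k = max(math.ceil(alpha * n) - 1, 0)
    var = xs[k]
    # CVaR in Rockafellar-Uryasev form: VaR + E[(X - VaR)^+] / (1 - alpha).
    mean_excess = sum(max(x - var, 0.0) for x in xs) / n
    cvar = var + mean_excess / (1.0 - alpha)
    return var, cvar


var, cvar = empirical_var_cvar([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], alpha=0.8)
```

For the ten samples above at level 0.8, VaR is the eighth-smallest sample and CVaR works out to roughly the mean of the two largest samples, which is why CVaR is sensitive to rare, detrimental outcomes that an expectation would average away.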
-
Rollover Prevention for Mobile Robots with Control Barrier Functions: Differentiator-Based Adaptation and Projection-to-State Safety
Authors:
Ersin Das,
Aaron D. Ames,
Joel W. Burdick
Abstract:
This paper develops rollover prevention guarantees for mobile robots using control barrier function (CBF) theory, and demonstrates the method experimentally. We consider a safety measure based on a zero moment point condition through the lens of CBFs. However, these conditions depend on time-varying and noisy parameters. To address this issue, we present a differentiator-based safety-critical controller that estimates these parameters and pairs Input-to-State Stable (ISS) differentiator dynamics with CBFs to achieve rigorous safety guarantees. Additionally, to ensure safety in the presence of disturbances, we utilize a time-varying extension of Projection-to-State Safety (PSSf). The effectiveness of the proposed method is demonstrated via experiments on a tracked robot with a rollover potential on steep slopes.
Submitted 15 June, 2024; v1 submitted 13 March, 2024;
originally announced March 2024.
-
Robust Control Barrier Functions using Uncertainty Estimation with Application to Mobile Robots
Authors:
Ersin Das,
Joel W. Burdick
Abstract:
Model uncertainty poses a significant challenge to the implementation of safety-critical control systems. With this as motivation, this paper proposes a safe control design approach that guarantees the robustness of nonlinear feedback systems in the presence of matched or unmatched unmodelled system dynamics and external disturbances. Our approach couples control barrier functions (CBFs) with a new uncertainty/disturbance estimator to ensure robust safety against input and state-dependent model uncertainties. We prove upper bounds on the estimator's error and estimated outputs. We use an uncertainty estimator-based composite feedback control law to adaptively improve robust control performance under hard safety constraints by compensating for the matched uncertainty. Then, we robustify existing CBF constraints with this uncertainty estimate and the estimation error bounds to ensure robust safety via a quadratic program (CBF-QP). We also extend our method to higher-order CBFs (HOCBFs) to achieve safety under unmatched uncertainty, which causes relative degree differences with respect to control input and disturbance. We assume the relative degree difference is at most one, resulting in a second-order cone (SOC) condition. The proposed robust HOCBFs method is demonstrated in a simulation of an uncertain elastic actuator control problem. Finally, the efficacy of our method is experimentally demonstrated on a tracked robot with slope-induced matched and unmatched perturbations.
Submitted 3 January, 2024;
originally announced January 2024.
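The CBF-QP mentioned in this abstract can be illustrated with a generic single-constraint safety filter, which has a closed-form solution. This is a sketch of the standard construction, not the paper's robust formulation: `a` stands for the Lie derivative term L_g h(x), `b` for L_f h(x) plus the class-K term, and `margin` is a hypothetical placeholder for whatever uncertainty bound is used to tighten the constraint.

```python
def cbf_safety_filter(u_nom, a, b, margin=0.0):
    """Solve min ||u - u_nom||^2 subject to a.u + b - margin >= 0.

    Illustrative only: a, b, and margin are assumed to be precomputed
    at the current state by the caller.
    """
    slack = sum(ai * ui for ai, ui in zip(a, u_nom)) + b - margin
    if slack >= 0.0:
        # The nominal input already satisfies the constraint.
        return list(u_nom)
    a_sq = sum(ai * ai for ai in a)
    if a_sq == 0.0:
        raise ValueError("constraint cannot be satisfied by any input")
    # Project u_nom onto the boundary of the half-space a.u + b - margin >= 0.
    scale = slack / a_sq
    return [ui - scale * ai for ui, ai in zip(u_nom, a)]


u_safe = cbf_safety_filter([2.0], a=[-1.0], b=1.0)
u_robust = cbf_safety_filter([2.0], a=[-1.0], b=1.0, margin=0.5)
```

Increasing `margin` shrinks the set of admissible inputs, which mirrors the idea of tightening a CBF constraint to account for estimation error; with several constraints or input bounds, a numerical QP solver would be needed instead of this projection.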
-
A Learning-Based Framework for Safe Human-Robot Collaboration with Multiple Backup Control Barrier Functions
Authors:
Neil C. Janwani,
Ersin Daş,
Thomas Touma,
Skylar X. Wei,
Tamas G. Molnar,
Joel W. Burdick
Abstract:
Ensuring robot safety in complex environments is a difficult task due to actuation limits, such as torque bounds. This paper presents a safety-critical control framework that leverages learning-based switching between multiple backup controllers to formally guarantee safety under bounded control inputs while satisfying driver intention. By leveraging backup controllers designed to uphold safety and input constraints, backup control barrier functions (BCBFs) construct implicitly defined control invariance sets via a feasible quadratic program (QP). However, BCBF performance largely depends on the design and conservativeness of the chosen backup controller, especially in our setting of human-driven vehicles in complex, e.g., off-road, conditions. While conservativeness can be reduced by using multiple backup controllers, determining when to switch is an open problem. Consequently, we develop a broadcast scheme that estimates driver intention and integrates BCBFs with multiple backup strategies for human-robot interaction. An LSTM classifier uses data inputs from the robot, human, and safety algorithms to continually choose a backup controller in real time. We demonstrate our method's efficacy on a dual-track robot in obstacle avoidance scenarios. Our framework guarantees robot safety while adhering to driver intention.
Submitted 7 March, 2024; v1 submitted 9 October, 2023;
originally announced October 2023.
-
Robust Control Barrier Functions with Uncertainty Estimation
Authors:
Ersin Daş,
Skylar X. Wei,
Joel W. Burdick
Abstract:
This paper proposes a safety controller for control-affine nonlinear systems with unmodelled dynamics and disturbances to improve closed-loop robustness. Uncertainty estimation-based control barrier functions (CBFs) are utilized to ensure robust safety in the presence of model uncertainties, which may depend on control input and states. We present a new uncertainty/disturbance estimator with theoretical upper bounds on estimation error and estimated outputs, which are used to ensure robust safety by formulating a convex optimization problem using a high-order CBF. The possibly unsafe nominal feedback controller is augmented with the proposed estimator in two frameworks: (1) an uncertainty compensator and (2) a robustifying reformulation of the CBF constraint with respect to the estimator outputs. The former scheme ensures safety with performance improvement by adaptively rejecting the matched uncertainty. The latter uses uncertainty estimation to robustify higher-order CBFs for safety-critical control. The proposed methods are demonstrated in simulations of an uncertain adaptive cruise control problem and a multirotor obstacle avoidance situation.
Submitted 17 April, 2023;
originally announced April 2023.
-
STEP: Stochastic Traversability Evaluation and Planning for Risk-Aware Off-road Navigation; Results from the DARPA Subterranean Challenge
Authors:
Anushri Dixit,
David D. Fan,
Kyohei Otsu,
Sharmita Dey,
Ali-Akbar Agha-Mohammadi,
Joel W. Burdick
Abstract:
Although autonomy has gained widespread usage in structured and controlled environments, robotic autonomy in unknown and off-road terrain remains a difficult problem. Extreme, off-road, and unstructured environments such as undeveloped wilderness, caves, rubble, and other post-disaster sites pose unique and challenging problems for autonomous navigation. Based on our participation in the DARPA Subterranean Challenge, we propose an approach to improve autonomous traversal of robots in subterranean environments that are perceptually degraded and completely unknown through a traversability and planning framework called STEP (Stochastic Traversability Evaluation and Planning). We present 1) rapid uncertainty-aware mapping and traversability evaluation, 2) tail risk assessment using the Conditional Value-at-Risk (CVaR), 3) efficient risk and constraint-aware kinodynamic motion planning using sequential quadratic programming-based (SQP) model predictive control (MPC), 4) fast recovery behaviors to account for unexpected scenarios that may cause failure, and 5) risk-based gait adaptation for quadrupedal robots. We illustrate and validate extensive results from our experiments on wheeled and legged robotic platforms in field studies at the Valentine Cave, CA (cave environment), Kentucky Underground, KY (mine environment), and Louisville Mega Cavern, KY (final competition site for the DARPA Subterranean Challenge with tunnel, urban, and cave environments).
Submitted 2 March, 2023;
originally announced March 2023.
-
Learning Disturbances Online for Risk-Aware Control: Risk-Aware Flight with Less Than One Minute of Data
Authors:
Prithvi Akella,
Skylar X. Wei,
Joel W. Burdick,
Aaron D. Ames
Abstract:
Recent advances in safety-critical risk-aware control are predicated on a priori knowledge of the disturbances a system might face. This paper proposes a method to efficiently learn these disturbances online, in a risk-aware context. First, we introduce the concept of a Surface-at-Risk, a risk measure for stochastic processes that extends Value-at-Risk -- a commonly utilized risk measure in the risk-aware controls community. Second, we model the norm of the state discrepancy between the model and the true system evolution as a scalar-valued stochastic process and determine an upper bound to its Surface-at-Risk via Gaussian Process Regression. Third, we provide theoretical results on the accuracy of our fitted surface subject to mild assumptions that are verifiable with respect to the data sets collected during system operation. Finally, we experimentally verify our procedure by augmenting a drone's controller and highlight performance increases achieved via our risk-aware approach after collecting less than a minute of operating data.
Submitted 12 December, 2022;
originally announced December 2022.
-
Adaptive Conformal Prediction for Motion Planning among Dynamic Agents
Authors:
Anushri Dixit,
Lars Lindemann,
Skylar Wei,
Matthew Cleaveland,
George J. Pappas,
Joel W. Burdick
Abstract:
This paper proposes an algorithm for motion planning among dynamic agents using adaptive conformal prediction. We consider a deterministic control system and use trajectory predictors to predict the dynamic agents' future motion, which is assumed to follow an unknown distribution. We then leverage ideas from adaptive conformal prediction to dynamically quantify prediction uncertainty from an online data stream. In particular, we provide an online algorithm that uses delayed agent observations to obtain uncertainty sets for multistep-ahead predictions with probabilistic coverage. These uncertainty sets are used within a model predictive controller to safely navigate among dynamic agents. While most existing data-driven prediction approaches quantify prediction uncertainty heuristically, we quantify the true prediction uncertainty in a distribution-free, adaptive manner that even captures changes in prediction quality and the agents' motion. We empirically evaluate our algorithm on a simulation case study where a drone avoids a flying frisbee.
Submitted 30 November, 2022;
originally announced December 2022.
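To make the idea of adaptive conformal prediction concrete, the sketch below shows the generic online update of the miscoverage level together with a quantile-based uncertainty radius. It follows the standard adaptive conformal inference rule rather than the paper's multistep, delayed-observation algorithm; the function names, the step size `gamma`, and the scalar nonconformity scores are assumptions made for illustration.

```python
import math


def aci_step(alpha_t, target_alpha, gamma, covered):
    """One adaptive conformal update: alpha_{t+1} = alpha_t + gamma * (alpha - err_t).

    err_t is 1 when the last observation fell outside the uncertainty set.
    """
    err = 0.0 if covered else 1.0
    return alpha_t + gamma * (target_alpha - err)


def conformal_radius(scores, alpha_t):
    """Radius covering roughly a (1 - alpha_t) fraction of past scores."""
    if alpha_t <= 0.0:
        return math.inf  # demand full coverage: unbounded set
    if alpha_t >= 1.0:
        return 0.0  # no coverage demanded: smallest set
    xs = sorted(scores)
    n = len(xs)
    k = math.ceil((1.0 - alpha_t) * (n + 1)) - 1
    return math.inf if k >= n else xs[k]


after_hit = aci_step(0.1, target_alpha=0.1, gamma=0.05, covered=True)
after_miss = aci_step(0.1, target_alpha=0.1, gamma=0.05, covered=False)
radius = conformal_radius([1, 2, 3, 4, 5, 6, 7, 8, 9], alpha_t=0.5)
```

A miss lowers the working level, which enlarges the next uncertainty set, while a covered observation raises it slightly; this feedback is what lets the sets track changes in prediction quality without distributional assumptions.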
-
Sample-Based Bounds for Coherent Risk Measures: Applications to Policy Synthesis and Verification
Authors:
Prithvi Akella,
Anushri Dixit,
Mohamadreza Ahmadi,
Joel W. Burdick,
Aaron D. Ames
Abstract:
The dramatic increase of autonomous systems subject to variable environments has given rise to the pressing need to consider risk in both the synthesis and verification of policies for these systems. This paper aims to address a few problems regarding risk-aware verification and policy synthesis, by first developing a sample-based method to bound the risk measure evaluation of a random variable whose distribution is unknown. These bounds permit us to generate high-confidence verification statements for a large class of robotic systems. Second, we develop a sample-based method to determine solutions to non-convex optimization problems that outperform a large fraction of the decision space of possible solutions. Both sample-based approaches then permit us to rapidly synthesize risk-aware policies that are guaranteed to achieve a minimum level of system performance. To showcase our approach in simulation, we verify a cooperative multi-agent system and develop a risk-aware controller that outperforms the system's baseline controller. We also mention how our approach can be extended to account for any $g$-entropic risk measure - the subset of coherent risk measures on which we focus.
Submitted 20 April, 2022;
originally announced April 2022.
-
Risk-Averse Receding Horizon Motion Planning for Obstacle Avoidance using Coherent Risk Measures
Authors:
Anushri Dixit,
Mohamadreza Ahmadi,
Joel W. Burdick
Abstract:
This paper studies the problem of risk-averse receding horizon motion planning for agents with uncertain dynamics, in the presence of stochastic, dynamic obstacles. We propose a model predictive control (MPC) scheme that formulates the obstacle avoidance constraint using coherent risk measures. To handle disturbances, or process noise, in the state dynamics, the state constraints are tightened in a risk-aware manner to provide a disturbance feedback policy. We also propose a waypoint following algorithm that uses the proposed MPC scheme for discrete distributions and prove its risk-sensitive recursive feasibility while guaranteeing finite-time task completion. We further investigate some commonly used coherent risk metrics, namely, conditional value-at-risk (CVaR), entropic value-at-risk (EVaR), and g-entropic risk measures, and propose a tractable incorporation within MPC. We illustrate our framework via simulation studies.
Submitted 28 September, 2023; v1 submitted 20 April, 2022;
originally announced April 2022.
-
Distributionally Robust Model Predictive Control with Total Variation Distance
Authors:
Anushri Dixit,
Mohamadreza Ahmadi,
Joel W. Burdick
Abstract:
This paper studies the problem of distributionally robust model predictive control (MPC) using total variation distance ambiguity sets. For a discrete-time linear system with additive disturbances, we provide a conditional value-at-risk reformulation of the MPC optimization problem that is distributionally robust in the expected cost and chance constraints. The distributionally robust chance constraint is over-approximated as a simpler, tightened chance constraint that reduces the computational burden. Numerical experiments support our results on probabilistic guarantees and computational efficiency.
Submitted 24 June, 2022; v1 submitted 22 March, 2022;
originally announced March 2022.
-
Koopman NMPC: Koopman-based Learning and Nonlinear Model Predictive Control of Control-affine Systems
Authors:
Carl Folkestad,
Joel W. Burdick
Abstract:
Koopman-based learning methods can potentially be practical and powerful tools for dynamical robotic systems. However, common methods to construct Koopman representations seek to learn lifted linear models that cannot capture nonlinear actuation effects inherent in many robotic systems. This paper presents a learning and control methodology that is a first step towards overcoming this limitation. Using the Koopman canonical transform, control-affine dynamics can be expressed by a lifted bilinear model. The learned model is used for nonlinear model predictive control (NMPC) design where the bilinear structure can be exploited to improve computational efficiency. The benefits for control-affine dynamics compared to existing Koopman-based methods are highlighted through an example of a simulated planar quadrotor. Prediction error is greatly reduced and closed loop performance similar to NMPC with full model knowledge is achieved.
Submitted 17 May, 2021;
originally announced May 2021.
-
Risk-Averse Stochastic Shortest Path Planning
Authors:
Mohamadreza Ahmadi,
Anushri Dixit,
Joel W. Burdick,
Aaron D. Ames
Abstract:
We consider the stochastic shortest path planning problem in MDPs, i.e., the problem of designing policies that ensure reaching a goal state from a given initial state with minimum accrued cost. In order to account for rare but important realizations of the system, we consider a nested dynamic coherent risk total cost functional rather than the conventional risk-neutral total expected cost. Under some assumptions, we show that optimal, stationary, Markovian policies exist and can be found via a special Bellman's equation. We propose a computational technique based on difference convex programs (DCPs) to find the associated value functions and therefore the risk-averse policies. A rover navigation MDP is used to illustrate the proposed methodology with conditional-value-at-risk (CVaR) and entropic-value-at-risk (EVaR) coherent risk measures.
Submitted 26 March, 2021;
originally announced March 2021.
-
Limits of Probabilistic Safety Guarantees when Considering Human Uncertainty
Authors:
Richard Cheng,
Richard M. Murray,
Joel W. Burdick
Abstract:
When autonomous robots interact with humans, such as during autonomous driving, explicit safety guarantees are crucial in order to avoid potentially life-threatening accidents. Many data-driven methods have explored learning probabilistic bounds over human agents' trajectories (i.e. confidence tubes that contain trajectories with probability $δ$), which can then be used to guarantee safety with probability $1-δ$. However, almost all existing works consider $δ\geq 0.001$. The purpose of this paper is to argue that (1) in safety-critical applications, it is necessary to provide safety guarantees with $δ< 10^{-8}$, and (2) current learning-based methods are ill-equipped to compute accurate confidence bounds at such low $δ$. Using human driving data (from the highD dataset), as well as synthetically generated data, we show that current uncertainty models use inaccurate distributional assumptions to describe human behavior and/or require infeasible amounts of data to accurately learn confidence bounds for $δ\leq 10^{-8}$. These two issues result in unreliable confidence bounds, which can have dangerous implications if deployed on safety-critical systems.
Submitted 24 March, 2021; v1 submitted 4 March, 2021;
originally announced March 2021.
-
Risk-Sensitive Motion Planning using Entropic Value-at-Risk
Authors:
Anushri Dixit,
Mohamadreza Ahmadi,
Joel W. Burdick
Abstract:
We consider the problem of risk-sensitive motion planning in the presence of randomly moving obstacles. To this end, we adopt a model predictive control (MPC) scheme and pose the obstacle avoidance constraint in the MPC problem as a distributionally robust constraint with a KL divergence ambiguity set. This constraint is the dual representation of the Entropic Value-at-Risk (EVaR). Building upon this viewpoint, we propose an algorithm to follow waypoints and discuss its feasibility and completion in finite time. We compare the policies obtained using EVaR with those obtained using another common coherent risk measure, Conditional Value-at-Risk (CVaR), via numerical experiments for a 2D system. We also implement the waypoint following algorithm on a 3D quadcopter simulation.
Submitted 10 April, 2021; v1 submitted 23 November, 2020;
originally announced November 2020.
-
Safe Multi-Agent Interaction through Robust Control Barrier Functions with Learned Uncertainties
Authors:
Richard Cheng,
Mohammad Javad Khojasteh,
Aaron D. Ames,
Joel W. Burdick
Abstract:
Robots operating in real-world settings must navigate and maintain safety while interacting with many heterogeneous agents and obstacles. Multi-Agent Control Barrier Functions (CBF) have emerged as a computationally efficient tool to guarantee safety in multi-agent environments, but they assume perfect knowledge of both the robot dynamics and other agents' dynamics. While knowledge of the robot's dynamics might be reasonably well known, the heterogeneity of agents in real-world environments means there will always be considerable uncertainty in our prediction of other agents' dynamics. This work aims to learn high-confidence bounds for these dynamic uncertainties using Matrix-Variate Gaussian Process models, and incorporates them into a robust multi-agent CBF framework. We transform the resulting min-max robust CBF into a quadratic program, which can be efficiently solved in real time. We verify via simulation results that the nominal multi-agent CBF is often violated during agent interactions, whereas our robust formulation maintains safety with a much higher probability and adapts to learned uncertainties.
Submitted 22 September, 2020; v1 submitted 10 April, 2020;
originally announced April 2020.
-
Episodic Koopman Learning of Nonlinear Robot Dynamics with Application to Fast Multirotor Landing
Authors:
Carl Folkestad,
Daniel Pastor,
Joel W. Burdick
Abstract:
This paper presents a novel episodic method to learn a robot's nonlinear dynamics model and an increasingly optimal control sequence for a set of tasks. The method is based on the Koopman operator approach to nonlinear dynamical systems analysis, which models the flow of observables in a function space, rather than a flow in a state space. Practically, this method estimates a nonlinear diffeomorphism that lifts the dynamics to a higher dimensional space where they are linear. Efficient Model Predictive Control methods can then be applied to the lifted model. This approach allows for real time implementation in on-board hardware, with rigorous incorporation of both input and state constraints during learning. We demonstrate the method in a real-time implementation of fast multirotor landing, where the nonlinear ground effect is learned and used to improve landing speed and quality.
Submitted 3 April, 2020;
originally announced April 2020.
-
Barrier Functions for Multiagent-POMDPs with DTL Specifications
Authors:
Mohamadreza Ahmadi,
Andrew Singletary,
Joel W. Burdick,
Aaron D. Ames
Abstract:
Multi-agent partially observable Markov decision processes (MPOMDPs) provide a framework to represent heterogeneous autonomous agents subject to uncertainty and partial observation. In this paper, given a nominal policy provided by a human operator or a conventional planning method, we propose a technique based on barrier functions to design a minimally interfering safety-shield ensuring satisfaction of high-level specifications in terms of linear distribution temporal logic (LDTL). To this end, we use sufficient and necessary conditions for the invariance of a given set based on discrete-time barrier functions (DTBFs) and formulate sufficient conditions for finite time DTBF to study finite time convergence to a set. We then show that different LDTL mission/safety specifications can be cast as a set of invariance or finite time reachability problems. We demonstrate that the proposed method for safety-shield synthesis can be implemented online by a sequence of one-step greedy algorithms. We demonstrate the efficacy of the proposed method using experiments involving a team of robots.
Submitted 18 March, 2020;
originally announced March 2020.
-
Temporal Segmentation of Surgical Sub-tasks through Deep Learning with Multiple Data Sources
Authors:
Yidan Qin,
Sahba Aghajani Pedram,
Seyedshams Feyzabadi,
Max Allan,
A. Jonathan McLeod,
Joel W. Burdick,
Mahdi Azizian
Abstract:
Many tasks in robot-assisted surgeries (RAS) can be represented by finite-state machines (FSMs), where each state represents either an action (such as picking up a needle) or an observation (such as bleeding). A crucial step towards the automation of such surgical tasks is the temporal perception of the current surgical scene, which requires a real-time estimation of the states in the FSMs. The objective of this work is to estimate the current state of the surgical task based on the actions performed or events that have occurred as the task progresses. We propose Fusion-KVE, a unified surgical state estimation model that incorporates multiple data sources including the Kinematics, Vision, and system Events. Additionally, we examine the strengths and weaknesses of different state estimation models in segmenting states with different representative features or levels of granularity. We evaluate our model on the JHU-ISI Gesture and Skill Assessment Working Set (JIGSAWS), as well as a more complex dataset involving robotic intra-operative ultrasound (RIOUS) imaging, created using the da Vinci Xi surgical system. Our model achieves a superior frame-wise state estimation accuracy of up to 89.4%, improving on state-of-the-art surgical state estimation models on both the JIGSAWS suturing dataset and our RIOUS dataset.
Submitted 7 February, 2020;
originally announced February 2020.
-
Stochastic Finite State Control of POMDPs with LTL Specifications
Authors:
Mohamadreza Ahmadi,
Rangoli Sharan,
Joel W. Burdick
Abstract:
Partially observable Markov decision processes (POMDPs) provide a modeling framework for autonomous decision making under uncertainty and imperfect sensing, e.g., robot manipulation and self-driving cars. However, optimal control of POMDPs is notoriously intractable. This paper considers the quantitative problem of synthesizing sub-optimal stochastic finite state controllers (sFSCs) for POMDPs such that the probability of satisfying a set of high-level specifications in terms of linear temporal logic (LTL) formulae is maximized. We begin by casting the latter problem into an optimization and use relaxations based on the Poisson equation and McCormick envelopes. Then, we propose a stochastic bounded policy iteration algorithm, leading to a controlled growth in sFSC size and an anytime algorithm, where the performance of the controller improves with successive iterations, but can be stopped by the user based on time or memory considerations. We illustrate the proposed method by a robot navigation case study.
Submitted 21 January, 2020;
originally announced January 2020.
-
Control Regularization for Reduced Variance Reinforcement Learning
Authors:
Richard Cheng,
Abhinav Verma,
Gabor Orosz,
Swarat Chaudhuri,
Yisong Yue,
Joel W. Burdick
Abstract:
Dealing with high variance is a significant challenge in model-free reinforcement learning (RL). Existing methods are unreliable, exhibiting high variance in performance from run to run using different initializations/seeds. Focusing on problems arising in continuous control, we propose a functional regularization approach to augmenting model-free RL. In particular, we regularize the behavior of the deep policy to be similar to a policy prior, i.e., we regularize in function space. We show that functional regularization yields a bias-variance trade-off, and propose an adaptive tuning strategy to optimize this trade-off. When the policy prior has control-theoretic stability guarantees, we further show that this regularization approximately preserves those stability guarantees throughout learning. We validate our approach empirically on a range of settings, and demonstrate significantly reduced variance, guaranteed dynamic stability, and more efficient learning than deep RL alone.
Submitted 13 May, 2019;
originally announced May 2019.
-
End-to-End Safe Reinforcement Learning through Barrier Functions for Safety-Critical Continuous Control Tasks
Authors:
Richard Cheng,
Gabor Orosz,
Richard M. Murray,
Joel W. Burdick
Abstract:
Reinforcement Learning (RL) algorithms have found limited success beyond simulated applications, and one main reason is the absence of safety guarantees during the learning process. Real-world systems would realistically fail or break before an optimal controller can be learned. To address this issue, we propose a controller architecture that combines (1) a model-free RL-based controller with (2) model-based controllers utilizing control barrier functions (CBFs) and (3) on-line learning of the unknown system dynamics, in order to ensure safety during learning. Our general framework leverages the success of RL algorithms to learn high-performance controllers, while the CBF-based controllers both guarantee safety and guide the learning process by constraining the set of explorable policies. We utilize Gaussian Processes (GPs) to model the system dynamics and its uncertainties.
Our novel controller synthesis algorithm, RL-CBF, guarantees safety with high probability during the learning process, regardless of the RL algorithm used, and demonstrates greater policy exploration efficiency. We test our algorithm on (1) control of an inverted pendulum and (2) autonomous car-following with wireless vehicle-to-vehicle communication, and show that our algorithm attains much greater sample efficiency in learning than other state-of-the-art algorithms and maintains safety during the entire learning process.
Submitted 20 March, 2019;
originally announced March 2019.
-
Convex Model Predictive Control for Vehicular Systems
Authors:
Tiffany A. Huang,
Matanya B. Horowitz,
Joel W. Burdick
Abstract:
In this work, we present a method to perform Model Predictive Control (MPC) over systems whose state is an element of $SO(n)$ for $n=2,3$. This is done without charts or any local linearization, and instead is performed by operating over the orbitope of rotation matrices. This results in a novel MPC scheme without the drawbacks associated with conventional linearization techniques. Instead, second-order cone or semidefinite constraints on state variables are the only requirement beyond those of a QP scheme typical for MPC of linear systems. Of particular emphasis is the application to aeronautical and vehicular systems, wherein the method removes many of the transcendental trigonometric terms associated with these systems' state space equations. Furthermore, the method is shown to be compatible with many existing variants of MPC, including obstacle avoidance via Mixed Integer Linear Programming (MILP).
Submitted 10 October, 2014;
originally announced October 2014.
-
Linear Hamilton Jacobi Bellman Equations in High Dimensions
Authors:
Matanya B. Horowitz,
Anil Damle,
Joel W. Burdick
Abstract:
The Hamilton Jacobi Bellman Equation (HJB) provides the globally optimal solution to large classes of control problems. Unfortunately, this generality comes at a price: the calculation of such solutions is typically intractable for systems with more than moderate state space size due to the curse of dimensionality. This work combines recent results in the structure of the HJB, and its reduction to a linear Partial Differential Equation (PDE), with methods based on low rank tensor representations, known as separated representations, to address the curse of dimensionality. The result is an algorithm to solve optimal control problems which scales linearly with the number of states in a system, and is applicable to systems that are nonlinear with stochastic forcing in finite-horizon, average cost, and first-exit settings. The method is demonstrated on inverted pendulum, VTOL aircraft, and quadcopter models, with system dimension two, six, and twelve respectively.
Submitted 21 September, 2014; v1 submitted 3 April, 2014;
originally announced April 2014.