HTML conversions sometimes display errors due to content that did not convert correctly from the source. This paper uses the following packages that are not yet supported by the HTML conversion tool. Feedback on these issues are not necessary; they are known and are being worked on.

  • failed: pbox

Authors: achieve the best HTML results from your LaTeX submissions by following these best practices.

License: CC BY 4.0
arXiv:2402.01617v1 [cs.RO] 02 Feb 2024

A GP-based Robust Motion Planning Framework for Agile Autonomous Robot Navigation and Recovery in Unknown Environments

Nicholas Mohammad, Jacob Higgins, Nicola Bezzo Nicholas Mohammad, Jacob Higgins, and Nicola Bezzo are with the Department of Electrical and Computer Engineering, University of Virginia, Charlottesville, VA 22903, USA {nm9ur, jdh4je, nbezzo}@virginia.edu
Abstract

For autonomous mobile robots, uncertainties in the environment and system model can lead to failure in the motion planning pipeline, resulting in potential collisions. In order to achieve a high level of robust autonomy, these robots should be able to proactively predict and recover from such failures. To this end, we propose a Gaussian Process (GP) based model for proactively detecting the risk of future motion planning failure. When this risk exceeds a certain threshold, a recovery behavior is triggered that leverages the same GP model to find a safe state from which the robot may continue towards the goal. The proposed approach is trained in simulation only and can generalize to real world environments on different robotic platforms. Simulations and physical experiments demonstrate that our framework is capable of both predicting planner failures and recovering the robot to states where planner success is likely, all while producing agile motion.

Note: Videos of the simulations and experiments are provided in the supplementary material and at https://www.bezzorobotics.com/nm-icra24.

I Introduction

Robust motion planning for autonomous mobile robots (AMR) remains an open problem for the robotics community. One of the main challenges is to navigate through environments in the presence of uncertainty, like an unknown map a priori or inaccurate system models. For example, this lack of robustness was clearly evidenced at the ICRA BARN challenge [1, 2], in which no team was able to navigate a robot through an unknown, cluttered environment without any collisions111our team placed second in this competition. Within the navigation stack, the cause of such runtime failures and possible collisions is typically attributed to the motion planning pipeline.

To prevent such situations, reactive approaches have been developed that detect potentially risky states as they occur [3]. These reactive approaches, however, suffer from poor performance because they are often tuned to be conservative and overly cautious, since it is better to actively avoid unsafe states before they occur. They also do not perform well for high-inertial systems which need an appropriately large reaction time in order to avoid collision. Alternatively, proactive approaches classify future robot states as safe or unsafe based on the current sensor readings and motion plan [4, 5]. These approaches often rely on complex deep learning models which require exhaustive real world training data to detect the safety of future states. Furthermore, these proactive approaches do not solve the problem of recovery after detecting such risky states in the planning horizon.

Consider the case in Fig. 1, where the robot is tasked with navigating quickly through an unknown, occluded environment. Without any proactive scheme, the robot suddenly recognizes the dead end, and does not have enough time to stop before collision. If these motion planning failures are proactively predicted for future states, then the robot can stop before crashing (𝒙Bsubscript𝒙𝐵\bm{x}_{B}bold_italic_x start_POSTSUBSCRIPT italic_B end_POSTSUBSCRIPT in Fig. 1), maneuver to a safe recovery point 𝒙rsubscript𝒙𝑟\bm{x}_{r}bold_italic_x start_POSTSUBSCRIPT italic_r end_POSTSUBSCRIPT, then return to nominal planning towards the final goal. This exact test case will be discussed in Sec. VI, along with the experimental results.

Refer to caption
Refer to caption
Figure 1: Example in which an unexpected dead end could cause a motion planning failure. The proposed approach predicts the risk of such a failure and recovers before it can occur.

To achieve this behavior, in this work we propose a proactive- and recovery-focused approach that seeks to predict the risk of failure for a receding horizon, safe corridor motion planner, as well as recover from these potential failures. Such a planner is chosen due to its effectiveness at navigating unknown environments as well as its ubiquity within the robotics community. Additionally, we have found that such a planner requires a relatively small number of features to correctly classify potential failures. A Gaussian Process (GP) is trained on simulated data to predict failures along the planned receding horizon trajectory. When the predicted risk meets a certain threshold criterion, the robot is stopped and a recovery behavior is engaged. This process leverages the same GP to find a nearby safe state from which the robot can safely negotiate its immediate environment and continue motion towards its ultimate goal.

The contribution of this work is a complete and robust motion planning pipeline for robot navigation in unknown environments with two main innovations: 1) a proactive planner failure detection scheme in which a model agnostic, proactive GP-based approach detects and predicts future planning failures and their risk within a horizon, without the need to retrain between simulation and the real world and 2) a robust recovery scheme in which a GP-based, sampling-based recovery method drives the robot to a safe recovery point in order to continue with nominal planning.

II Related Work

While motion planning is an active field of research within the robotics community, the problem of robust, agile navigation through cluttered, unknown environments remains unsolved [1]. Many state-of-the-art motion planners impose hard constraints within a nonlinear optimization problem and use numerical solvers to generate the final trajectories within safe corridors [6, 7, 8]. However, random disturbances and occluded obstacles may cause constraint violations at runtime, leading to an inability to generate updated trajectories. [9] considers the potential for planner failure by generating an additional safe trajectory which stops within known free space at each planning iteration. However, they do not provide any recovery behaviors in case the vehicle is unable to find feasible trajectories at the stop** point. A popular alternative to the hard-constrained methods are soft-constrained planners, where the hard constraints are converted into differentiable terms and put into the cost function of an unconstrained nonlinear optimization problem [10, 11]. While the soft-constrained methods generate trajectories even when constraints aren’t satisfied, conflicting terms within the cost function can lead to low quality solutions, i.e, unsafe or untrackable trajectories [12]. In this paper, we work with the hard-constrained motion planner paradigm and develop an algorithm to monitor for and recover from possible failures proactively at runtime.

Safety monitoring during runtime motion planning is a problem with a catalogue of potential solutions. One well studied technique is Hamilton-Jacobi-Isaacs (HJI) reachability analysis, where safe control is transformed into a formal verification method with theoretical safety guarantees. However, HJI reachability requires an accurate model of the system and suffers from the curse of dimensionality [13]. In order to overcome this problem, recent works have used machine learning techniques to approximate and learn from the generated reachable sets. [14] leverages Adaptive Kriging using a surrogate GP model and Monte Carlo sampling to approximate the sets at runtime. [15] uses a neural network trained on ground truth reachable sets to output binary safe/unsafe classifications for planned trajectories. While these works get around the intractability of runtime reachability analysis, they still rely on specific, accurate system models, limiting their generalizability.

Machine learning methods are also used to monitor vehicle safety outside of the reachability context, stop** the robot when anomalous states are detected. [4] and [5] proactively predict anomalous states which lead to collisions and stop the vehicle before reaching them. However, these works either implement trivial backup and rotate recovery behaviors with no consideration for planning success, or rely on humans to perform the recovery for them.

Our approach leverages machine learning techniques to monitor vehicle safety through planner failure detection. Specifically, we train on the distribution of failures over hand-selected input features which enable our approach to be model agnostic and require only training data from simulation. To the best of our knowledge, our work is the first to utilize a learning component to both proactively predict future planning failures and recover after prediction.

III Problem Formulation

Given a mobile robot system tasked to navigate an unknown environment, let 𝒙˙=g(𝒙,𝒖)˙𝒙𝑔𝒙𝒖\dot{\bm{x}}=g(\bm{x},\bm{u})over˙ start_ARG bold_italic_x end_ARG = italic_g ( bold_italic_x , bold_italic_u ) define the equations of motion for the system with state 𝒙nx𝒙superscriptsubscript𝑛𝑥\bm{x}\in\mathbb{R}^{n_{x}}bold_italic_x ∈ blackboard_R start_POSTSUPERSCRIPT italic_n start_POSTSUBSCRIPT italic_x end_POSTSUBSCRIPT end_POSTSUPERSCRIPT and control inputs 𝒖nu𝒖superscriptsubscript𝑛𝑢\bm{u}\in\mathbb{R}^{n_{u}}bold_italic_u ∈ blackboard_R start_POSTSUPERSCRIPT italic_n start_POSTSUBSCRIPT italic_u end_POSTSUBSCRIPT end_POSTSUPERSCRIPT. These controls are produced by a low-level controller that is tracking a time-based trajectory 𝝉(t;t0)nx𝝉𝑡subscript𝑡0superscriptsubscript𝑛𝑥\bm{\tau}(t;t_{0})\in\mathbb{R}^{n_{x}}bold_italic_τ ( italic_t ; italic_t start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ) ∈ blackboard_R start_POSTSUPERSCRIPT italic_n start_POSTSUBSCRIPT italic_x end_POSTSUBSCRIPT end_POSTSUPERSCRIPT generated at time t0subscript𝑡0t_{0}italic_t start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT. The purpose of this trajectory is to provide a high-level path plan over a future horizon t[t0,t0+TH]𝑡subscript𝑡0subscript𝑡0subscript𝑇𝐻t\in[t_{0},t_{0}+T_{H}]italic_t ∈ [ italic_t start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT , italic_t start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT + italic_T start_POSTSUBSCRIPT italic_H end_POSTSUBSCRIPT ] from the current state of the robot 𝒙(t0)𝒙subscript𝑡0\bm{x}(t_{0})bold_italic_x ( italic_t start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ) towards a goal 𝒙gsubscript𝒙𝑔\bm{x}_{g}bold_italic_x start_POSTSUBSCRIPT italic_g end_POSTSUBSCRIPT while avoiding the state subset 𝒳O(t0)subscript𝒳𝑂subscript𝑡0\mathcal{X}_{O}(t_{0})caligraphic_X start_POSTSUBSCRIPT italic_O end_POSTSUBSCRIPT ( italic_t start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ) occupied by obstacles currently known to the robot. While tracking this trajectory, information about obstacles in the environment are updated at runtime so that, in general, 𝒳O(t)𝒳O(t0)subscript𝒳𝑂𝑡subscript𝒳𝑂subscript𝑡0\mathcal{X}_{O}(t)\neq\mathcal{X}_{O}(t_{0})caligraphic_X start_POSTSUBSCRIPT italic_O end_POSTSUBSCRIPT ( italic_t ) ≠ caligraphic_X start_POSTSUBSCRIPT italic_O end_POSTSUBSCRIPT ( italic_t start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ). This means the trajectory 𝝉(t;t0)𝝉𝑡subscript𝑡0\bm{\tau}(t;t_{0})bold_italic_τ ( italic_t ; italic_t start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ) has the potential to collide with these newly discovered obstacles; if this is the case, then a new trajectory must be re-planned. Practical path planners, however, suffer from planning failures within certain situations due to infeasible constraints for the current planning iteration. While a single path-planning failure may not be fatal, several failures within the planning horizon could lead to unsafe situations for the robot.

Problem 1: Proactive Planner Failure Detection: Let {𝒙^i}subscript^𝒙𝑖\left\{\hat{\bm{x}}_{i}\right\}{ over^ start_ARG bold_italic_x end_ARG start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT } be a set of predicted future states for the robot while tracking 𝝉𝝉\bm{\tau}bold_italic_τ over some horizon TFTHsubscript𝑇𝐹subscript𝑇𝐻T_{F}\leq T_{H}italic_T start_POSTSUBSCRIPT italic_F end_POSTSUBSCRIPT ≤ italic_T start_POSTSUBSCRIPT italic_H end_POSTSUBSCRIPT. For a given motion planning policy ΠΠ\Piroman_Π, define the random variable Z𝝉𝕎subscript𝑍𝝉𝕎Z_{\bm{\tau}}\in\mathbb{W}italic_Z start_POSTSUBSCRIPT bold_italic_τ end_POSTSUBSCRIPT ∈ blackboard_W as the number of motion planning failures that occurs from t0subscript𝑡0t_{0}italic_t start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT to t0+TFsubscript𝑡0subscript𝑇𝐹t_{0}+T_{F}italic_t start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT + italic_T start_POSTSUBSCRIPT italic_F end_POSTSUBSCRIPT while the robot tracks 𝝉𝝉\bm{\tau}bold_italic_τ, with P(Z𝝉)𝑃subscript𝑍𝝉P(Z_{\bm{\tau}})italic_P ( italic_Z start_POSTSUBSCRIPT bold_italic_τ end_POSTSUBSCRIPT ) denoting their probabilities. We seek the creation of a risk metric ρ𝝉subscript𝜌𝝉\rho_{\bm{\tau}}\in\mathbb{R}italic_ρ start_POSTSUBSCRIPT bold_italic_τ end_POSTSUBSCRIPT ∈ blackboard_R that maps from Z𝝉subscript𝑍𝝉Z_{\bm{\tau}}italic_Z start_POSTSUBSCRIPT bold_italic_τ end_POSTSUBSCRIPT to a single real number that characterizes the risk of path planner failure over TFsubscript𝑇𝐹T_{F}italic_T start_POSTSUBSCRIPT italic_F end_POSTSUBSCRIPT.

Problem 2: Recovery After Failure Detection: We seek a recovery strategy Πr(𝒙0)subscriptΠ𝑟subscript𝒙0\Pi_{r}(\bm{x}_{0})roman_Π start_POSTSUBSCRIPT italic_r end_POSTSUBSCRIPT ( bold_italic_x start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ) that, when the risk ρ𝝉subscript𝜌𝝉\rho_{\bm{\tau}}italic_ρ start_POSTSUBSCRIPT bold_italic_τ end_POSTSUBSCRIPT exceeds a threshold ψρsubscript𝜓𝜌\psi_{\rho}italic_ψ start_POSTSUBSCRIPT italic_ρ end_POSTSUBSCRIPT, stops the robot and performs a recovery behavior to reduce the risk of planner failure back down to an acceptable level. Specifically, define Z𝒙{0,1}subscript𝑍𝒙01Z_{\bm{x}}\in\{0,1\}italic_Z start_POSTSUBSCRIPT bold_italic_x end_POSTSUBSCRIPT ∈ { 0 , 1 } to represent the success (00) or failure (1111) of ΠΠ\Piroman_Π from state 𝒙𝒙\bm{x}bold_italic_x. The objective of the recovery policy ΠrsubscriptΠ𝑟\Pi_{r}roman_Π start_POSTSUBSCRIPT italic_r end_POSTSUBSCRIPT is to locate and control the vehicle to a nearby state 𝒙rsubscript𝒙𝑟\bm{x}_{r}bold_italic_x start_POSTSUBSCRIPT italic_r end_POSTSUBSCRIPT which maximizes the expected success of ΠΠ\Piroman_Π:

𝒙r=argmin𝒙𝒳O𝔼[Z𝒙].subscript𝒙𝑟subscriptargmin𝒙subscript𝒳𝑂𝔼delimited-[]subscript𝑍𝒙\bm{x}_{r}=\operatorname*{arg\,min}_{\bm{x}\not\in\mathcal{X}_{O}}\mathbb{E}% \left[Z_{\bm{x}}\right].bold_italic_x start_POSTSUBSCRIPT italic_r end_POSTSUBSCRIPT = start_OPERATOR roman_arg roman_min end_OPERATOR start_POSTSUBSCRIPT bold_italic_x ∉ caligraphic_X start_POSTSUBSCRIPT italic_O end_POSTSUBSCRIPT end_POSTSUBSCRIPT blackboard_E [ italic_Z start_POSTSUBSCRIPT bold_italic_x end_POSTSUBSCRIPT ] . (1)

In the following section, we discuss in detail the design of ΠΠ\Piroman_Π and ΠrsubscriptΠ𝑟\Pi_{r}roman_Π start_POSTSUBSCRIPT italic_r end_POSTSUBSCRIPT, and demonstrate that proactively detecting planner failure and recovering after detection can be achieved by leveraging the same data-informed model.

IV Approach

Refer to caption
Figure 2: Block diagram for the proposed approach.
Refer to caption
(a) front-end
Refer to caption
(b) back-end
Refer to caption
(c) failure detection
Refer to caption
(d) recovery
Figure 3: An example case study investigated in this work visualizing the complete navigation pipeline (a,b), in which the receding horizon, safe corridor motion planner fails (c), prompting the recovery pipeline to take over and recover the system (d). For 𝝉(t)𝝉𝑡\bm{\tau}(t)bold_italic_τ ( italic_t ), brighter colors denote higher speeds.

We propose a GP regression-based scheme to assess the risk of future motion planning failure while tracking a trajectory 𝝉(t)𝝉𝑡\bm{\tau}(t)bold_italic_τ ( italic_t ). Data were collected from simulations that record motion planning successes and failures in various states that the robot may encounter during typical operation. This data were used to train a GP regression model to predict the probability of motion planning failure for individual states over a future horizon. Fig. 2 shows the outline of our approach. The front-end of the motion planner policy ΠΠ\Piroman_Π generates a corridor 𝒞𝒞\mathcal{C}caligraphic_C of convex polytopes, illustrated in Fig. 3(a). The corridor is then sent to the back-end for final trajectory generation 𝝉(t)𝝉𝑡\bm{\tau}(t)bold_italic_τ ( italic_t ) (see Fig. 3(b)). A Model Predictive Controller (MPC) is then used to generate the control signal to track 𝝉(t)𝝉𝑡\bm{\tau}(t)bold_italic_τ ( italic_t ), generating a sequence of future robot states {𝒙^i}subscript^𝒙𝑖\{\hat{\bm{x}}_{i}\}{ over^ start_ARG bold_italic_x end_ARG start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT } over horizon TFsubscript𝑇𝐹T_{F}italic_T start_POSTSUBSCRIPT italic_F end_POSTSUBSCRIPT. These states, along with the corridor 𝒞𝒞\mathcal{C}caligraphic_C, are used to predict the risk of motion planning failure ρ𝝉subscript𝜌𝝉\rho_{\bm{\tau}}italic_ρ start_POSTSUBSCRIPT bold_italic_τ end_POSTSUBSCRIPT. Consider the case shown in Fig. 3(c), where the vehicle is driving towards a previously occluded dead-end. If the predicted risk from our GP-based failure detection model exceeds a user-defined threshold ψρsubscript𝜓𝜌\psi_{\rho}italic_ψ start_POSTSUBSCRIPT italic_ρ end_POSTSUBSCRIPT, then the recovery behavior is triggered, and a recovery goal 𝒙rsubscript𝒙𝑟\bm{x}_{r}bold_italic_x start_POSTSUBSCRIPT italic_r end_POSTSUBSCRIPT is sent to our go-to-goal (GTG) MPC to bring the vehicle to a state where solver success is likely (Fig. 3(d)). In the following sections, we describe in detail our motion planner failure prediction and recovery framework, starting with a brief background of the base motion planner.

IV-A Motion Planner Preliminaries

1) Planner Front-End. The front-end starts with the global occupancy map \mathcal{M}caligraphic_M, which is generated by fusing data from an onboard depth sensor, along with the current state of the vehicle, 𝒙(t0)𝒙subscript𝑡0\bm{x}(t_{0})bold_italic_x ( italic_t start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ), and the goal state 𝒙gsubscript𝒙𝑔\bm{x}_{g}bold_italic_x start_POSTSUBSCRIPT italic_g end_POSTSUBSCRIPT. As shown in Fig. 3(a), an initial 0-order path within the free and unknown space of \mathcal{M}caligraphic_M is generated by using a graph-based, global planner. In this work, we use the Jump Point Search (JPS) algorithm [16], due to the reduced computational complexity when compared to other common algorithms like A*{}^{*}start_FLOATSUPERSCRIPT * end_FLOATSUPERSCRIPT .

A corridor 𝒞𝒞\mathcal{C}caligraphic_C of intersecting convex polytopes is then established along this generated initial path, in order to connect 𝒙(t0)𝒙subscript𝑡0\bm{x}(t_{0})bold_italic_x ( italic_t start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ) to 𝒙gsubscript𝒙𝑔\bm{x}_{g}bold_italic_x start_POSTSUBSCRIPT italic_g end_POSTSUBSCRIPT. Each Ci𝒞subscript𝐶𝑖𝒞C_{i}\in\mathcal{C}italic_C start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ∈ caligraphic_C is represented as an H-polytope defined by a matrix 𝑨isubscript𝑨𝑖\bm{A}_{i}bold_italic_A start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT and vector 𝒃isubscript𝒃𝑖\bm{b}_{i}bold_italic_b start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT that define a convex set of points 𝒑2𝒑superscript2\bm{p}\in\mathbb{R}^{2}bold_italic_p ∈ blackboard_R start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT in the xy𝑥𝑦xyitalic_x italic_y plane

Ci={𝒑2|𝑨𝒊𝒑𝒃𝒊}.subscript𝐶𝑖conditional-set𝒑superscript2subscript𝑨𝒊𝒑subscript𝒃𝒊C_{i}=\{\bm{p}\in\mathbb{R}^{2}\,|\,\bm{A_{i}}\bm{p}\leq\bm{b_{i}}\}.italic_C start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT = { bold_italic_p ∈ blackboard_R start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT | bold_italic_A start_POSTSUBSCRIPT bold_italic_i end_POSTSUBSCRIPT bold_italic_p ≤ bold_italic_b start_POSTSUBSCRIPT bold_italic_i end_POSTSUBSCRIPT } . (2)

In order to generate each Cisubscript𝐶𝑖C_{i}italic_C start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT of the corridor 𝒞𝒞\mathcal{C}caligraphic_C, we rely on the gradient-based optimization approach in [10]. With 𝒞𝒞\mathcal{C}caligraphic_C constructed, the corridor is sent along with 𝒙(t0)𝒙subscript𝑡0\bm{x}(t_{0})bold_italic_x ( italic_t start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ) and 𝒙gsubscript𝒙𝑔\bm{x}_{g}bold_italic_x start_POSTSUBSCRIPT italic_g end_POSTSUBSCRIPT to the back-end optimization to find the final trajectory 𝝉(t)𝝉𝑡\bm{\tau}(t)bold_italic_τ ( italic_t ).

2) Planner Back-End. We represent the trajectory 𝝉(t)𝝉𝑡\bm{\tau}(t)bold_italic_τ ( italic_t ) (shown in Fig. 3(b)) as a collection of N𝑁Nitalic_N cubic (n=3𝑛3n=3italic_n = 3) Bézier curves. We use these curves for the trajectory formulation as they are a commonly utilized basis with several salient properties for corridor-based motion planners [7]. One useful property of the Bézier curve 𝝉j(t)subscript𝝉𝑗𝑡\bm{\tau}_{j}(t)bold_italic_τ start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ( italic_t ) is that it is fully contained within the simplex formed by the control points 𝒒ji,i[0,n]subscriptsuperscript𝒒𝑖𝑗𝑖0𝑛\bm{q}^{i}_{j},\,i\in[0,n]bold_italic_q start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT , italic_i ∈ [ 0 , italic_n ]. Thus, for 𝝉j(t)subscript𝝉𝑗𝑡\bm{\tau}_{j}(t)bold_italic_τ start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ( italic_t ) to be contained within a convex polytope C𝐶Citalic_C, it is sufficient to ensure that 𝒒jiC,i[0,n]formulae-sequencesuperscriptsubscript𝒒𝑗𝑖𝐶for-all𝑖0𝑛\bm{q}_{j}^{i}\in C,\,\forall i\in[0,n]bold_italic_q start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT ∈ italic_C , ∀ italic_i ∈ [ 0 , italic_n ]. To generate the final trajectory, we leverage the FASTER solver [9], altered to convert the Bézier control points of each trajectory segment 𝝉j(t)subscript𝝉𝑗𝑡\bm{\tau}_{j}(t)bold_italic_τ start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ( italic_t ) into the MINVO basis [17] during optimization to improve solver success rate. Once 𝝉(t)𝝉𝑡\bm{\tau}(t)bold_italic_τ ( italic_t ) has been found, it is sent to the tracking MPC to be executed on the robot.

IV-B Failure Modes: Front-End vs Back-End

There are two distinct failure modes of the motion planner described in Sec. IV-A, both of which will result in Π=Π\Pi=\emptysetroman_Π = ∅: (i) a front-end failure, in which an intersecting corridor 𝒞𝒞\mathcal{C}caligraphic_C between 𝒙(t0)𝒙subscript𝑡0\bm{x}(t_{0})bold_italic_x ( italic_t start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ) and 𝒙gsubscript𝒙𝑔\bm{x}_{g}bold_italic_x start_POSTSUBSCRIPT italic_g end_POSTSUBSCRIPT cannot be found, or (ii) a back-end failure, in which the numerical solver fails to generate a trajectory along the JPS search path. Front-end failures can occur when a feasible search path doesn’t exist (e.g., either 𝒙(t0)𝒙subscript𝑡0\bm{x}(t_{0})bold_italic_x ( italic_t start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ) or 𝒙gsubscript𝒙𝑔\bm{x}_{g}bold_italic_x start_POSTSUBSCRIPT italic_g end_POSTSUBSCRIPT overlap occupied space within \mathcal{M}caligraphic_M), or when parameters of the JPS are poorly conditioned for generating a corridor 𝒞𝒞\mathcal{C}caligraphic_C (e.g., |𝒞|𝒞|\mathcal{C}|| caligraphic_C | is high because the planning horizon distance is too large). The front-end of the motion planner implemented in Sec. IV-A typically runs in <1absent1<1< 1ms, thus for a given state 𝒙𝒙\bm{x}bold_italic_x and map \mathcal{M}caligraphic_M, front-end failures are easily determined by simply running the JPS and corridor generation at that state.

Much more difficult to predict, however, are failures at the back-end of the motion planner due to the fact that the environment is unknown a priori and the optimization is based only on current observations in \mathcal{M}caligraphic_M. Since the back-end is based on a nonlinear optimizer, it can be difficult to characterize success or failure prior to actually running the back-end solver. Additionally, the time to run the back-end is typically >100absent100>100> 100ms, which is too large to directly test multiple future points for failure. Fig. 3(c) shows an example back-end failure, in which the discovery of a previously unknown wall (shown as undiscovered space in Fig. 3(b)) requires a new avoidant trajectory to be generated. While the front-end is able to generate a corridor 𝒞𝒞\mathcal{C}caligraphic_C, the back-end is unable to find a feasible trajectory.

To concretely define these ideas, let Z𝒙,𝒞{0,1}subscript𝑍𝒙𝒞01Z_{\bm{x},\mathcal{C}}\in\left\{0,1\right\}italic_Z start_POSTSUBSCRIPT bold_italic_x , caligraphic_C end_POSTSUBSCRIPT ∈ { 0 , 1 } represent a success (00) or failure (1111) of the motion planner pipeline, with Zf{0,1}superscript𝑍𝑓01Z^{f}\in\left\{0,1\right\}italic_Z start_POSTSUPERSCRIPT italic_f end_POSTSUPERSCRIPT ∈ { 0 , 1 } representing a front-end failure and Zb{0,1}superscript𝑍𝑏01Z^{b}\in\left\{0,1\right\}italic_Z start_POSTSUPERSCRIPT italic_b end_POSTSUPERSCRIPT ∈ { 0 , 1 } representing a back-end failure. Success of the back-end is dependent on success at the front-end, and failure of the front-end is interpreted as a failure of the back-end as well, so that P(Zb=1|Zf=1)=1𝑃superscript𝑍𝑏conditional1superscript𝑍𝑓11P\left(Z^{b}=1|Z^{f}=1\right)=1italic_P ( italic_Z start_POSTSUPERSCRIPT italic_b end_POSTSUPERSCRIPT = 1 | italic_Z start_POSTSUPERSCRIPT italic_f end_POSTSUPERSCRIPT = 1 ) = 1. The probability of entire pipeline failure can be written as

P(Z𝒙,𝒞)=P(Zb|Zf)P(Zf).𝑃subscript𝑍𝒙𝒞𝑃conditionalsuperscript𝑍𝑏superscript𝑍𝑓𝑃superscript𝑍𝑓P(Z_{\bm{x},\mathcal{C}})=P\left(Z^{b}|Z^{f}\right)P\left(Z^{f}\right).italic_P ( italic_Z start_POSTSUBSCRIPT bold_italic_x , caligraphic_C end_POSTSUBSCRIPT ) = italic_P ( italic_Z start_POSTSUPERSCRIPT italic_b end_POSTSUPERSCRIPT | italic_Z start_POSTSUPERSCRIPT italic_f end_POSTSUPERSCRIPT ) italic_P ( italic_Z start_POSTSUPERSCRIPT italic_f end_POSTSUPERSCRIPT ) . (3)

The probability of front-end failure is easily and rapidly checked by running the JPS for a given 𝒙𝒙\bm{x}bold_italic_x and 𝒞𝒞\mathcal{C}caligraphic_C, so that effectively P(Zf){0,1}𝑃superscript𝑍𝑓01P(Z^{f})\in\left\{0,1\right\}italic_P ( italic_Z start_POSTSUPERSCRIPT italic_f end_POSTSUPERSCRIPT ) ∈ { 0 , 1 }. Our contribution is in estimating the probability of back-end failure after a front-end success, P(Zb|Zf=0)𝑃conditionalsuperscript𝑍𝑏superscript𝑍𝑓0P(Z^{b}|Z^{f}=0)italic_P ( italic_Z start_POSTSUPERSCRIPT italic_b end_POSTSUPERSCRIPT | italic_Z start_POSTSUPERSCRIPT italic_f end_POSTSUPERSCRIPT = 0 ). For simplicity in notation, in the rest of the paper we will write this probability as P(Zb)𝑃superscript𝑍𝑏P(Z^{b})italic_P ( italic_Z start_POSTSUPERSCRIPT italic_b end_POSTSUPERSCRIPT ) and drop the dependence on the front-end outcome.

IV-C Gaussian Process for Predicting Back-End Failure

To accurately predict back-end failures, we propose a GP-based regression model trained on statistics inferred from simulated data. We choose GPs due to their non-parametric form and ability to accurately infer from a small dataset. These data relate the robot and map state to the probability of back-end failure P(Z𝒙,𝒞b)𝑃subscriptsuperscript𝑍𝑏𝒙𝒞P(Z^{b}_{\bm{x},\mathcal{C}})italic_P ( italic_Z start_POSTSUPERSCRIPT italic_b end_POSTSUPERSCRIPT start_POSTSUBSCRIPT bold_italic_x , caligraphic_C end_POSTSUBSCRIPT ). A GP model P^(Z𝒙,𝒞b|)^𝑃conditionalsubscriptsuperscript𝑍𝑏𝒙𝒞\hat{P}(Z^{b}_{\bm{x},\mathcal{C}}|\cdot)over^ start_ARG italic_P end_ARG ( italic_Z start_POSTSUPERSCRIPT italic_b end_POSTSUPERSCRIPT start_POSTSUBSCRIPT bold_italic_x , caligraphic_C end_POSTSUBSCRIPT | ⋅ ) can be trained to predict back-end failure probability at run time over future states {𝒙^i}subscript^𝒙𝑖\{\hat{\bm{x}}_{i}\}{ over^ start_ARG bold_italic_x end_ARG start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT }. These probabilities can then be used to assess the risk of future motion planning failure ρ𝝉subscript𝜌𝝉\rho_{\bm{\tau}}italic_ρ start_POSTSUBSCRIPT bold_italic_τ end_POSTSUBSCRIPT over the entire prediction horizon.

A navigation stack comprising of both the planning policy ΠΠ\Piroman_Π and the MPC can be deployed in simulation to gather training examples for the GP model. To generate the training dataset, 𝑫𝑫\bm{D}bold_italic_D, we use the Poisson random forest dataset from [18], which contains 10 forest worlds, each with a collection of 90 start and goal positions for navigation. A Clearpath Jackal UGV was then tasked to navigate through the worlds in each of the start and goal configurations, collecting back-end success and failure data points at each planning iteration. With these data collected, features which correlate with back-end failure can be found. To promote generality, the chosen features should only depend on the corridor set 𝒞𝒞\mathcal{C}caligraphic_C, regardless of the sensing modality used to generate it (LiDAR, RGBd, etc.), along with the robot position and its time derivatives, which are common state features for most AMR.

1) Feature Selection. Each training tuple contains three pieces of information: (i) robot state 𝒙𝒙\bm{x}bold_italic_x, (ii) corridor 𝒞𝒞\mathcal{C}caligraphic_C, and (iii) binary variable Z𝒙,𝒞bsubscriptsuperscript𝑍𝑏𝒙𝒞Z^{b}_{\bm{x},\mathcal{C}}italic_Z start_POSTSUPERSCRIPT italic_b end_POSTSUPERSCRIPT start_POSTSUBSCRIPT bold_italic_x , caligraphic_C end_POSTSUBSCRIPT which encodes a success or failure of the back-end. With these data, statistical inferences can be made that relate the robot state and corridor to the probability of back-end failure P(Z𝒙,𝒞b)𝑃subscriptsuperscript𝑍𝑏𝒙𝒞P(Z^{b}_{\bm{x},\mathcal{C}})italic_P ( italic_Z start_POSTSUPERSCRIPT italic_b end_POSTSUPERSCRIPT start_POSTSUBSCRIPT bold_italic_x , caligraphic_C end_POSTSUBSCRIPT ). Through study of various possible features that could be used, we found two which were particularly well-suited to predicting the probability of back-end failure: the minimum time-to-intersect (TTI), tCsubscript𝑡𝐶t_{C}italic_t start_POSTSUBSCRIPT italic_C end_POSTSUBSCRIPT, from robot state 𝒙𝒙\bm{x}bold_italic_x to corridor 𝒞𝒞\mathcal{C}caligraphic_C, and the number of polytopes that define the corridor |𝒞|𝒞|\mathcal{C}|| caligraphic_C |.

The minimum TTI can be found by using the xy𝑥𝑦xyitalic_x italic_y position 𝒑2𝒑superscript2\bm{p}\in\mathbb{R}^{2}bold_italic_p ∈ blackboard_R start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT and velocity 𝒗2𝒗superscript2\bm{v}\in\mathbb{R}^{2}bold_italic_v ∈ blackboard_R start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT of the robot state 𝒙𝒙\bm{x}bold_italic_x, then using kinematic equations to find the minimum TTI of the hyperplanes that define the polytope C𝐶Citalic_C containing 𝒑(t0)𝒑subscript𝑡0\bm{p}(t_{0})bold_italic_p ( italic_t start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ). Formally, if row 𝒓i𝑨subscript𝒓𝑖𝑨\bm{r}_{i}\in\bm{A}bold_italic_r start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ∈ bold_italic_A and bi𝒃subscript𝑏𝑖𝒃b_{i}\in\bm{b}italic_b start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ∈ bold_italic_b form a hyperplane 𝒓ix,y=bisubscript𝒓𝑖𝑥𝑦subscript𝑏𝑖\bm{r}_{i}\cdot\langle x,y\rangle=b_{i}bold_italic_r start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ⋅ ⟨ italic_x , italic_y ⟩ = italic_b start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT of polytope C𝐶Citalic_C, then the time to intersect the hyperplane tHsubscript𝑡𝐻t_{H}italic_t start_POSTSUBSCRIPT italic_H end_POSTSUBSCRIPT can be calculated as:

tH(𝒓i,bi,𝒙)={bi𝒓i𝒑𝒓i𝒗if 𝒓i𝒗>0γtotherwisesubscript𝑡𝐻subscript𝒓𝑖subscript𝑏𝑖𝒙casessubscript𝑏𝑖subscript𝒓𝑖𝒑subscript𝒓𝑖𝒗if subscript𝒓𝑖𝒗0subscript𝛾𝑡otherwiset_{H}\left(\bm{r}_{i},b_{i},\bm{x}\right)=\begin{cases}\frac{b_{i}-\bm{r}_{i}% \cdot\bm{p}}{\bm{r}_{i}\cdot\bm{v}}&\text{if }\bm{r}_{i}\cdot\bm{v}>0\\ \gamma_{t}&\text{otherwise}\end{cases}italic_t start_POSTSUBSCRIPT italic_H end_POSTSUBSCRIPT ( bold_italic_r start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT , italic_b start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT , bold_italic_x ) = { start_ROW start_CELL divide start_ARG italic_b start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT - bold_italic_r start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ⋅ bold_italic_p end_ARG start_ARG bold_italic_r start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ⋅ bold_italic_v end_ARG end_CELL start_CELL if bold_italic_r start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ⋅ bold_italic_v > 0 end_CELL end_ROW start_ROW start_CELL italic_γ start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT end_CELL start_CELL otherwise end_CELL end_ROW (4)

where γtsubscript𝛾𝑡\gamma_{t}italic_γ start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT is a user-defined maximum value for tHsubscript𝑡𝐻t_{H}italic_t start_POSTSUBSCRIPT italic_H end_POSTSUBSCRIPT when the vehicle is stationary or moving away from the hyperplane. With tHsubscript𝑡𝐻t_{H}italic_t start_POSTSUBSCRIPT italic_H end_POSTSUBSCRIPT, tCsubscript𝑡𝐶t_{C}italic_t start_POSTSUBSCRIPT italic_C end_POSTSUBSCRIPT is calculated as the minimum TTI to the hyperplanes of C𝐶Citalic_C:

tC=mini{tH(𝒓i,bi,𝒙)}.subscript𝑡𝐶subscript𝑖subscript𝑡𝐻subscript𝒓𝑖subscript𝑏𝑖𝒙t_{C}=\min_{i}\{t_{H}\left(\bm{r}_{i},b_{i},\bm{x}\right)\}.italic_t start_POSTSUBSCRIPT italic_C end_POSTSUBSCRIPT = roman_min start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT { italic_t start_POSTSUBSCRIPT italic_H end_POSTSUBSCRIPT ( bold_italic_r start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT , italic_b start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT , bold_italic_x ) } . (5)

One of the biggest factors that affect the ability of the back-end solver to find a feasible solution is how close the current robot position 𝒑(t0)𝒑subscript𝑡0\bm{p}(t_{0})bold_italic_p ( italic_t start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ) is located to the boundary of the feasible set 𝒞𝒞\mathcal{C}caligraphic_C. Intuitively, TTI is an effective predictor of back-end failure because it captures several factors that determine success: (i) The physical distance between 𝒑(t0)𝒑subscript𝑡0\bm{p}(t_{0})bold_italic_p ( italic_t start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ) and the free space boundary, (ii) the velocity of the robot 𝒗(t0)𝒗subscript𝑡0\bm{v}(t_{0})bold_italic_v ( italic_t start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ), and (iii) the heading of the robot.

In addition to TTI, the cardinality |𝒞|𝒞|\mathcal{C}|| caligraphic_C | of the corridor also plays a role in failure of the back-end solver: if 𝒞𝒞\mathcal{C}caligraphic_C is defined by many polytopes, then obstacles in the environment necessitate a very non-direct path to be planned for the robot, further complicating the search for a feasible path. Together, these two features were used inside a feature vector 𝒅(𝒞,𝒙)=[tC,|𝒞|]𝒅𝒞𝒙subscript𝑡𝐶𝒞\bm{d}(\mathcal{C},\bm{x})=\left[t_{C},|\mathcal{C}|\right]bold_italic_d ( caligraphic_C , bold_italic_x ) = [ italic_t start_POSTSUBSCRIPT italic_C end_POSTSUBSCRIPT , | caligraphic_C | ] to infer the probability of back-end failure. To find this probability, the back-end failure training data {Z𝒙,𝒞b}subscriptsuperscript𝑍𝑏𝒙𝒞\left\{Z^{b}_{\bm{x},\mathcal{C}}\right\}{ italic_Z start_POSTSUPERSCRIPT italic_b end_POSTSUPERSCRIPT start_POSTSUBSCRIPT bold_italic_x , caligraphic_C end_POSTSUBSCRIPT } were binned based on feature vector value 𝒅𝒅\bm{d}bold_italic_d, and ground-truth probability of failure P(Z𝒙,𝒞b)𝑃subscriptsuperscript𝑍𝑏𝒙𝒞P(Z^{b}_{\bm{x},\mathcal{C}})italic_P ( italic_Z start_POSTSUPERSCRIPT italic_b end_POSTSUPERSCRIPT start_POSTSUBSCRIPT bold_italic_x , caligraphic_C end_POSTSUBSCRIPT ) was found within each bin. To validate the choice of input features for training, we plot the probability of back-end failure P(Z𝒙,𝒞b)𝑃subscriptsuperscript𝑍𝑏𝒙𝒞P(Z^{b}_{\bm{x},\mathcal{C}})italic_P ( italic_Z start_POSTSUPERSCRIPT italic_b end_POSTSUPERSCRIPT start_POSTSUBSCRIPT bold_italic_x , caligraphic_C end_POSTSUBSCRIPT ) over tCsubscript𝑡𝐶t_{C}italic_t start_POSTSUBSCRIPT italic_C end_POSTSUBSCRIPT and |𝒞|𝒞|\mathcal{C}|| caligraphic_C |, where the correlations are clearly seen in Figs. 4(a) and (b). As tCsubscript𝑡𝐶t_{C}italic_t start_POSTSUBSCRIPT italic_C end_POSTSUBSCRIPT decreases, the probability of back-end failure increases. Furthermore, as the corridor length |𝒞|𝒞|\mathcal{C}|| caligraphic_C | increases, the probability of failure also increases.

2) GP Regression. The underlying GP model input is defined by a collection of M𝑀Mitalic_M input training features, 𝑫=[𝒅0,,𝒅M]𝑫subscript𝒅0subscript𝒅𝑀\bm{D}=\left[\bm{d}_{0},\dots,\bm{d}_{M}\right]bold_italic_D = [ bold_italic_d start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT , … , bold_italic_d start_POSTSUBSCRIPT italic_M end_POSTSUBSCRIPT ], and values 𝑷=[P0,,PM]𝑷subscript𝑃0subscript𝑃𝑀\bm{P}=\left[P_{0},\dots,P_{M}\right]bold_italic_P = [ italic_P start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT , … , italic_P start_POSTSUBSCRIPT italic_M end_POSTSUBSCRIPT ], with an output defined by a joint Gaussian distribution [19]:

[PP^]𝒩([μ(𝒅)μ(𝒅*)],[𝑲𝑲*𝑲*T𝑲**]),similar-tomatrix𝑃^𝑃𝒩matrix𝜇𝒅𝜇subscript𝒅matrix𝑲subscript𝑲superscriptsubscript𝑲𝑇subscript𝑲absent\begin{bmatrix}P\\ \hat{P}\end{bmatrix}\sim\mathcal{N}\left(\begin{bmatrix}\mu(\bm{d})\\ \mu(\bm{d}_{*})\end{bmatrix},\begin{bmatrix}\bm{K}&\bm{K_{*}}\\ \bm{K}_{*}^{T}&\bm{K_{**}}\end{bmatrix}\right),[ start_ARG start_ROW start_CELL italic_P end_CELL end_ROW start_ROW start_CELL over^ start_ARG italic_P end_ARG end_CELL end_ROW end_ARG ] ∼ caligraphic_N ( [ start_ARG start_ROW start_CELL italic_μ ( bold_italic_d ) end_CELL end_ROW start_ROW start_CELL italic_μ ( bold_italic_d start_POSTSUBSCRIPT * end_POSTSUBSCRIPT ) end_CELL end_ROW end_ARG ] , [ start_ARG start_ROW start_CELL bold_italic_K end_CELL start_CELL bold_italic_K start_POSTSUBSCRIPT bold_* end_POSTSUBSCRIPT end_CELL end_ROW start_ROW start_CELL bold_italic_K start_POSTSUBSCRIPT * end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT end_CELL start_CELL bold_italic_K start_POSTSUBSCRIPT bold_* bold_* end_POSTSUBSCRIPT end_CELL end_ROW end_ARG ] ) , (6)

where 𝑲=κ(𝑫,𝑫)𝑲𝜅𝑫𝑫\bm{K}=\kappa(\bm{D},\bm{D})bold_italic_K = italic_κ ( bold_italic_D , bold_italic_D ), 𝑲*=κ(𝑫,𝑫*)subscript𝑲𝜅𝑫subscript𝑫\bm{K_{*}}=\kappa(\bm{D},\bm{D}_{*})bold_italic_K start_POSTSUBSCRIPT bold_* end_POSTSUBSCRIPT = italic_κ ( bold_italic_D , bold_italic_D start_POSTSUBSCRIPT * end_POSTSUBSCRIPT ) and 𝑲**=κ(𝑫*,𝑫*)subscript𝑲absent𝜅subscript𝑫subscript𝑫\bm{K_{**}}=\kappa(\bm{D}_{*},\bm{D}_{*})bold_italic_K start_POSTSUBSCRIPT bold_* bold_* end_POSTSUBSCRIPT = italic_κ ( bold_italic_D start_POSTSUBSCRIPT * end_POSTSUBSCRIPT , bold_italic_D start_POSTSUBSCRIPT * end_POSTSUBSCRIPT ), μ𝜇\muitalic_μ is the mean function, 𝑫*subscript𝑫\bm{D}_{*}bold_italic_D start_POSTSUBSCRIPT * end_POSTSUBSCRIPT is the test input, and κ𝜅\kappaitalic_κ is a positive definite kernel function, which is the Radial Basis Function (RBF) in this work. From this, the predictive posterior distribution of P^^𝑃\hat{P}over^ start_ARG italic_P end_ARG given 𝑫𝑫\bm{D}bold_italic_D can be expressed as another Gaussian distribution:

P^𝒩(μ*,σ*2),similar-to^𝑃𝒩subscript𝜇superscriptsubscript𝜎2\hat{P}\sim\mathcal{N}(\mu_{*},\sigma_{*}^{2}),over^ start_ARG italic_P end_ARG ∼ caligraphic_N ( italic_μ start_POSTSUBSCRIPT * end_POSTSUBSCRIPT , italic_σ start_POSTSUBSCRIPT * end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ) , (7)

with μ*subscript𝜇\mu_{*}italic_μ start_POSTSUBSCRIPT * end_POSTSUBSCRIPT and σ*2superscriptsubscript𝜎2\sigma_{*}^{2}italic_σ start_POSTSUBSCRIPT * end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT defined as:

μ*=μ(𝑫*)+𝑲*T𝑲(Pμ(𝑫))subscript𝜇𝜇subscript𝑫superscriptsubscript𝑲𝑇𝑲𝑃𝜇𝑫\mu_{*}=\mu(\bm{D_{*}})+\bm{K}_{*}^{T}\bm{K}\left(P-\mu(\bm{D})\right)italic_μ start_POSTSUBSCRIPT * end_POSTSUBSCRIPT = italic_μ ( bold_italic_D start_POSTSUBSCRIPT bold_* end_POSTSUBSCRIPT ) + bold_italic_K start_POSTSUBSCRIPT * end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT bold_italic_K ( italic_P - italic_μ ( bold_italic_D ) ) (8)
σ*2=𝑲**𝑲*T𝑲1𝑲*.superscriptsubscript𝜎2subscript𝑲absentsuperscriptsubscript𝑲𝑇superscript𝑲1subscript𝑲\sigma_{*}^{2}=\bm{K}_{**}-\bm{K}_{*}^{T}\bm{K}^{-1}\bm{K}_{*}.italic_σ start_POSTSUBSCRIPT * end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT = bold_italic_K start_POSTSUBSCRIPT * * end_POSTSUBSCRIPT - bold_italic_K start_POSTSUBSCRIPT * end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT bold_italic_K start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT bold_italic_K start_POSTSUBSCRIPT * end_POSTSUBSCRIPT . (9)

With this, the estimated probability of back-end failure is taken as the mean values of this posterior:

P^(Z𝒙,𝒞b|𝒅)=μ*.^𝑃conditionalsubscriptsuperscript𝑍𝑏𝒙𝒞𝒅subscript𝜇\hat{P}\left(Z^{b}_{\bm{x},\mathcal{C}}|\bm{d}\right)=\mu_{*}.over^ start_ARG italic_P end_ARG ( italic_Z start_POSTSUPERSCRIPT italic_b end_POSTSUPERSCRIPT start_POSTSUBSCRIPT bold_italic_x , caligraphic_C end_POSTSUBSCRIPT | bold_italic_d ) = italic_μ start_POSTSUBSCRIPT * end_POSTSUBSCRIPT . (10)

To validate the quality of the trained GP models, the distribution of failures over tCsubscript𝑡𝐶t_{C}italic_t start_POSTSUBSCRIPT italic_C end_POSTSUBSCRIPT was collected from test worlds outside the forest dataset, and the resulting test set distribution was compared with the learned distribution P^(Z𝒙,𝒞b|𝒅)^𝑃conditionalsubscriptsuperscript𝑍𝑏𝒙𝒞𝒅\hat{P}(Z^{b}_{\bm{x},\mathcal{C}}|\bm{d})over^ start_ARG italic_P end_ARG ( italic_Z start_POSTSUPERSCRIPT italic_b end_POSTSUPERSCRIPT start_POSTSUBSCRIPT bold_italic_x , caligraphic_C end_POSTSUBSCRIPT | bold_italic_d ) for |𝒞|=2𝒞2|\mathcal{C}|=2| caligraphic_C | = 2 (Fig. 4(c)) and |𝒞|=3𝒞3|\mathcal{C}|=3| caligraphic_C | = 3 (Fig. 4(d)). These plots show the learned distributions closely match the test distribution, demonstrating that the GP models generalize well to new environments.

Refer to caption
Refer to caption
Refer to caption
Refer to caption
Figure 4: Solver failure trends for (a) tCsubscript𝑡𝐶t_{C}italic_t start_POSTSUBSCRIPT italic_C end_POSTSUBSCRIPT and |𝒞|𝒞|\mathcal{C}|| caligraphic_C | along with learned distribution vs test distribution for (c) |𝒞|=2𝒞2|\mathcal{C}|=2| caligraphic_C | = 2 and (d) |𝒞|=3𝒞3|\mathcal{C}|=3| caligraphic_C | = 3.

3) Defining Planning Risk. With P^(Z𝒙,𝒞b|𝒅)^𝑃conditionalsubscriptsuperscript𝑍𝑏𝒙𝒞𝒅\hat{P}(Z^{b}_{\bm{x},\mathcal{C}}|\bm{d})over^ start_ARG italic_P end_ARG ( italic_Z start_POSTSUPERSCRIPT italic_b end_POSTSUPERSCRIPT start_POSTSUBSCRIPT bold_italic_x , caligraphic_C end_POSTSUBSCRIPT | bold_italic_d ) estimating back-end failure, the probability of failure for the entire motion planning pipeline P(Z𝒙,𝒞)𝑃subscript𝑍𝒙𝒞P(Z_{\bm{x},\mathcal{C}})italic_P ( italic_Z start_POSTSUBSCRIPT bold_italic_x , caligraphic_C end_POSTSUBSCRIPT ) can be calculated using (3). These probabilities can be used to infer the risk of motion planning failure along the future states {𝒙^i}subscript^𝒙𝑖\left\{\hat{\bm{x}}_{i}\right\}{ over^ start_ARG bold_italic_x end_ARG start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT } predicted by the MPC. To formulate this risk, we consider the total number of future motion planning failures Z𝝉subscript𝑍𝝉Z_{\bm{\tau}}italic_Z start_POSTSUBSCRIPT bold_italic_τ end_POSTSUBSCRIPT as the salient outcome to track, defined as

Z𝝉=𝒙^{𝒙^i}Z𝒙^,𝒞.subscript𝑍𝝉subscript^𝒙subscript^𝒙𝑖subscript𝑍^𝒙𝒞Z_{\bm{\tau}}=\sum_{\hat{\bm{x}}\in\{\hat{\bm{x}}_{i}\}}Z_{\hat{\bm{x}},% \mathcal{C}}.italic_Z start_POSTSUBSCRIPT bold_italic_τ end_POSTSUBSCRIPT = ∑ start_POSTSUBSCRIPT over^ start_ARG bold_italic_x end_ARG ∈ { over^ start_ARG bold_italic_x end_ARG start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT } end_POSTSUBSCRIPT italic_Z start_POSTSUBSCRIPT over^ start_ARG bold_italic_x end_ARG , caligraphic_C end_POSTSUBSCRIPT . (11)

Because each Z𝒙^,𝒞subscript𝑍^𝒙𝒞Z_{\hat{\bm{x}},\mathcal{C}}italic_Z start_POSTSUBSCRIPT over^ start_ARG bold_italic_x end_ARG , caligraphic_C end_POSTSUBSCRIPT is a stochastic variable, Z𝝉subscript𝑍𝝉Z_{\bm{\tau}}italic_Z start_POSTSUBSCRIPT bold_italic_τ end_POSTSUBSCRIPT is also a stochastic variable. The risk metric chosen in our approach is the expected number of collisions over the future horizon, ρ𝝉=𝔼(Z𝝉)subscript𝜌𝝉𝔼subscript𝑍𝝉\rho_{\bm{\tau}}=\mathbb{E}(Z_{\bm{\tau}})italic_ρ start_POSTSUBSCRIPT bold_italic_τ end_POSTSUBSCRIPT = blackboard_E ( italic_Z start_POSTSUBSCRIPT bold_italic_τ end_POSTSUBSCRIPT ). The expected value is chosen here for its simplicity and speed to calculate, although other risk metrics may be used as well [20]. Since each Z𝒙^,𝒞subscript𝑍^𝒙𝒞Z_{\hat{\bm{x}},\mathcal{C}}italic_Z start_POSTSUBSCRIPT over^ start_ARG bold_italic_x end_ARG , caligraphic_C end_POSTSUBSCRIPT is a Bernoulli random variable with predicted probability P(Z𝒙^,𝒞)𝑃subscript𝑍^𝒙𝒞P(Z_{\hat{\bm{x}},\mathcal{C}})italic_P ( italic_Z start_POSTSUBSCRIPT over^ start_ARG bold_italic_x end_ARG , caligraphic_C end_POSTSUBSCRIPT ) of failure, the expectation is calculated as:

ρ𝝉=𝒙^{𝒙^i}P(Z𝒙^,𝒞).subscript𝜌𝝉subscript^𝒙subscript^𝒙𝑖𝑃subscript𝑍^𝒙𝒞\rho_{\bm{\tau}}=\sum_{\hat{\bm{x}}\in\{\hat{\bm{x}}_{i}\}}P(Z_{\hat{\bm{x}},% \mathcal{C}}).italic_ρ start_POSTSUBSCRIPT bold_italic_τ end_POSTSUBSCRIPT = ∑ start_POSTSUBSCRIPT over^ start_ARG bold_italic_x end_ARG ∈ { over^ start_ARG bold_italic_x end_ARG start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT } end_POSTSUBSCRIPT italic_P ( italic_Z start_POSTSUBSCRIPT over^ start_ARG bold_italic_x end_ARG , caligraphic_C end_POSTSUBSCRIPT ) . (12)

A risk threshold ψρsubscript𝜓𝜌\psi_{\rho}italic_ψ start_POSTSUBSCRIPT italic_ρ end_POSTSUBSCRIPT may be set so that anytime the risk of planner failure over future states {𝒙^i}subscript^𝒙𝑖\left\{\hat{\bm{x}}_{i}\right\}{ over^ start_ARG bold_italic_x end_ARG start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT } exceeds this value, the recovery behavior is triggered.

IV-D Recovering After Predicted Failures

When ρ𝝉>ψρsubscript𝜌𝝉subscript𝜓𝜌\rho_{\bm{\tau}}>\psi_{\rho}italic_ρ start_POSTSUBSCRIPT bold_italic_τ end_POSTSUBSCRIPT > italic_ψ start_POSTSUBSCRIPT italic_ρ end_POSTSUBSCRIPT is satisfied, it means that there is a collection of states in the vehicle’s future horizon that the planner is likely unable to successfully operate. As such, the vehicle must stop or perform other recovery maneuvers in order to avoid collisions and navigate successfully through said regions. Unlike prior works where human operators intervene to recover the vehicle once failures are detected [4], [21], our framework includes a recovery planner ΠrsubscriptΠ𝑟\Pi_{r}roman_Π start_POSTSUBSCRIPT italic_r end_POSTSUBSCRIPT which enables the vehicle to find and execute safe recovery maneuvers autonomously, as illustrated in Fig. 3(d).

Once the vehicle has stopped after switching to the recovery mode, the objective is to locate a nearby region where the planner will succeed, i.e., Z𝒙,𝒞=0subscript𝑍𝒙𝒞0Z_{\bm{x},\mathcal{C}}=0italic_Z start_POSTSUBSCRIPT bold_italic_x , caligraphic_C end_POSTSUBSCRIPT = 0. The first step is to sample points uniformly in free space around the current vehicle position 𝒑(t0)𝒑subscript𝑡0\bm{p}(t_{0})bold_italic_p ( italic_t start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ). To do so, an H-polytope Crsubscript𝐶𝑟C_{r}italic_C start_POSTSUBSCRIPT italic_r end_POSTSUBSCRIPT is generated around 𝒑(t0)𝒑subscript𝑡0\bm{p}(t_{0})bold_italic_p ( italic_t start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ), where hit-and-run Markov-chain Monte Carlo sampling [22] is used to find Npsubscript𝑁𝑝N_{p}italic_N start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT candidate positions 𝒫c={𝒑0,,𝒑Np}subscript𝒫𝑐subscript𝒑0subscript𝒑subscript𝑁𝑝\mathcal{P}_{c}=\{\bm{p}_{0},\dots,\bm{p}_{N_{p}}\}caligraphic_P start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT = { bold_italic_p start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT , … , bold_italic_p start_POSTSUBSCRIPT italic_N start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT end_POSTSUBSCRIPT }, where Npsubscript𝑁𝑝N_{p}italic_N start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT is a user-defined parameter. 𝒫csubscript𝒫𝑐\mathcal{P}_{c}caligraphic_P start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT is then converted to states 𝒳csubscript𝒳𝑐\mathcal{X}_{c}caligraphic_X start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT by assuming the vehicle starts from rest. We make this choice because it significantly reduces the sample space and sampling only positions was enough to find recovery states in practice. With 𝒳csubscript𝒳𝑐\mathcal{X}_{c}caligraphic_X start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT, we find the probability of planner failure, P(Z𝒙i,𝒞i)𝑃subscript𝑍subscript𝒙𝑖subscript𝒞𝑖P(Z_{\bm{x}_{i},\mathcal{C}_{i}})italic_P ( italic_Z start_POSTSUBSCRIPT bold_italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT , caligraphic_C start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_POSTSUBSCRIPT ), at each 𝒙isubscript𝒙𝑖\bm{x}_{i}bold_italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT, as well as neighboring states in close proximity for consistency. If all predictions have failure probability greater than η𝜂\etaitalic_η, the samples are thrown away and the sampling process is repeated. Here η𝜂\etaitalic_η is a user-defined parameter which controls how risk averse the recovery behavior should be. 𝒙rsubscript𝒙𝑟\bm{x}_{r}bold_italic_x start_POSTSUBSCRIPT italic_r end_POSTSUBSCRIPT is then chosen to be the state with lowest expected failure:

𝒙r=argmin𝒙i𝒳c𝔼[Z𝒙𝒊,𝒞i].subscript𝒙𝑟subscriptargminsubscript𝒙𝑖subscript𝒳𝑐𝔼delimited-[]subscript𝑍subscript𝒙𝒊subscript𝒞𝑖\bm{x}_{r}=\operatorname*{arg\,min}_{\bm{x}_{i}\in\mathcal{X}_{c}}\mathbb{E}% \left[Z_{\bm{x_{i}},\mathcal{C}_{i}}\right].bold_italic_x start_POSTSUBSCRIPT italic_r end_POSTSUBSCRIPT = start_OPERATOR roman_arg roman_min end_OPERATOR start_POSTSUBSCRIPT bold_italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ∈ caligraphic_X start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT end_POSTSUBSCRIPT blackboard_E [ italic_Z start_POSTSUBSCRIPT bold_italic_x start_POSTSUBSCRIPT bold_italic_i end_POSTSUBSCRIPT , caligraphic_C start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_POSTSUBSCRIPT ] . (13)

After determining 𝒙rsubscript𝒙𝑟\bm{x}_{r}bold_italic_x start_POSTSUBSCRIPT italic_r end_POSTSUBSCRIPT, the vehicle navigates to the recovery point using the GTG MPC with an added constraint, formulated as in (2), where 𝒑(t0)𝒑subscript𝑡0\bm{p}(t_{0})bold_italic_p ( italic_t start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ) must remain in Crsubscript𝐶𝑟C_{r}italic_C start_POSTSUBSCRIPT italic_r end_POSTSUBSCRIPT in order to avoid obstacles. Once the vehicle reaches 𝒙rsubscript𝒙𝑟\bm{x}_{r}bold_italic_x start_POSTSUBSCRIPT italic_r end_POSTSUBSCRIPT, the planner switches back to the nominal safe corridor policy ΠΠ\Piroman_Π to generate trajectories 𝝉(t)𝝉𝑡\bm{\tau}(t)bold_italic_τ ( italic_t ) and the entire process repeats.

V Simulations

Simulations were performed to both train the GP classification model described in (6) and validate the proposed approach to detect and recover from motion planning failures. All simulations were performed in Gazebo using Ubuntu 20.04 and ROS Noetic. The robot used in simulation is a Clearpath Robotics Jackal UGV equipped with a 270superscript270270^{\circ}270 start_POSTSUPERSCRIPT ∘ end_POSTSUPERSCRIPT 2D Lidar depth sensor. Data were collected as described in Sec. IV-C and sent to the GP regressions for training.

With the models trained, we then validated our approach in four gazebo worlds of varying difficulty. The base world is a series of connected rooms with either sparse or dense obstacle density and 1m or 2222m wide doorways. In each world we use the same start configuration 𝒙(0)𝒙0\bm{x}(0)bold_italic_x ( 0 ) and three goals 𝒙g0superscriptsubscript𝒙𝑔0\bm{x}_{g}^{0}bold_italic_x start_POSTSUBSCRIPT italic_g end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 0 end_POSTSUPERSCRIPT, 𝒙g1superscriptsubscript𝒙𝑔1\bm{x}_{g}^{1}bold_italic_x start_POSTSUBSCRIPT italic_g end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 1 end_POSTSUPERSCRIPT, and 𝒙g2superscriptsubscript𝒙𝑔2\bm{x}_{g}^{2}bold_italic_x start_POSTSUBSCRIPT italic_g end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT. Fig. 5(a) shows the world with 1111m doorways and dense obstacle configuration, along with an example navigation failure without our approach (Fig. 5(b) and (c)) and success with (Fig. 5(d)). In Fig. 5(b), the vehicle plans a trajectory 𝝉(t)𝝉𝑡\bm{\tau}(t)bold_italic_τ ( italic_t ) which intersects a part of the wall occluded by an obstacle. Since an avoiding trajectory cannot be computed in time, the vehicle collides with the wall at 𝒙Asubscript𝒙𝐴\bm{x}_{A}bold_italic_x start_POSTSUBSCRIPT italic_A end_POSTSUBSCRIPT in Fig. 5(c). If instead we use our approach, as shown in Fig. 5(d), the robot detects the planner failure proactively and stops at 𝒙Bsubscript𝒙𝐵\bm{x}_{B}bold_italic_x start_POSTSUBSCRIPT italic_B end_POSTSUBSCRIPT. A reverse maneuver (green line) is then executed to reach the recovery state 𝒙rsubscript𝒙𝑟\bm{x}_{r}bold_italic_x start_POSTSUBSCRIPT italic_r end_POSTSUBSCRIPT found using (13). The vehicle then switches back to the nominal planner and continues towards 𝒙g0superscriptsubscript𝒙𝑔0\bm{x}_{g}^{0}bold_italic_x start_POSTSUBSCRIPT italic_g end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 0 end_POSTSUPERSCRIPT.

Refer to caption
Refer to caption
Refer to caption
Refer to caption
Figure 5: (a) Gazebo world with 1111m doorway and 3 different goals. In (b) an obstacle hides an occluded wall leading to a collision without our framework (c) vs a successful navigation toward xg0superscriptsubscript𝑥𝑔0x_{g}^{0}italic_x start_POSTSUBSCRIPT italic_g end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 0 end_POSTSUPERSCRIPT in (d) with our approach.

The remaining 3333 test worlds are generated by varying the doorway width between 1111m and 2222m, as well as varying the obstacle layout between a sparse and dense configuration. For each world tested, the robot is tasked to navigate 10 times to the goals 𝒙gsubscript𝒙𝑔\bm{x}_{g}bold_italic_x start_POSTSUBSCRIPT italic_g end_POSTSUBSCRIPT, creating 30303030 test points per world, for 120120120120 simulations total. The resulting success rates for the motion planner with and without our approach are shown in Fig. 6 for each goal and world combination, where it can be seen that using our failure detection and recovery framework improves the nominal planner’s performance.

Refer to caption
Figure 6: Navigation success rates with (blue) and without (orange) failure detection and recovery for different simulation world and goal combinations.

VI Physical Experiments

The proposed approach was validated with multiple robots across several experiments, all of which are shown in the supplementary material and website. Presented in this paper are two experiments with two real robotics platforms: a Boston Dynamics Spot quadruped, and the same Jackal differential drive UGV used in simulations. For each platform, the same motion planning pipeline was used to generate trajectories 𝝉𝝉\bm{\tau}bold_italic_τ to follow, using an MPC to generate the control signal 𝒖𝒖\bm{u}bold_italic_u to track these trajectories. Lidar sensor readings were provided by Ouster for the Spot, and Velodyne for the Jackal. These were used by the SLAM package Gmap** in order to create a map \mathcal{M}caligraphic_M and estimate the state of the robot at run-time as each platform traveled through an environment unknown a priori. To emphasize the generality of the proposed approach, the GP model P^(Z𝒙,𝒞b|𝒅)^𝑃conditionalsuperscriptsubscript𝑍𝒙𝒞𝑏𝒅\hat{P}(Z_{\bm{x},\mathcal{C}}^{b}|\bm{d})over^ start_ARG italic_P end_ARG ( italic_Z start_POSTSUBSCRIPT bold_italic_x , caligraphic_C end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_b end_POSTSUPERSCRIPT | bold_italic_d ) that was used to predict motion planning back-end failures was trained entirely on data collected in simulation, demonstrating how the approach is both sensor- and model-agnostic.

Two test cases were setup to test the approach. Fig. 7 shows the first case in which the Jackal is tasked to move towards a goal around an occluding corner, behind which are occluded obstacles previously unknown to the robot. Fig. 1 shows the second case in which the Spot is tasked with a similar mission, except it must negotiate an unexpected dead-end. Without the proposed approach, both cases lead to path-planning failures, which in turn lead to collisions. Both Fig. 7 and Fig. 1 show snapshots of the proposed approach being used to proactively detect risk of path planning failure ρ𝝉subscript𝜌𝝉\rho_{\bm{\tau}}italic_ρ start_POSTSUBSCRIPT bold_italic_τ end_POSTSUBSCRIPT, recovering at 𝒙Bsubscript𝒙𝐵\bm{x}_{B}bold_italic_x start_POSTSUBSCRIPT italic_B end_POSTSUBSCRIPT when ρ𝝉>ψρsubscript𝜌𝝉subscript𝜓𝜌\rho_{\bm{\tau}}>\psi_{\rho}italic_ρ start_POSTSUBSCRIPT bold_italic_τ end_POSTSUBSCRIPT > italic_ψ start_POSTSUBSCRIPT italic_ρ end_POSTSUBSCRIPT, moving to a recovery point 𝒙rsubscript𝒙𝑟\bm{x}_{r}bold_italic_x start_POSTSUBSCRIPT italic_r end_POSTSUBSCRIPT, then continuing moving towards 𝒙gsubscript𝒙𝑔\bm{x}_{g}bold_italic_x start_POSTSUBSCRIPT italic_g end_POSTSUBSCRIPT. For these experiments, the risk threshold was ψρ=3subscript𝜓𝜌3\psi_{\rho}=3italic_ψ start_POSTSUBSCRIPT italic_ρ end_POSTSUBSCRIPT = 3 expected failures over the predicted MPC future trajectory.

Refer to caption
Refer to caption
Figure 7: Experiment case in which unexpected occluded obstacles would have caused a motion planner failure without our approach.

VII Conclusions and Future Work

In this work, we have presented a novel GP-based, proactive failure detection and recovery scheme to prevent a mobile robot system from colliding with obstacles. Our approach is shown to improve the performance over a traditional safe corridor motion planner in both simulation and experimental case studies. Furthermore, our approach is model- and sensor-agnostic and can be applied without prior real-world training data due to the careful selection of features.

Future work aims to enhance the system by incorporating distributional learning for failure detection, eliminating the need for multiple GP regressions. Additionally, we would like to utilize this approach for planner switching within a Simplex Architecture and incorporate dynamic obstacles.

VIII Acknowledgements

Funding for this research are provided by an Amazon Research Award and by CoStar group.

References

  • [1] X. Xiao, Z. Xu, Z. Wang, Y. Song, G. Warnell, P. Stone, T. Zhang, S. Ravi, G. Wang, H. Karnan, J. Biswas, N. Mohammad, L. Bramblett, R. Peddi, N. Bezzo, Z. Xie, and P. Dames, “Autonomous ground navigation in highly constrained spaces: Lessons learned from the benchmark autonomous robot navigation challenge at icra 2022 [competitions],” IEEE Robotics & Automation Magazine, vol. 29, no. 4, pp. 148–156, 2022.
  • [2] N. Mohammad and N. Bezzo, “A robust and fast occlusion-based frontier method for autonomous navigation in unknown cluttered environments,” in 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2022, pp. 6324–6331.
  • [3] X. Zhang, Y. Shu, Y. Chen, G. Chen, J. Ye, X. Li, and X. Li, “Multi-modal learning and relaxation of physical conflict for an exoskeleton robot with proprioceptive perception,” in 2023 IEEE International Conference on Robotics and Automation (ICRA), 2023, pp. 10 490–10 496.
  • [4] T. Ji, A. N. Sivakumar, G. Chowdhary, and K. Driggs-Campbell, “Proactive anomaly detection for robot navigation with multi-sensor fusion,” IEEE Robotics and Automation Letters, vol. 7, no. 2, pp. 4975–4982, 2022.
  • [5] G. Kahn, P. Abbeel, and S. Levine, “Badgr: An autonomous self-supervised learning-based navigation system,” IEEE Robotics and Automation Letters, vol. 6, no. 2, pp. 1312–1319, 2021.
  • [6] S. Liu, M. Watterson, K. Mohta, K. Sun, S. Bhattacharya, C. J. Taylor, and V. Kumar, “Planning dynamically feasible trajectories for quadrotors using safe flight corridors in 3-d complex environments,” IEEE Robotics and Automation Letters, vol. 2, no. 3, pp. 1688–1695, 2017.
  • [7] F. Gao, W. Wu, Y. Lin, and S. Shen, “Online safe trajectory generation for quadrotors using fast marching method and bernstein basis polynomial,” in 2018 IEEE International Conference on Robotics and Automation (ICRA), 2018, pp. 344–351.
  • [8] L. Wang and Y. Guo, “Speed adaptive robot trajectory generation based on derivative property of b-spline curve,” IEEE Robotics and Automation Letters, vol. 8, no. 4, pp. 1905–1911, 2023.
  • [9] J. Tordesillas and J. P. How, “FASTER: Fast and safe trajectory planner for navigation in unknown environments,” IEEE Transactions on Robotics, 2021.
  • [10] Z. Wang, X. Zhou, C. Xu, and F. Gao, “Geometrically constrained trajectory optimization for multicopters,” IEEE Transactions on Robotics, vol. 38, no. 5, pp. 3259–3278, 2022.
  • [11] Y. Ren, F. Zhu, W. Liu, Z. Wang, Y. Lin, F. Gao, and F. Zhang, “Bubble planner: Planning high-speed smooth quadrotor trajectories using receding corridors,” in 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2022, pp. 6332–6339.
  • [12] M. J. R. R. A. and S. A. Ning, Unconstrained Gradient-Based Optimization.   Cambridge University Press, 2022.
  • [13] S. Bansal, M. Chen, S. Herbert, and C. J. Tomlin, “Hamilton-jacobi reachability: A brief overview and recent advances,” in 2017 IEEE 56th Annual Conference on Decision and Control (CDC), 2017, pp. 2242–2253.
  • [14] A. Devonport and M. Arcak, “Data-driven reachable set computation using adaptive gaussian process classification and monte carlo methods,” in 2020 American Control Conference (ACC), 2020, pp. 2629–2634.
  • [15] E. Yel, T. J. Carpenter, C. Di Franco, R. Ivanov, Y. Kantaros, I. Lee, J. Weimer, and N. Bezzo, “Assured runtime monitoring and planning: Toward verification of neural networks for safe autonomous operations,” IEEE Robotics & Automation Magazine, vol. 27, no. 2, pp. 102–116, 2020.
  • [16] D. Harabor and A. Grastien, “Online graph pruning for pathfinding on grid maps,” in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 25, no. 1, 2011, pp. 1114–1119.
  • [17] J. Tordesillas and J. P. How, “Minvo basis: Finding simplexes with minimum volume enclosing polynomial curves,” Computer-Aided Design, vol. 151, p. 103341, 2022.
  • [18] H. Oleynikova, M. Burri, Z. Taylor, J. Nieto, R. Siegwart, and E. Galceran, “Continuous-time trajectory optimization for online uav replanning,” in IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2016.
  • [19] C. E. Rasmussen and C. K. I. Williams, Gaussian Processes for Machine Learning (Adaptive Computation and Machine Learning).   The MIT Press, 2005.
  • [20] A. Majumdar and M. Pavone, “How should a robot assess risk? towards an axiomatic theory of risk in robotics,” in Robotics Research: The 18th International Symposium ISRR.   Springer, 2020, pp. 75–84.
  • [21] G. Kahn, P. Abbeel, and S. Levine, “Land: Learning to navigate from disengagements,” IEEE Robotics and Automation Letters, vol. 6, no. 2, pp. 1872–1879, 2021.
  • [22] R. Tedrake and the Drake Development Team, “Drake: Model-based design and verification for robotics,” 2019. [Online]. Available: https://drake.mit.edu