-
Verification-Aided Learning of Neural Network Barrier Functions with Termination Guarantees
Authors:
Shaoru Chen,
Lekan Molu,
Mahyar Fazlyab
Abstract:
Barrier functions are a general framework for establishing a safety guarantee for a system. However, there is no general method for finding these functions. To address this shortcoming, recent approaches use self-supervised learning techniques to learn these functions using training data that are periodically generated by a verification procedure, leading to a verification-aided learning framework. Despite its immense potential in automating barrier function synthesis, the verification-aided learning framework does not have termination guarantees and may suffer from a low success rate of finding a valid barrier function in practice. In this paper, we propose a holistic approach to address these drawbacks. With a convex formulation of the barrier function synthesis, we propose to first learn an empirically well-behaved NN basis function and then apply a fine-tuning algorithm that exploits the convexity and counterexamples from the verification failure to find a valid barrier function with finite-step termination guarantees: if there exist valid barrier functions, the fine-tuning algorithm is guaranteed to find one in a finite number of iterations. We demonstrate that our fine-tuning method can significantly boost the performance of the verification-aided learning framework on examples of different scales and using various neural network verifiers.
Submitted 12 March, 2024;
originally announced March 2024.
-
Singularly Perturbed Layered Control of Deformable Bodies
Authors:
Lekan Molu
Abstract:
Variable curvature modeling tools provide an accurate means of controlling infinite degrees-of-freedom deformable bodies and structures. However, their forward and inverse Newton-Euler dynamics are fraught with high computational costs. Assuming piecewise constant strains across discretized Cosserat rods imposed on the soft material, a composite two time-scale singularly perturbed nonlinear backstepping control scheme is here introduced. This is to alleviate the long computational times of the recursive Newton-Euler dynamics for soft structures. Our contribution is three-pronged: (i) we decompose the system's Newton-Euler dynamics into two coupled sub-dynamics by introducing a perturbation parameter; (ii) we then prescribe a set of stabilizing controllers for regulating each subsystem's dynamics; and (iii) we study the interconnected singularly perturbed system and analyze its stability.
Submitted 21 December, 2023; v1 submitted 10 December, 2023;
originally announced December 2023.
-
Lagrangian Properties and Control of Soft Robots Modeled with Discrete Cosserat Rods
Authors:
Lekan Molu,
Shaoru Chen,
Audrey Sedal
Abstract:
The characteristic "in-plane" bending associated with soft robots' deformation makes them preferred over rigid robots in sophisticated manipulation and movement tasks. Executing such motion strategies to precision in soft deformable robots and structures is, however, fraught with modeling and control challenges given their infinite degrees-of-freedom. When piecewise constant strains (PCS) are imposed across (discretized) Cosserat microsolids on the continuum material, however, their dynamics become amenable to tractable mathematical analysis. While this PCS model handles the characteristic difficult-to-model "in-plane" bending well, its Lagrangian properties have not been exploited for control in the literature, nor is there a rigorous study on the dynamic performance of multisection deformable materials for "in-plane" bending that guarantees steady-state convergence. In this spirit, we first establish the PCS model's structural Lagrangian properties. Second, we exploit these for control on various strain goal states. Third, we benchmark our hypotheses against an Octopus-inspired robot arm under different constant tip loads. These induce non-constant "in-plane" deformation and we regulate strain states throughout the continuum in these configurations. Our numerical results establish convergence to desired equilibrium throughout the continuum in all of our tests. Within the bounds here set, we conjecture that our methods can find wide adoption in the control of cable- and fluid-driven multisection soft robotic arms; and may be extensible to the (learning-based) control of deformable agents employed in simulated, mixed, or augmented reality.
Submitted 10 December, 2023;
originally announced December 2023.
-
PcLast: Discovering Plannable Continuous Latent States
Authors:
Anurag Koul,
Shivakanth Sujit,
Shaoru Chen,
Ben Evans,
Lili Wu,
Byron Xu,
Rajan Chari,
Riashat Islam,
Raihan Seraj,
Yonathan Efroni,
Lekan Molu,
Miro Dudik,
John Langford,
Alex Lamb
Abstract:
Goal-conditioned planning benefits from learned low-dimensional representations of rich observations. While compact latent representations typically learned from variational autoencoders or inverse dynamics enable goal-conditioned decision making, they ignore state reachability, hampering their performance. In this paper, we learn a representation that associates reachable states together for effective planning and goal-conditioned policy learning. We first learn a latent representation with multi-step inverse dynamics (to remove distracting information), and then transform this representation to associate reachable states together in $\ell_2$ space. Our proposals are rigorously tested in various simulation testbeds. Numerical results in reward-based settings show significant improvements in sampling efficiency. Further, in reward-free settings this approach yields layered state abstractions that enable computationally efficient hierarchical planning for reaching ad hoc goals with zero additional samples.
Submitted 10 June, 2024; v1 submitted 6 November, 2023;
originally announced November 2023.
-
Mixed $\mathcal{H}_2/\mathcal{H}_\infty$-Policy Learning Synthesis
Authors:
Lekan Molu
Abstract:
A robustly stabilizing optimal control policy in a model-free mixed $\mathcal{H}_2/\mathcal{H}_\infty$-control setting is here put forward for counterbalancing the slow convergence and non-robustness of traditional high-variance policy optimization (and by extension policy gradient) algorithms. Leveraging Itô's stochastic differential calculus, we iteratively solve the system's continuous-time closed-loop generalized algebraic Riccati equation whilst updating its admissible controllers in a two-player, zero-sum differential game setting. Our new results are illustrated by learning-enabled control systems which gather previously disseminated results in this field in one holistic data-driven presentation with greater simplification, improvement, and clarity.
Submitted 17 April, 2023; v1 submitted 17 February, 2023;
originally announced February 2023.
-
Robust Policy Optimization in Continuous-time Mixed $\mathcal{H}_2/\mathcal{H}_\infty$ Stochastic Control
Authors:
Leilei Cui,
Lekan Molu
Abstract:
Following the recent resurgence in establishing linear control theoretic benchmarks for reinforcement learning (RL)-based policy optimization (PO) for complex dynamical systems with continuous state and action spaces, an optimal control problem for a continuous-time infinite-dimensional linear stochastic system possessing additive Brownian motion is optimized on a cost that is an exponent of the quadratic form of the state, input, and disturbance terms. We lay out a model-based and a model-free algorithm for RL-based stochastic PO. For the model-based algorithm, we establish rigorous convergence guarantees. For the sampling-based algorithm, over trajectory arcs that emanate from the phase space, we find that the Hamilton-Jacobi-Bellman equation parameterizes trajectory costs -- resulting in a discrete-time (input and state-based) sampling scheme accompanied by unknown nonlinear dynamics with continuous-time policy iterates. The need for known dynamics operators is circumvented and we arrive at a reinforced PO algorithm (via policy iteration) where an upper bound on the $\mathcal{H}_2$ norm is minimized (to guarantee stability) and a robustness metric is enforced by maximizing the cost with respect to a controller that includes the level of noise attenuation specified by the system's $\mathcal{H}_\infty$ norm. A rigorous robustness analysis is prescribed in an input-to-state stability formalism. Our analyses and contributions are distinguished by many natural systems characterized by additive Wiener processes, amenable to Itô's stochastic differential calculus in dynamic game settings.
Submitted 29 June, 2023; v1 submitted 9 September, 2022;
originally announced September 2022.
-
Comments on "Time-Varying Lyapunov Functions for Tracking Control of Mechanical Systems With and Without Frictions"
Authors:
Lekan Molu
Abstract:
In the article$^a$, the authors introduced a time-varying Lyapunov function for the stability analysis of nonlinear systems whose motion is governed by standard Newton-Euler equations. The authors established asymptotic stability with the choice of two symmetric positive definite matrices restricted by certain eigenvalue bounds in the control law. Exponential stability in the sense of Lyapunov using integrator backstepping and Lyapunov redesign is established in this note using just one matrix in the derived controller. We do not impose minimum eigenvalue bound requirements on the symmetric positive definite matrix introduced in our analysis to guarantee stability. Reducing the parameters needed in the control law, our analysis improves the stability and convergence rates of tracking errors reported in the article$^a$.
$^a$Ren, W., Zhang, B., Li, H., and Yan, L. IEEE Access, vol. 8, pp. 51510-51517, 2020.
Submitted 9 September, 2022; v1 submitted 30 August, 2022;
originally announced August 2022.
-
Guaranteed Discovery of Control-Endogenous Latent States with Multi-Step Inverse Models
Authors:
Alex Lamb,
Riashat Islam,
Yonathan Efroni,
Aniket Didolkar,
Dipendra Misra,
Dylan Foster,
Lekan Molu,
Rajan Chari,
Akshay Krishnamurthy,
John Langford
Abstract:
In many sequential decision-making tasks, the agent is not able to model the full complexity of the world, which consists of multitudes of relevant and irrelevant information. For example, a person walking along a city street who tries to model all aspects of the world would quickly be overwhelmed by a multitude of shops, cars, and people moving in and out of view, each following their own complex and inscrutable dynamics. Is it possible to turn the agent's firehose of sensory information into a minimal latent state that is both necessary and sufficient for an agent to successfully act in the world? We formulate this question concretely, and propose the Agent Control-Endogenous State Discovery algorithm (AC-State), which has theoretical guarantees and is practically demonstrated to discover the minimal control-endogenous latent state which contains all of the information necessary for controlling the agent, while fully discarding all irrelevant information. This algorithm consists of a multi-step inverse model (predicting actions from distant observations) with an information bottleneck. AC-State enables localization, exploration, and navigation without reward or demonstrations. We demonstrate the discovery of the control-endogenous latent state in three domains: localizing a robot arm with distractions (e.g., changing lighting conditions and background), exploring a maze alongside other agents, and navigating in the Matterport house simulator.
Submitted 27 December, 2022; v1 submitted 17 July, 2022;
originally announced July 2022.
-
Interaction-Grounded Learning with Action-inclusive Feedback
Authors:
Tengyang Xie,
Akanksha Saran,
Dylan J. Foster,
Lekan Molu,
Ida Momennejad,
Nan Jiang,
Paul Mineiro,
John Langford
Abstract:
Consider the problem setting of Interaction-Grounded Learning (IGL), in which a learner's goal is to optimally interact with the environment with no explicit reward to ground its policies. The agent observes a context vector, takes an action, and receives a feedback vector, using this information to effectively optimize a policy with respect to a latent reward function. Prior analyzed approaches fail when the feedback vector contains the action, which significantly limits IGL's success in many potential scenarios such as brain-computer interface (BCI) or human-computer interface (HCI) applications. We address this by creating an algorithm and analysis that allow IGL to work even when the feedback vector contains the action, encoded in any fashion. We provide theoretical guarantees and large-scale experiments based on supervised datasets to demonstrate the effectiveness of the new approach.
Submitted 12 October, 2022; v1 submitted 16 June, 2022;
originally announced June 2022.
-
A Second-Order Reachable Sets Computation Scheme via a Cauchy-Type Variational Hamilton-Jacobi-Isaacs Equation
Authors:
Lekan Molu,
Ian Abraham,
Sylvia Herbert
Abstract:
Motivated by the scalability limitations of Eulerian methods for variational Hamilton-Jacobi-Isaacs (HJI) formulations that provide a least restrictive controller in problems that involve state or input constraints under a worst-possible disturbance, we introduce a second-order, successive sweep algorithm for computing the zero sublevel sets of a popular reachability value functional. Under sufficient HJI partial differential equation regularity and continuity assumptions throughout the state space, we show that with state feedback control under the worst-possible disturbance, we can compute the set of states that are reachable within a prescribed verification time bound.
Submitted 22 June, 2022; v1 submitted 8 March, 2022;
originally announced March 2022.