Search | arXiv e-print repository

Exponentially Stable Projector-based Control of Lagrangian Systems with Gaussian Processes

Authors: Giulio Evangelisti, Cosimo Della Santina, Sandra Hirche

Abstract: Designing accurate yet robust tracking controllers with tight performance guarantees for Lagrangian systems is challenging due to nonlinear modeling uncertainties and conservative stability criteria. This article proposes a structure-preserving projector-based tracking control law for uncertain Euler-Lagrange (EL) systems using physically consistent Lagrangian-Gaussian Processes (L-GPs). We leverage the uncertainty quantification of the L-GP for adaptive feedforward-feedback balancing. In particular, an accurate probabilistic guarantee for exponential stability is derived by leveraging matrix analysis results and contraction theory, where the benefit of the proposed controller is proven and shown in the closed-form expressions for convergence rate and radius. Extensive numerical simulations not only demonstrate the controller's efficacy based on a two-link and a soft robotic manipulator but also all theoretical results are explicitly analyzed and validated.

Submitted 5 June, 2024; originally announced June 2024.

Comments: author-submitted electronic preprint version

arXiv:2406.02272 [pdf, other]

Computation-Aware Learning for Stable Control with Gaussian Process

Authors: Wenhan Cao, Alexandre Capone, Rishabh Yadav, Sandra Hirche, Wei Pan

Abstract: In Gaussian Process (GP) dynamical model learning for robot control, particularly for systems constrained by computational resources like small quadrotors equipped with low-end processors, analyzing stability and designing a stable controller present significant challenges. This paper distinguishes between two types of uncertainty within the posteriors of GP dynamical models: the well-documented mathematical uncertainty stemming from limited data and computational uncertainty arising from constrained computational capabilities, which has been largely overlooked in prior research. Our work demonstrates that computational uncertainty, quantified through a probabilistic approximation of the inverse covariance matrix in GP dynamical models, is essential for stable control under computational constraints. We show that incorporating computational uncertainty can prevent overestimating the region of attraction, a safe subset of the state space with asymptotic stability, thus improving system safety. Building on these insights, we propose an innovative controller design methodology that integrates computational uncertainty within a second-order cone programming framework. Simulations of canonical stable control tasks and experiments of quadrotor tracking exhibit the effectiveness of our method under computational constraints.

Submitted 4 June, 2024; originally announced June 2024.

arXiv:2406.00540 [pdf, ps, other]

Optimal Transmission Power Scheduling for Networked Control System under DoS Attack

Authors: Siyi Wang, Yulong Gao, Sandra Hirche

Abstract: Designing networked control systems that are reliable and resilient against adversarial threats is essential for ensuring the security of cyber-physical systems. This paper addresses the communication-control co-design problem for networked control systems under denial-of-service (DoS) attacks. In the wireless channel, a transmission power scheduler periodically determines the power level for sensory data transmission. Yet DoS attacks render data packets unavailable by disrupting the communication channel. This paper co-designs the control and power scheduling laws in the presence of DoS attacks and aims to minimize the sum of regulation control performance and transmission power consumption. Both finite- and infinite-horizon discounted cost criteria are addressed. By delving into the information structure between the controller and the power scheduler under attack, the original co-design problem is divided into two subproblems that can be solved individually without compromising optimality. The optimal control is shown to be certainty equivalent, and the optimal transmission power scheduling is solved using a dynamic programming approach. Moreover, in the infinite-horizon scenario, we analyze the performance of the designed scheduling policy and develop an upper bound of the total costs. Finally, a numerical example is provided to demonstrate the theoretical results.

Submitted 1 June, 2024; originally announced June 2024.

arXiv:2405.17199 [pdf, other]

Physically Consistent Modeling & Identification of Nonlinear Friction with Dissipative Gaussian Processes

Authors: Rui Dai, Giulio Evangelisti, Sandra Hirche

Abstract: Friction modeling has always been a challenging problem due to the complexity of real physical systems. Although a few state-of-the-art structured data-driven methods show their efficiency in nonlinear system modeling, deterministic passivity as one of the significant characteristics of friction is rarely considered in these methods. To address this issue, we propose a Gaussian Process based model that preserves the inherent structural properties such as passivity. A matrix-vector physical structure is considered in our approaches to ensure physical consistency, in particular, enabling a guarantee of positive semi-definiteness of the damping matrix. An aircraft benchmark simulation is employed to demonstrate the efficacy of our methodology. Estimation accuracy and data efficiency are increased substantially by considering and enforcing more structured physical knowledge. Also, the fulfillment of the dissipative nature of the aerodynamics is validated numerically.

Submitted 27 May, 2024; originally announced May 2024.

Comments: accepted by L4DC 2024

arXiv:2405.08756 [pdf, other]

Stable Inverse Reinforcement Learning: Policies from Control Lyapunov Landscapes

Authors: Samuel Tesfazgi, Leonhard Sprandl, Armin Lederer, Sandra Hirche

Abstract: Learning from expert demonstrations to flexibly program an autonomous system with complex behaviors or to predict an agent's behavior is a powerful tool, especially in collaborative control settings. A common method to solve this problem is inverse reinforcement learning (IRL), where the observed agent, e.g., a human demonstrator, is assumed to behave according to the optimization of an intrinsic cost function that reflects its intent and informs its control actions. While the framework is expressive, it is also computationally demanding and generally lacks convergence guarantees. We therefore propose a novel, stability-certified IRL approach by reformulating the cost function inference problem to learning control Lyapunov functions (CLF) from demonstration data. By additionally exploiting closed-form expressions for associated control policies, we are able to efficiently search the space of CLFs by observing the attractor landscape of the induced dynamics. For the construction of the inverse optimal CLFs, we use a Sum of Squares and formulate a convex optimization problem. We present a theoretical analysis of the optimality properties provided by the CLF and evaluate our approach using both simulated and real-world data.

Submitted 14 May, 2024; originally announced May 2024.

arXiv:2405.08711 [pdf, other]

Data-driven Force Observer for Human-Robot Interaction with Series Elastic Actuators using Gaussian Processes

Authors: Samuel Tesfazgi, Markus Keßler, Emilio Trigili, Armin Lederer, Sandra Hirche

Abstract: Ensuring safety and adapting to the user's behavior are of paramount importance in physical human-robot interaction. Thus, incorporating elastic actuators in the robot's mechanical design has become popular, since it offers intrinsic compliance and additionally provides a coarse estimate for the interaction force by measuring the deformation of the elastic components. While observer-based methods have been shown to improve these estimates, they rely on accurate models of the system, which are challenging to obtain in complex operating environments. In this work, we overcome this issue by learning the unknown dynamics components using Gaussian process (GP) regression. By employing the learned model in a Bayesian filtering framework, we improve the estimation accuracy and additionally obtain an observer that explicitly considers local model uncertainty in the confidence measure of the state estimate. Furthermore, we derive guaranteed estimation error bounds, thus facilitating the use in safety-critical applications. We demonstrate the effectiveness of the proposed approach experimentally in a human-exoskeleton interaction scenario.

Submitted 14 May, 2024; originally announced May 2024.

arXiv:2405.07312 [pdf, other]

Nonparametric Control-Koopman Operator Learning: Flexible and Scalable Models for Prediction and Control

Authors: Petar Bevanda, Bas Driessen, Lucian Cristian Iacob, Roland Toth, Stefan Sosnowski, Sandra Hirche

Abstract: Linearity of Koopman operators and simplicity of their estimators coupled with model-reduction capabilities has led to their great popularity in applications for learning dynamical systems. While nonparametric Koopman operator learning in infinite-dimensional reproducing kernel Hilbert spaces is well understood for autonomous systems, its control system analogues are largely unexplored. Addressing systems with control inputs in a principled manner is crucial for fully data-driven learning of controllers, especially since existing approaches commonly resort to representational heuristics or parametric models of limited expressiveness and scalability. We address the aforementioned challenge by proposing a universal framework via control-affine reproducing kernels that enables direct estimation of a single operator even for control systems. The proposed approach, called control-Koopman operator regression (cKOR), is thus completely analogous to Koopman operator regression of the autonomous case. First in the literature, we present a nonparametric framework for learning Koopman operator representations of nonlinear control-affine systems that does not suffer from the curse of control input dimensionality. This allows for reformulating the infinite-dimensional learning problem in a finite-dimensional space based solely on data without a priori loss of precision due to a restriction to a finite span of functions or inputs as in other approaches. For enabling applications to large-scale control systems, we also enhance the scalability of control-Koopman operator estimators by leveraging random projections (sketching). The efficacy of our novel cKOR approach is demonstrated on both forecasting and control tasks.

Submitted 12 May, 2024; originally announced May 2024.

arXiv:2404.02988 [pdf, other]

Risk-averse Learning with Non-Stationary Distributions

Authors: Siyi Wang, Zifan Wang, Xinlei Yi, Michael M. Zavlanos, Karl H. Johansson, Sandra Hirche

Abstract: Considering non-stationary environments in online optimization enables a decision-maker to effectively adapt to changes and improve its performance over time. In such cases, it is favorable to adopt a strategy that minimizes the negative impact of change to avoid potentially risky situations. In this paper, we investigate risk-averse online optimization where the distribution of the random cost changes over time. We minimize a risk-averse objective function using the Conditional Value at Risk (CVaR) as the risk measure. Due to the difficulty in obtaining the exact CVaR gradient, we employ a zeroth-order optimization approach that queries the cost function values multiple times at each iteration and estimates the CVaR gradient using the sampled values. To facilitate the regret analysis, we use a variation metric based on Wasserstein distance to capture time-varying distributions. Given that the distribution variation is sub-linear in the total number of episodes, we show that our designed learning algorithm achieves sub-linear dynamic regret with high probability for both convex and strongly convex functions. Moreover, theoretical results suggest that increasing the number of samples leads to a reduction in the dynamic regret bounds until the sampling number reaches a specific limit. Finally, we provide numerical experiments of dynamic pricing in a parking lot to illustrate the efficacy of the designed algorithm.

Submitted 3 April, 2024; originally announced April 2024.
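
Illustrative note on the entry above: the abstract relies on estimating CVaR from sampled cost values. The sketch below shows only the standard empirical CVaR estimate (the mean of the worst (1 - alpha) fraction of samples), not the paper's zeroth-order gradient estimator. The function name empirical_cvar and its parameters are hypothetical and do not come from the paper.

```python
import numpy as np

def empirical_cvar(costs, alpha):
    """Empirical CVaR at level alpha: mean of the worst (1 - alpha) share of costs.

    Hypothetical helper for illustration only; higher cost is treated as worse.
    """
    sorted_costs = np.sort(np.asarray(costs, dtype=float))
    # Number of tail samples to average; always keep at least one sample.
    tail_size = max(1, int(np.ceil((1.0 - alpha) * sorted_costs.size)))
    return sorted_costs[-tail_size:].mean()

if __name__ == "__main__":
    samples = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    # With alpha = 0.5 the worst half is {6, ..., 10}.
    print(empirical_cvar(samples, 0.5))
```

With alpha = 0 the estimate reduces to the plain sample mean, which is why CVaR is often described as interpolating between the expected cost and the worst case.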

arXiv:2403.10735 [pdf, other]

Time-Robust Path Planning with Piece-Wise Linear Trajectory for Signal Temporal Logic Specifications

Authors: Nhan-Khanh Le, Erfaun Noorani, Sandra Hirche, John Baras

Abstract: Real-world scenarios are characterized by timing uncertainties, e.g., delays, and disturbances. Algorithms with temporal robustness are crucial in guaranteeing the successful execution of tasks and missions in such scenarios. We study time-robust path planning for synthesizing robots' trajectories that adhere to spatial-temporal specifications expressed in Signal Temporal Logic (STL). In contrast to prior approaches that rely on discretized trajectories with fixed time steps, we leverage Piece-Wise Linear (PWL) signals for the synthesis. PWL signals represent a trajectory through a sequence of time-stamped waypoints. This allows us to encode the STL formula into a Mixed-Integer Linear Program (MILP) with fewer variables. This reduction is more pronounced for specifications with a long planning horizon. To that end, we define time-robustness for PWL signals. Subsequently, we propose quantitative semantics for PWL signals according to the recursive syntax of STL and prove their soundness. We then propose an encoding strategy to transform our semantics into a MILP. Our simulations showcase the soundness and the performance of our algorithm.

Submitted 15 March, 2024; originally announced March 2024.

arXiv:2403.08054 [pdf, other]

Learning-based Prescribed-Time Safety for Control of Unknown Systems with Control Barrier Functions

Authors: Tzu-Yuan Huang, Sihua Zhang, Xiaobing Dai, Alexandre Capone, Velimir Todorovski, Stefan Sosnowski, Sandra Hirche

Abstract: In many control system applications, state constraint satisfaction needs to be guaranteed within a prescribed time. While this issue has been partially addressed for systems with known dynamics, it remains largely unaddressed for systems with unknown dynamics. In this paper, we propose a Gaussian process-based time-varying control method that leverages backstepping and control barrier functions to achieve safety requirements within prescribed time windows for control affine systems. It can be used to keep a system within a safe region or to make it return to a safe region within a limited time window. These properties are cemented by rigorous theoretical results. The effectiveness of the proposed controller is demonstrated in a simulation of a robotic manipulator.

Submitted 13 June, 2024; v1 submitted 12 March, 2024; originally announced March 2024.

arXiv:2402.09575 [pdf, other]

Analyzing the Impact of Computation in Adaptive Dynamic Programming for Stochastic LQR Problem

Authors: Wenhan Cao, Alexandre Capone, Sandra Hirche, Wei Pan

Abstract: Adaptive dynamic programming (ADP) for stochastic linear quadratic regulation (LQR) demands the precise computation of stochastic integrals during policy iteration (PI). In a fully model-free problem setting, this computation can only be approximated by state samples collected at discrete time points using computational methods such as the canonical Euler-Maruyama method. Our research reveals a critical phenomenon: the sampling period can significantly impact control performance. This impact is due to the fact that computational errors introduced in each step of PI can significantly affect the algorithm's convergence behavior, which in turn influences the resulting control policy. We draw a parallel between PI and Newton's method applied to the Riccati equation to elucidate how the computation impacts control. In this light, the computational error in each PI step manifests itself as an extra error term in each step of Newton's method, with its upper bound proportional to the computational error. Furthermore, we demonstrate that the convergence rate for ADP in stochastic LQR problems using the Euler-Maruyama method is O(h), with h being the sampling period. A sensorimotor control task finally validates these theoretical findings.

Submitted 14 February, 2024; originally announced February 2024.
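
Illustrative note on the entry above: the abstract refers to the Euler-Maruyama method with sampling period h. The sketch below shows the generic scheme on an assumed scalar linear SDE dx = a x dt + sigma dW; it is not the paper's ADP algorithm. The function name euler_maruyama and the parameters a, sigma, x0 are hypothetical choices for illustration.

```python
import numpy as np

def euler_maruyama(a, sigma, x0, h, n_steps, rng):
    """Simulate dx = a*x dt + sigma dW with step size h (illustrative sketch)."""
    x = np.empty(n_steps + 1)
    x[0] = x0
    for k in range(n_steps):
        # Brownian increment over one step: zero mean, variance h.
        dw = rng.normal(0.0, np.sqrt(h))
        x[k + 1] = x[k] + a * x[k] * h + sigma * dw
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    path = euler_maruyama(a=-1.0, sigma=0.2, x0=1.0, h=0.01, n_steps=500, rng=rng)
    print(path[-1])
```

Setting sigma to zero recovers the explicit Euler method for the deterministic part, which gives a simple way to sanity-check an implementation before adding noise.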

arXiv:2402.08819 [pdf, ps, other]

Infinite-horizon optimal scheduling for feedback control

Authors: Siyi Wang, Sandra Hirche

Abstract: Emerging cyber-physical systems impel the development of communication protocols to efficiently utilize resources. This paper investigates the optimal co-design of control and scheduling in networked control systems. The objective is to co-design the control law and the scheduling mechanism that jointly optimize the tradeoff between regulation performance and communication resource consumption in the long run. The concept of the value of information (VoI) is employed to evaluate the importance of data being transmitted. The optimal solution includes a certainty equivalent control law and a stationary scheduling policy based on the VoI function. The closed-loop system under the designed scheduling policy is shown to be stochastically stable. By analyzing the property of the VoI function, we show that the optimal scheduling policy is symmetric and is a monotone function when the system matrix is diagonal. Moreover, by the diagonal system matrix assumption, the optimal scheduling policy is shown to be of threshold type. Then we provide a simplified yet equivalent form of the threshold-based optimal scheduling policy. The threshold value searching region is also given. Finally, the numerical simulation illustrates the theoretical result of the VoI-based scheduling.

Submitted 3 April, 2024; v1 submitted 13 February, 2024; originally announced February 2024.

arXiv:2402.03174 [pdf, ps, other]

Decentralized Event-Triggered Online Learning for Safe Consensus of Multi-Agent Systems with Gaussian Process Regression

Authors: Xiaobing Dai, Zewen Yang, Mengtian Xu, Fangzhou Liu, Georges Hattab, Sandra Hirche

Abstract: Consensus control in multi-agent systems has received significant attention and practical implementation across various domains. However, managing consensus control under unknown dynamics remains a significant challenge for control design due to system uncertainties and environmental disturbances. This paper presents a novel learning-based distributed control law, augmented by auxiliary dynamics. Gaussian processes are harnessed to compensate for the unknown components of the multi-agent system. For continuous enhancement in predictive performance of the Gaussian process model, a data-efficient online learning strategy with a decentralized event-triggered mechanism is proposed. Furthermore, the control performance of the proposed approach is ensured via the Lyapunov theory, based on a probabilistic guarantee for prediction error bounds. To demonstrate the efficacy of the proposed learning-based controller, a comparative analysis is conducted, contrasting it with both conventional distributed control laws and offline learning methodologies.

Submitted 5 February, 2024; originally announced February 2024.

arXiv:2402.03048 [pdf, other]

Cooperative Learning with Gaussian Processes for Euler-Lagrange Systems Tracking Control under Switching Topologies

Authors: Zewen Yang, Songbo Dong, Armin Lederer, Xiaobing Dai, Siyu Chen, Stefan Sosnowski, Georges Hattab, Sandra Hirche

Abstract: This work presents an innovative learning-based approach to tackle the tracking control problem of Euler-Lagrange multi-agent systems with partially unknown dynamics operating under switching communication topologies. The approach leverages a correlation-aware cooperative algorithm framework built upon Gaussian process regression, which adeptly captures inter-agent correlations for uncertainty predictions. A standout feature is its exceptional efficiency in deriving the aggregation weights achieved by circumventing the computationally intensive posterior variance calculations. Through Lyapunov stability analysis, the distributed control law ensures bounded tracking errors with high probability. Simulation experiments validate the protocol's efficacy in effectively managing complex scenarios, establishing it as a promising solution for robust tracking control in multi-agent systems characterized by uncertain dynamics and dynamic communication structures.

Submitted 5 February, 2024; originally announced February 2024.

Comments: 8 pages

arXiv:2401.15824 [pdf, other]

Innovation-triggered Learning for Data-driven Predictive Control: Deterministic and Stochastic Formulations

Authors: Kaikai Zheng, Dawei Shi, Sandra Hirche, Yang Shi

Abstract: Data-driven control has attracted lots of attention in recent years, especially for plants that are difficult to model based on first principles. In particular, a key issue in data-driven approaches is how to make efficient use of data as the abundance of data becomes overwhelming. To address this issue, this work proposes an innovation-triggered learning framework and a corresponding data-driven controller design approach with guaranteed stability. Specifically, we consider a linear time-invariant system with unknown dynamics subject to deterministic/stochastic disturbances, respectively. Two kinds of data selection mechanisms are proposed by online evaluating the innovation contained in the sampled data, wherein the innovation is quantified by its effect of shrinking the set of potential system dynamics that are compatible with the sampled data. Next, after introducing a stability criterion using the set-valued estimation of system dynamics, a robust data-driven predictive controller is designed by minimizing a worst-case cost function. The closed-loop stability of the data-driven predictive controller equipped with the innovation-triggered learning protocol is proved with a high probability framework. Finally, numerical experiments are performed to verify the validity of the proposed approaches, and the characteristics and the selection principle of the learning hyper-parameter are also discussed.

Submitted 28 January, 2024; originally announced January 2024.

arXiv:2311.11337 [pdf, other]

H2 suboptimal containment control of homogeneous and heterogeneous multi-agent systems

Authors: Yuan Gao, Junjie Jiao, Zhongkui Li, Sandra Hirche

Abstract: This paper deals with the H2 suboptimal state containment control problem for homogeneous linear multi-agent systems and the H2 suboptimal output containment control problem for heterogeneous linear multi-agent systems. For both problems, given multiple autonomous leaders and a number of followers, we introduce suitable performance outputs and an associated H2 cost functional, respectively. The aim is to design a distributed protocol by dynamic output feedback that achieves state/output containment control while the associated H2 cost is smaller than an a priori given upper bound. To this end, we first show that the H2 suboptimal state/output containment control problem can be equivalently transformed into H2 suboptimal control problems for a set of independent systems. Based on this, design methods are then provided to compute such distributed dynamic output feedback protocols. Simulation examples are provided to illustrate the performance of our proposed protocols.

Submitted 19 November, 2023; originally announced November 2023.

Comments: 15 pages, 7 figures

arXiv:2311.02133 [pdf, other]

Safe Online Dynamics Learning with Initially Unknown Models and Infeasible Safety Certificates

Authors: Alexandre Capone, Ryan Cosner, Aaron Ames, Sandra Hirche

Abstract: Safety-critical control tasks with high levels of uncertainty are becoming increasingly common. Typically, techniques that guarantee safety during learning and control utilize constraint-based safety certificates, which can be leveraged to compute safe control inputs. However, excessive model uncertainty can render robust safety certification methods infeasible, meaning no control input satisfies the constraints imposed by the safety certificate. This paper considers a learning-based setting with a robust safety certificate based on a control barrier function (CBF) second-order cone program. If the control barrier function certificate is feasible, our approach leverages it to guarantee safety. Otherwise, our method explores the system dynamics to collect data and recover the feasibility of the control barrier function constraint. To this end, we employ a method inspired by well-established tools from Bayesian optimization. We show that if the sampling frequency is high enough, we recover the feasibility of the robust CBF certificate, guaranteeing safety. Our approach requires no prior model and corresponds, to the best of our knowledge, to the first algorithm that guarantees safety in settings with occasionally infeasible safety certificates without requiring a backup non-learning-based controller.

Submitted 3 November, 2023; originally announced November 2023.

arXiv:2310.02942 [pdf, other]

Online Constraint Tightening in Stochastic Model Predictive Control: A Regression Approach

Authors: Alexandre Capone, Tim Brüdigam, Sandra Hirche

Abstract: Solving chance-constrained stochastic optimal control problems is a significant challenge in control. This is because no analytical solutions exist except for a handful of special cases. A common and computationally efficient approach for tackling chance-constrained stochastic optimal control problems consists of reformulating the chance constraints as hard constraints with a constraint-tightening parameter. However, in such approaches, the choice of constraint-tightening parameter remains challenging, and guarantees can mostly be obtained assuming that the process noise distribution is known a priori. Moreover, the chance constraints are often not tightly satisfied, leading to unnecessarily high costs. This work proposes a data-driven approach for learning the constraint-tightening parameters online during control. To this end, we reformulate the choice of constraint-tightening parameter for the closed-loop as a binary regression problem. We then leverage a highly expressive Gaussian process (GP) model for binary regression to approximate the smallest constraint-tightening parameters that satisfy the chance constraints. By tuning the algorithm parameters appropriately, we show that the resulting constraint-tightening parameters satisfy the chance constraints up to an arbitrarily small margin with high probability. Our approach yields constraint-tightening parameters that tightly satisfy the chance constraints in numerical experiments, resulting in a lower average cost than three other state-of-the-art approaches.

Submitted 4 October, 2023; originally announced October 2023.

Comments: Submitted to Transactions on Automatic Control

arXiv:2310.01538 [pdf, ps, other]

Risk-Sensitive Inhibitory Control for Safe Reinforcement Learning

Authors: Armin Lederer, Erfaun Noorani, John S. Baras, Sandra Hirche

Abstract: Humans have the ability to deviate from their natural behavior when necessary, which is a cognitive process called response inhibition. Similar approaches have independently received increasing attention in recent years for ensuring the safety of control. Realized using control barrier functions or predictive safety filters, these approaches can effectively ensure the satisfaction of state constraints through an online adaptation of nominal control laws, e.g., obtained through reinforcement learning. While the focus of these realizations of inhibitory control has been on risk-neutral formulations, human studies have shown a tight link between response inhibition and risk attitude. Inspired by this insight, we propose a flexible, risk-sensitive method for inhibitory control. Our method is based on a risk-aware condition for value functions, which guarantees the satisfaction of state constraints. We propose a method for learning these value functions using common techniques from reinforcement learning and derive sufficient conditions for its success. By enforcing the derived safety conditions online using the learned value function, risk-sensitive inhibitory control is effectively achieved. The effectiveness of the developed control scheme is demonstrated in simulations.

Submitted 2 October, 2023; originally announced October 2023.

Comments: The 62nd IEEE Conference on Decision and Control, Dec. 13-15, 2023, Singapore

arXiv:2307.04415 [pdf, other]

Episodic Gaussian Process-Based Learning Control with Vanishing Tracking Errors

Authors: Armin Lederer, Jonas Umlauft, Sandra Hirche

Abstract: Due to the increasing complexity of technical systems, accurate first principle models can often not be obtained. Supervised machine learning can mitigate this issue by inferring models from measurement data. Gaussian process regression is particularly well suited for this purpose due to its high data-efficiency and its explicit uncertainty representation, which allows the derivation of prediction error bounds. These error bounds have been exploited to show tracking accuracy guarantees for a variety of control approaches, but their direct dependency on the training data is generally unclear. We address this issue by deriving a Bayesian prediction error bound for GP regression, which we show to decay with the growth of a novel, kernel-based measure of data density. Based on the prediction error bound, we prove time-varying tracking accuracy guarantees for learned GP models used as feedback compensation of unknown nonlinearities, and show that the tracking error vanishes with increasing data density. This enables us to develop an episodic approach for learning Gaussian process models, such that an arbitrary tracking accuracy can be guaranteed. The effectiveness of the derived theory is demonstrated in several simulations.

Submitted 10 July, 2023; originally announced July 2023.

arXiv:2305.16215 [pdf, other]

Koopman Kernel Regression

Authors: Petar Bevanda, Max Beier, Armin Lederer, Stefan Sosnowski, Eyke Hüllermeier, Sandra Hirche

Abstract: Many machine learning approaches for decision making, such as reinforcement learning, rely on simulators or predictive models to forecast the time-evolution of quantities of interest, e.g., the state of an agent or the reward of a policy. Forecasts of such complex phenomena are commonly described by highly nonlinear dynamical systems, making their use in optimization-based decision-making challenging. Koopman operator theory offers a beneficial paradigm for addressing this problem by characterizing forecasts via linear time-invariant (LTI) ODEs, turning multi-step forecasts into sparse matrix multiplication. Though there exists a variety of learning approaches, they usually lack crucial learning-theoretic guarantees, making the behavior of the obtained models with increasing data and dimensionality unclear. We address this shortcoming by deriving a universal Koopman-invariant reproducing kernel Hilbert space (RKHS) that solely spans transformations into LTI dynamical systems. The resulting Koopman Kernel Regression (KKR) framework enables the use of statistical learning tools from function approximation for novel convergence results and generalization error bounds under weaker assumptions than existing work. Our experiments demonstrate superior forecasting performance compared to Koopman operator and sequential data predictors in RKHS.

Submitted 16 January, 2024; v1 submitted 25 May, 2023; originally announced May 2023.

Comments: Accepted to the thirty-seventh Conference on Neural Information Processing Systems (NeurIPS 2023)

arXiv:2305.08169 [pdf, ps, other]

Can Learning Deteriorate Control? Analyzing Computational Delays in Gaussian Process-Based Event-Triggered Online Learning

Authors: Xiaobing Dai, Armin Lederer, Zewen Yang, Sandra Hirche

Abstract: When the dynamics of systems are unknown, supervised machine learning techniques are commonly employed to infer models from data. Gaussian process (GP) regression is a particularly popular learning method for this purpose due to the existence of prediction error bounds. Moreover, GP models can be efficiently updated online, such that event-triggered online learning strategies can be pursued to ensure specified tracking accuracies. However, existing trigger conditions must be able to be evaluated at arbitrary times, which cannot be achieved in practice due to non-negligible computation times. Therefore, we first derive a delay-aware tracking error bound, which reveals an accuracy-delay trade-off. Based on this result, we propose a novel event trigger for GP-based online learning with computational delays, which we show to offer advantages over offline trained GP models for sufficiently small computation times. Finally, we demonstrate the effectiveness of the proposed event trigger for online learning in simulations.

Submitted 14 May, 2023; originally announced May 2023.

arXiv:2304.05723 [pdf, ps, other]

Distributed Coverage Control of Constrained Constant-Speed Unicycle Multi-Agent Systems

Authors: Qingchen Liu, Zengjie Zhang, Nhan Khanh Le, Jiahu Qin, Fangzhou Liu, Sandra Hirche

Abstract: This paper proposes a novel distributed coverage controller for a multi-agent system with constant-speed unicycle robots (CSUR). The work is motivated by the limitation of the conventional method that does not ensure the satisfaction of hard state- and input-dependent constraints and leads to feasibility issues for multi-CSUR systems. In this paper, we solve these problems by designing a novel coverage cost function and a saturated gradient-search-based control law. Invariant set theory and Lyapunov-based techniques are used to prove the state-dependent confinement and the convergence of the system state to the optimal coverage configuration, respectively. The controller is implemented in a distributed manner based on a novel communication standard among the agents. A series of simulation case studies are conducted to validate the effectiveness of the proposed coverage controller under different initial conditions and control parameters. A comparison study in simulation reveals the advantage of the proposed method in terms of avoiding infeasibility. The experimental study verifies the applicability of the method to real robots with uncertainties. The development procedure of the method from theoretical analysis to experimental validation provides a novel framework for multi-agent system coordinate control with complex agent dynamics.

Submitted 14 March, 2024; v1 submitted 12 April, 2023; originally announced April 2023.

arXiv:2304.05138 [pdf, other]

Cooperative Online Learning for Multi-Agent System Control via Gaussian Processes with Event-Triggered Mechanism: Extended Version

Authors: Xiaobing Dai, Zewen Yang, Sandra Hirche

Abstract: In the realm of the cooperative control of multi-agent systems (MASs) with unknown dynamics, Gaussian process (GP) regression is widely used to infer the uncertainties due to its modeling flexibility of nonlinear functions and the existence of a theoretical prediction error bound. Online learning, which involves incorporating newly acquired training data into Gaussian process models, promises to improve control performance by enhancing predictions during the operation. Therefore, this paper investigates the online cooperative learning algorithm for MAS control. Moreover, an event-triggered data selection mechanism, inspired by the analysis of a centralized event-trigger, is introduced to reduce the model update frequency and enhance the data efficiency. With the proposed learning-based control, the practical convergence of the MAS is validated with guaranteed tracking performance via Lyapunov theory. Furthermore, the exclusion of the Zeno behavior for individual agents is shown. Finally, the effectiveness of the proposed event-triggered online learning method is demonstrated in simulations.

Submitted 2 January, 2024; v1 submitted 11 April, 2023; originally announced April 2023.

arXiv:2304.05131 [pdf, other]

Fast IMU-based Dual Estimation of Human Motion and Kinematic Parameters via Progressive In-Network Computing

Authors: Xiaobing Dai, Huanzhuo Wu, Siyi Wang, Junjie Jiao, Giang T. Nguyen, Frank H. P. Fitzek, Sandra Hirche

Abstract: Many applications involve humans in the loop, where continuous and accurate human motion monitoring provides valuable information for safe and intuitive human-machine interaction. Portable devices such as inertial measurement units (IMUs) are applicable to monitor human motions, while in practice often limited computational power is available locally. The human motion in task space coordinates requires not only the human joint motion but also the nonlinear coordinate transformation depending on parameters such as human limb length. In most applications, measuring these kinematic parameters for each individual requires undesirably high effort. Therefore, it is desirable to estimate both the human motion and the kinematic parameters from IMUs. In this work, we propose a novel computational framework for dual estimation in real-time exploiting in-network computational resources. We adopt the concept of field Kalman filtering, where the dual estimation problem is decomposed into a fast state estimation process and a computationally expensive parameter estimation process. In order to further accelerate the convergence, the parameter estimation is progressively computed on multiple networked computational nodes. The superiority of our proposed method is demonstrated by a simulation of a human arm, where the estimation accuracy is shown to converge faster than with conventional approaches.

Submitted 11 April, 2023; originally announced April 2023.

arXiv:2303.17963 [pdf, other]

Learning-Based Optimal Control with Performance Guarantees for Unknown Systems with Latent States

Authors: Robert Lefringhausen, Supitsana Srithasan, Armin Lederer, Sandra Hirche

Abstract: As control engineering methods are applied to increasingly complex systems, data-driven approaches for system identification appear as a promising alternative to physics-based modeling. While the Bayesian approaches prevalent for safety-critical applications usually rely on the availability of state measurements, the states of a complex system are often not directly measurable. It may then be necessary to jointly estimate the dynamics and the latent state, making the quantification of uncertainties and the design of controllers with formal performance guarantees considerably more challenging. This paper proposes a novel method for the computation of an optimal input trajectory for unknown nonlinear systems with latent states based on a combination of particle Markov chain Monte Carlo methods and scenario theory. Probabilistic performance guarantees are derived for the resulting input trajectory, and an approach to validate the performance of arbitrary control laws is presented. The effectiveness of the proposed method is demonstrated in a numerical simulation.

Submitted 16 April, 2024; v1 submitted 31 March, 2023; originally announced March 2023.

Comments: Accepted version submitted to the 22nd European Control Conference

arXiv:2301.05445 [pdf, other]

Average Communication Rate for Event-Triggered Stochastic Control Systems

Authors: Zengjie Zhang, Qingchen Liu, Mohammad H. Mamduhi, Sandra Hirche

Abstract: Quantifying the average communication rate (ACR) of a networked event-triggered stochastic control system (NET-SCS) with deterministic thresholds is challenging due to the non-stationary nature of the system's stochastic processes. For a NET-SCS, the nonlinear statistics propagation of the network communication status brought up by deterministic thresholds makes the precise computation of ACR difficult. Previous work over-simplified the computation using a Gaussian distribution without incorporating this nonlinearity, sacrificing precision. This paper proposes both analytical and numerical approaches to predict the exact ACR for a NET-SCS using a recursive model. We use theoretical analysis and a numerical study to qualitatively evaluate the deviation gap of the conventional approach that ignores the side information. The accuracy of our proposed method, alongside its comparison with the simplified results of the conventional approach, is validated by experimental studies. Our work promises to benefit the efficient resource planning of networked control systems with limited communication resources by providing accurate ACR computation.

Submitted 12 April, 2024; v1 submitted 13 January, 2023; originally announced January 2023.

arXiv:2212.00478 [pdf, ps, other]

Safe Learning-Based Control of Elastic Joint Robots via Control Barrier Functions

Authors: Armin Lederer, Azra Begzadić, Neha Das, Sandra Hirche

Abstract: Ensuring safety is of paramount importance in physical human-robot interaction applications. This requires both adherence to safety constraints defined on the system state, as well as guaranteeing compliant behavior of the robot. If the underlying dynamical system is known exactly, the former can be addressed with the help of control barrier functions. The incorporation of elastic actuators in the robot's mechanical design can address the latter requirement. However, this elasticity can increase the complexity of the resulting system, leading to unmodeled dynamics, such that control barrier functions cannot directly ensure safety. In this paper, we mitigate this issue by learning the unknown dynamics using Gaussian process regression. By employing the model in a feedback linearizing control law, the safety conditions resulting from control barrier functions can be robustified to take into account model errors, while remaining feasible. In order to enforce them on-line, we formulate the derived safety conditions in the form of a second-order cone program. We demonstrate our proposed approach with simulations on a two-degree-of-freedom planar robot with elastic joints.

Submitted 14 April, 2023; v1 submitted 1 December, 2022; originally announced December 2022.

arXiv:2209.06936 [pdf, other]

doi 10.1109/LRA.2023.3322899

Vision-Based Uncertainty-Aware Motion Planning based on Probabilistic Semantic Segmentation

Authors: Ralf Römer, Armin Lederer, Samuel Tesfazgi, Sandra Hirche

Abstract: For safe operation, a robot must be able to avoid collisions in uncertain environments. Existing approaches for motion planning under uncertainties often assume parametric obstacle representations and Gaussian uncertainty, which can be inaccurate. While visual perception can deliver a more accurate representation of the environment, its use for safe motion planning is limited by the inherent miscalibration of neural networks and the challenge of obtaining adequate datasets. To address these limitations, we propose to employ ensembles of deep semantic segmentation networks trained with massively augmented datasets to ensure reliable probabilistic occupancy information. To avoid conservatism during motion planning, we directly employ the probabilistic perception in a scenario-based path planning approach. A velocity scheduling scheme is applied to the path to ensure a safe motion despite tracking inaccuracies. We demonstrate the effectiveness of the massive data augmentation in combination with deep ensembles and the proposed scenario-based planning approach in comparisons to state-of-the-art methods and validate our framework in an experiment with a human hand as an obstacle.

Submitted 1 December, 2023; v1 submitted 14 September, 2022; originally announced September 2022.

Journal ref: IEEE Robotics and Automation Letters, vol. 8, no. 11, pp. 7825-7832, 2023

arXiv:2207.01337 [pdf, other]

Safe Reinforcement Learning via Confidence-Based Filters

Authors: Sebastian Curi, Armin Lederer, Sandra Hirche, Andreas Krause

Abstract: Ensuring safety is a crucial challenge when deploying reinforcement learning (RL) to real-world systems. We develop confidence-based safety filters, a control-theoretic approach for certifying state safety constraints for nominal policies learned via standard RL techniques, based on probabilistic dynamics models. Our approach is based on a reformulation of state constraints in terms of cost functions, reducing safety verification to a standard RL task. By exploiting the concept of hallucinating inputs, we extend this formulation to determine a "backup" policy that is safe for the unknown system with high probability. Finally, the nominal policy is minimally adjusted at every time step during a roll-out towards the backup policy, such that safe recovery can be guaranteed afterwards. We provide formal safety guarantees, and empirically demonstrate the effectiveness of our approach.

Submitted 4 July, 2022; originally announced July 2022.

arXiv:2206.12272 [pdf, ps, other]

doi 10.1109/CDC51059.2022.9993123

Physically Consistent Learning of Conservative Lagrangian Systems with Gaussian Processes

Authors: Giulio Evangelisti, Sandra Hirche

Abstract: This paper proposes a physically consistent Gaussian Process (GP) enabling the identification of uncertain Lagrangian systems. The function space is tailored according to the energy components of the Lagrangian and the differential equation structure, analytically guaranteeing physical and mathematical properties such as energy conservation and quadratic form. The novel formulation of Cholesky decomposed matrix kernels allows the probabilistic preservation of positive definiteness. Only differential input-to-output measurements of the function map are required while Gaussian noise is permitted in torques, velocities, and accelerations. We demonstrate the effectiveness of the approach in numerical simulation.

Submitted 3 February, 2023; v1 submitted 24 June, 2022; originally announced June 2022.

Comments: Accepted version of paper published by IEEE in 2022 IEEE 61st Conference on Decision and Control (CDC). Final published paper can be found at https://doi.org/10.1109/CDC51059.2022.9993123

arXiv:2203.02321 [pdf, ps, other]

Actuator Scheduling for Linear Systems: A Convex Relaxation Approach

Authors: Junjie Jiao, Dipankar Maity, John S. Baras, Sandra Hirche

Abstract: In this letter, we investigate the problem of actuator scheduling for networked control systems. Given a stochastic linear system with a number of actuators, we consider the case that one actuator is activated at each time. This problem is combinatorial in nature and NP-hard to solve. We propose a convex relaxation to the actuator scheduling problem, and use its solution as a reference to design an algorithm for solving the original scheduling problem. Using dynamic programming arguments, we provide a suboptimality bound of our proposed algorithm. Furthermore, we show that our framework can be extended to incorporate multiple actuators scheduling at each time and actuation costs. A simulation example is provided, which shows that our proposed method outperforms a random selection approach and a greedy selection approach.

Submitted 20 May, 2022; v1 submitted 4 March, 2022; originally announced March 2022.

Comments: 8 pages, 4 figures

arXiv:2202.11491 [pdf, other]

Networked Online Learning for Control of Safety-Critical Resource-Constrained Systems based on Gaussian Processes

Authors: Armin Lederer, Mingmin Zhang, Samuel Tesfazgi, Sandra Hirche

Abstract: Safety-critical technical systems operating in unknown environments require the ability to quickly adapt their behavior, which can be achieved in control by inferring a model online from the data stream generated during operation. Gaussian process-based learning is particularly well suited for safety-critical applications as it ensures bounded prediction errors. While there exist computationally efficient approximations for online inference, these approaches lack guarantees for the prediction error and have high memory requirements, and are therefore not applicable to safety-critical systems with tight memory constraints. In this work, we propose a novel networked online learning approach based on Gaussian process regression, which addresses the issue of limited local resources by employing remote data management in the cloud. Our approach formally guarantees a bounded tracking error with high probability, which is exploited to identify the most relevant data to achieve a certain control performance. We further propose an effective data transmission scheme between the local system and the cloud taking bandwidth limitations and time delay of the transmission channel into account. The effectiveness of the proposed method is successfully demonstrated in a simulation.

Submitted 23 February, 2022; originally announced February 2022.

arXiv:2201.11640 [pdf, ps, other]

Towards Data-driven LQR with Koopmanizing Flows

Authors: Petar Bevanda, Max Beier, Shahab Heshmati-Alamdari, Stefan Sosnowski, Sandra Hirche

Abstract: We propose a novel framework for learning linear time-invariant (LTI) models for a class of continuous-time non-autonomous nonlinear dynamics based on a representation of Koopman operators. In general, the operator is infinite-dimensional but, crucially, linear. To utilize it for efficient LTI control design, we learn a finite representation of the Koopman operator that is linear in controls while concurrently learning meaningful lifting coordinates. For the latter, we rely on Koopmanizing Flows - a diffeomorphism-based representation of Koopman operators - and extend it to systems with linear control entry. With such a learned model, we can replace the nonlinear optimal control problem with quadratic cost with that of a linear quadratic regulator (LQR), facilitating efficacious optimal control for nonlinear systems. The superior control performance of the proposed method is demonstrated on simulation examples.

Submitted 23 May, 2022; v1 submitted 27 January, 2022; originally announced January 2022.

Comments: Final version, accepted for presentation at the 6th IFAC Conference on Intelligent Control and Automation Sciences (ICONS), 2022. arXiv admin note: text overlap with arXiv:2112.04085

arXiv:2112.04085 [pdf, other]

Diffeomorphically Learning Stable Koopman Operators

Authors: Petar Bevanda, Max Beier, Sebastian Kerz, Armin Lederer, Stefan Sosnowski, Sandra Hirche

Abstract: System representations inspired by the infinite-dimensional Koopman operator (generator) are increasingly considered for predictive modeling. Due to the operator's linearity, a range of nonlinear systems admit linear predictor representations - allowing for simplified prediction, analysis and control. However, finding meaningful finite-dimensional representations for prediction is difficult as it involves determining features that are both Koopman-invariant (evolve linearly under the dynamics) and relevant (spanning the original state) - a generally unsupervised problem. In this work, we present Koopmanizing Flows - a novel continuous-time framework for supervised learning of linear predictors for a class of nonlinear dynamics. In our model construction, a latent diffeomorphically related linear system unfolds into a linear predictor through the composition with a monomial basis. The lifting, its linear dynamics and state reconstruction are learned simultaneously, while an unconstrained parameterization of Hurwitz matrices ensures asymptotic stability regardless of the operator approximation accuracy. The superior efficacy of Koopmanizing Flows is demonstrated in comparison to a state-of-the-art method on the well-known LASA handwriting benchmark.

Submitted 30 May, 2022; v1 submitted 7 December, 2021; originally announced December 2021.

Comments: Revised version submitted to IEEE Control Systems Letters (L-CSS) with substantially revised exposition, evaluation and proof of Lemma 2 (previously Lemma 8)

arXiv:2111.03617 [pdf, ps, other]

Adaptive Low-Pass Filtering using Sliding Window Gaussian Processes

Authors: Alejandro J. Ordóñez-Conejo, Armin Lederer, Sandra Hirche

Abstract: When signals are measured through physical sensors, they are perturbed by noise. To reduce noise, low-pass filters are commonly employed in order to attenuate high frequency components in the incoming signal, regardless of whether they come from noise or the actual signal. Therefore, low-pass filters must be carefully tuned in order to avoid significant deterioration of the signal. This tuning requires prior knowledge about the signal, which is often not available in applications such as reinforcement learning or learning-based control. In order to overcome this limitation, we propose an adaptive low-pass filter based on Gaussian process regression. By considering a constant window of previous observations, updates and predictions fast enough for real-world filtering applications can be realized. Moreover, the online optimization of hyperparameters leads to an adaptation of the low-pass behavior, such that no prior tuning is necessary. We show that the estimation error of the proposed method is uniformly bounded, and demonstrate the flexibility and efficiency of the approach in several simulations.

Submitted 5 November, 2021; originally announced November 2021.

arXiv:2110.07786 [pdf, other]

Learning the Koopman Eigendecomposition: A Diffeomorphic Approach

Authors: Petar Bevanda, Johannes Kirmayr, Stefan Sosnowski, Sandra Hirche

Abstract: We present a novel data-driven approach for learning linear representations of a class of stable nonlinear systems using Koopman eigenfunctions. By learning the conjugacy map between a nonlinear system and its Jacobian linearization through a Normalizing Flow, one can guarantee the learned function is a diffeomorphism. Using this diffeomorphism, we construct eigenfunctions of the nonlinear system via the spectral equivalence of conjugate systems - allowing the construction of linear predictors for nonlinear systems. The universality of the diffeomorphism learner leads to the universal approximation of the nonlinear system's Koopman eigenfunctions. The developed method is also safe as it guarantees the model is asymptotically stable regardless of the representation accuracy. To the best of our knowledge, this is the first work to close the gap between the operator, system and learning theories. The efficacy of our approach is shown through simulation examples.

Submitted 30 May, 2022; v1 submitted 14 October, 2021; originally announced October 2021.

Comments: Accepted for presentation at the 2022 American Control Conference (ACC)

arXiv:2109.02606 [pdf, other]

Gaussian Process Uniform Error Bounds with Unknown Hyperparameters for Safety-Critical Applications

Authors: Alexandre Capone, Armin Lederer, Sandra Hirche

Abstract: Gaussian processes have become a promising tool for various safety-critical settings, since the posterior variance can be used to directly estimate the model error and quantify risk. However, state-of-the-art techniques for safety-critical settings hinge on the assumption that the kernel hyperparameters are known, which does not apply in general. To mitigate this, we introduce robust Gaussian process uniform error bounds in settings with unknown hyperparameters. Our approach computes a confidence region in the space of hyperparameters, which enables us to obtain a probabilistic upper bound for the model error of a Gaussian process with arbitrary hyperparameters. We do not require knowledge of any bounds for the hyperparameters a priori, which is an assumption commonly found in related work. Instead, we are able to derive bounds from data in an intuitive fashion. We additionally employ the proposed technique to derive performance guarantees for a class of learning-based control problems. Experiments show that the bound performs significantly better than vanilla and fully Bayesian Gaussian processes.

Submitted 20 July, 2022; v1 submitted 6 September, 2021; originally announced September 2021.

arXiv:2107.07822 [pdf, other]

Distributed Value of Information in Feedback Control over Multi-hop Networks

Authors: Precious Ugo Abara, Sandra Hirche

Abstract: Recent works in the domain of networked control systems have demonstrated that the joint design of medium access control strategies and control strategies for the closed-loop system is beneficial. However, several metrics introduced so far fail in either appropriately representing the network requirements or in capturing how valuable the data is. In this paper, we propose a distributed value of information (dVoI) metric for the joint design of control and schedulers for medium access in a multi-loop system and multi-hop network. We start by providing conditions under which the certainty equivalent controller is optimal. Then we reformulate the joint control and communication problem as a Bellman-like equation. The corresponding dynamic programming problem is solved in a distributed fashion by the proposed VoI-based scheduling policies for the multi-loop multi-hop networked control system, which outperform the well-known time-triggered periodic sampling policies. Additionally, we show that the dVoI-based scheduling policies are independent of each other, both loop-wise and hop-wise. Finally, we illustrate the results with a numerical example.

Submitted 16 July, 2021; originally announced July 2021.

Comments: 19 pages, 10 figures

arXiv:2104.04483 [pdf, other]

Inverse Reinforcement Learning: A Control Lyapunov Approach

Authors: Samuel Tesfazgi, Armin Lederer, Sandra Hirche

Abstract: Inferring the intent of an intelligent agent from demonstrations and subsequently predicting its behavior is a critical task in many collaborative settings. A common approach to solve this problem is the framework of inverse reinforcement learning (IRL), where the observed agent, e.g., a human demonstrator, is assumed to behave according to an intrinsic cost function that reflects its intent and informs its control actions. In this work, we reformulate the IRL inference problem to learning control Lyapunov functions (CLF) from demonstrations by exploiting the inverse optimality property, which states that every CLF is also a meaningful value function. Moreover, the derived CLF formulation directly guarantees stability of inferred control policies. We show the flexibility of our proposed method by learning from goal-directed movement demonstrations in a continuous environment.

Submitted 4 October, 2021; v1 submitted 9 April, 2021; originally announced April 2021.

Comments: This work has been accepted for presentation at, and publication in the proceedings of, the 2021 IEEE Conference on Decision and Control (CDC)

arXiv:2104.03355 [pdf, other]

Value of information in networked control systems subject to delay

Authors: Siyi Wang, Qingchen Liu, Precious Ugo Abara, John S. Baras, Sandra Hirche

Abstract: In this paper, we study the trade-off between the transmission cost and the control performance of the multi-loop networked control system subject to network-induced delay. Within the linear-quadratic-Gaussian (LQG) framework, the joint design of control policy and networking strategy is decomposed into separate optimization problems. Based on the trade-off analysis, a scalable, delay-dependent Value-of-Information (VoI) based scheduling policy is constructed to quantify the value of transmitting the data packet, enabling the decision-makers embedded in subsystems to determine the transmission policy. The proposed scalable VoI inherits the task criticality of the previous VoI metric while being sensitive to system parameters such as information freshness and network delays. The VoI-based scheduling policy is proved to outperform the periodic triggering policy and existing Age-of-Information (AoI) based policy for networked control systems under transmission delay. The effectiveness of the constructed VoI with arbitrary network delay is validated through numerical simulations.

Submitted 29 December, 2021; v1 submitted 7 April, 2021; originally announced April 2021.

Comments: accepted CDC2021

arXiv:2104.00130 [pdf, other]

Safe Online Learning-based Formation Control of Multi-Agent Systems with Gaussian Processes

Authors: Thomas Beckers, Sandra Hirche, Leonardo Colombo

Abstract: Formation control algorithms for multi-agent systems have gained much attention in recent years due to the increasing number of mobile and aerial robotic swarms. The design of safe controllers for these vehicles is a substantial aspect for an increasing range of application domains. However, parts of the vehicle's dynamics and external disturbances are often unknown or very time-consuming to model. To overcome this issue, we present a safe formation control law for multi-agent systems based on double integrator dynamics by using Gaussian Processes for online learning of the unknown dynamics. The presented approach guarantees a bounded error to desired formations with high probability, where the bound is explicitly given. A numerical example highlights the effectiveness of the learning-based formation control law.

Submitted 31 March, 2021; originally announced April 2021.

Comments: Preprint submitted to IEEE CDC 2021

arXiv:2103.15929 [pdf, other]

Distributed Learning Consensus Control for Unknown Nonlinear Multi-Agent Systems based on Gaussian Processes

Authors: Zewen Yang, Stefan Sosnowski, Qingchen Liu, Junjie Jiao, Armin Lederer, Sandra Hirche

Abstract: In this paper, a distributed learning leader-follower consensus protocol based on Gaussian process regression for a class of nonlinear multi-agent systems with unknown dynamics is designed. We propose a distributed learning approach to predict the residual dynamics for each agent. The stability of the consensus protocol using the data-driven model of the dynamics is shown via Lyapunov analysis. By applying the proposed control law, the followers ultimately synchronize to the leader with guaranteed error bounds with high probability. The effectiveness and the applicability of the developed protocol are demonstrated by simulation examples.

Submitted 29 March, 2021; originally announced March 2021.

Comments: This paper was submitted to IEEE CDC2021

arXiv:2103.11851 [pdf, ps, other]

Data-driven output synchronization of heterogeneous leader-follower multi-agent systems

Authors: Junjie Jiao, Henk J. van Waarde, Harry L. Trentelman, M. Kanat Camlibel, Sandra Hirche

Abstract: This paper deals with data-driven output synchronization for heterogeneous leader-follower linear multi-agent systems. Given a multi-agent system that consists of one autonomous leader and a number of heterogeneous followers with external disturbances, we provide necessary and sufficient data-based conditions for output synchronization. We also provide a design method for obtaining such output synchronizing protocols directly from data. The results are then extended to the special case that the followers are disturbance-free. Finally, a simulation example is provided to illustrate our results.

Submitted 23 September, 2021; v1 submitted 22 March, 2021; originally announced March 2021.

Comments: 6 pages, 2 figures. This paper has been accepted by IEEE CDC 2021

arXiv:2102.02522 [pdf, other]

doi 10.1016/j.arcontrol.2021.09.002

Koopman Operator Dynamical Models: Learning, Analysis and Control

Authors: Petar Bevanda, Stefan Sosnowski, Sandra Hirche

Abstract: The Koopman operator allows for handling nonlinear systems through a (globally) linear representation. In general, the operator is infinite-dimensional - necessitating finite approximations - for which there is no overarching framework. Although there are principled ways of learning such finite approximations, they are in many instances overlooked in favor of often ill-posed and unstructured methods. Also, Koopman operator theory has long-standing connections to known system-theoretic and dynamical system notions that are not universally recognized. Given the former and latter realities, this work aims to bridge the gap between various concepts regarding both theory and tractable realizations. Firstly, we review data-driven representations (both unstructured and structured) for Koopman operator dynamical models, categorizing various existing methodologies and highlighting their differences. Furthermore, we provide concise insight into the paradigm's relation to system-theoretic notions and analyze the prospect of using the paradigm for modeling control systems. Additionally, we outline the current challenges and comment on future perspectives.

Submitted 22 December, 2021; v1 submitted 4 February, 2021; originally announced February 2021.

Comments: This is an authors' version of the work that is published in Annual Reviews in Control journal. Changes were made to this version by the publisher prior to publication

Journal ref: Annual Reviews in Control - Volume 52, 2021, Pages 197-212

arXiv:2101.05328 [pdf, other]

Uniform Error and Posterior Variance Bounds for Gaussian Process Regression with Application to Safe Control

Authors: Armin Lederer, Jonas Umlauft, Sandra Hirche

Abstract: In application areas where data generation is expensive, Gaussian processes are a preferred supervised learning model due to their high data-efficiency. Particularly in model-based control, Gaussian processes allow the derivation of performance guarantees using probabilistic model error bounds. To make these approaches applicable in practice, two open challenges must be solved: (i) Existing error bounds rely on prior knowledge, which might not be available for many real-world tasks. (ii) The relationship between training data and the posterior variance, which mainly drives the error bound, is not well understood and prevents the asymptotic analysis. This article addresses these issues by presenting a novel uniform error bound using Lipschitz continuity and an analysis of the posterior variance function for a large class of kernels. Additionally, we show how these results can be used to guarantee safe control of an unknown dynamical system and provide illustrative numerical examples.

Submitted 13 January, 2021; originally announced January 2021.

arXiv:2011.10596 [pdf, ps, other]

The Impact of Data on the Stability of Learning-Based Control - Extended Version

Authors: Armin Lederer, Alexandre Capone, Thomas Beckers, Jonas Umlauft, Sandra Hirche

Abstract: Despite the existence of formal guarantees for learning-based control approaches, the relationship between data and control performance is still poorly understood. In this paper, we propose a Lyapunov-based measure for quantifying the impact of data on the certifiable control performance. By modeling unknown system dynamics through Gaussian processes, we can determine the interrelation between model uncertainty and satisfaction of stability conditions. This allows us to directly assess the impact of data on the provable stationary control performance, and thereby the value of the data for the closed-loop system performance. Our approach is applicable to a wide variety of unknown nonlinear systems that are to be controlled by a generic learning-based control law, and the results obtained in numerical simulations indicate the efficacy of the proposed measure.

Submitted 30 July, 2021; v1 submitted 20 November, 2020; originally announced November 2020.

arXiv:2010.02613 [pdf, other]

Deep Learning based Uncertainty Decomposition for Real-time Control

Authors: Neha Das, Jonas Umlauft, Armin Lederer, Thomas Beckers, Sandra Hirche

Abstract: Data-driven control in unknown environments requires a clear understanding of the involved uncertainties for ensuring safety and efficient exploration. While aleatoric uncertainty that arises from measurement noise can often be explicitly modeled given a parametric description, it can be harder to model epistemic uncertainty, which describes the presence or absence of training data. The latter can be particularly useful for implementing exploratory control strategies when system dynamics are unknown. We propose a novel method for detecting the absence of training data using deep learning, which gives a continuous-valued scalar output between $0$ (indicating low uncertainty) and $1$ (indicating high uncertainty). We utilize this detector as a proxy for epistemic uncertainty and show its advantages over existing approaches on synthetic and real-world datasets. Our approach can be directly combined with aleatoric uncertainty estimates and allows for uncertainty estimation in real-time as the inference is sample-free, unlike existing approaches for uncertainty modeling. We further demonstrate the practicality of this uncertainty estimate in deploying online data-efficient control on a simulated quadcopter acted upon by an unknown disturbance model.

Submitted 12 July, 2023; v1 submitted 6 October, 2020; originally announced October 2020.

Comments: Accepted at IFAC World Congress 2023

arXiv:2009.06689 [pdf, other]

Online learning-based trajectory tracking for underactuated vehicles with uncertain dynamics

Authors: Thomas Beckers, Leonardo Colombo, Sandra Hirche, George J. Pappas

Abstract: Underactuated vehicles have gained much attention in recent years due to the increasing number of aerial and underwater vehicles as well as nanosatellites. Trajectory tracking control of these vehicles is a substantial aspect for an increasing range of application domains. However, external disturbances and parts of the internal dynamics are often unknown or very time-consuming to model. To overcome this issue, we present a tracking control law for underactuated rigid-body dynamics using an online learning-based oracle for the prediction of the unknown dynamics. We show that Gaussian process models are of particular interest for the role of the oracle. The presented approach guarantees a bounded tracking error with high probability, where the bound is explicitly given. A numerical example highlights the effectiveness of the proposed control law.

Submitted 14 September, 2021; v1 submitted 14 September, 2020; originally announced September 2020.

arXiv:2007.12377 [pdf, ps, other]

Anticipating the Long-Term Effect of Online Learning in Control

Authors: Alexandre Capone, Sandra Hirche

Abstract: Control schemes that learn using measurement data collected online are increasingly promising for the control of complex and uncertain systems. However, in most approaches of this kind, learning is viewed as a side effect that passively improves control performance, e.g., by updating a model of the system dynamics. Determining how improvements in control performance due to learning can be actively exploited in the control synthesis is still an open research question. In this paper, we present AntLer, a design algorithm for learning-based control laws that anticipates learning, i.e., that takes the impact of future learning in uncertain dynamic settings explicitly into account. AntLer expresses system uncertainty using a non-parametric probabilistic model. Given a cost function that measures control performance, AntLer chooses the control parameters such that the expected cost of the closed-loop system is minimized approximately. We show that AntLer approximates an optimal solution arbitrarily accurately with probability one. Furthermore, we apply AntLer to a nonlinear system, which yields better results compared to the case where learning is not anticipated.

Submitted 24 July, 2020; originally announced July 2020.

Showing 1–50 of 75 results for author: Hirche, S