-
Exponentially Stable Projector-based Control of Lagrangian Systems with Gaussian Processes
Authors:
Giulio Evangelisti,
Cosimo Della Santina,
Sandra Hirche
Abstract:
Designing accurate yet robust tracking controllers with tight performance guarantees for Lagrangian systems is challenging due to nonlinear modeling uncertainties and conservative stability criteria. This article proposes a structure-preserving projector-based tracking control law for uncertain Euler-Lagrange (EL) systems using physically consistent Lagrangian-Gaussian Processes (L-GPs). We leverage the uncertainty quantification of the L-GP for adaptive feedforward-feedback balancing. In particular, an accurate probabilistic guarantee for exponential stability is derived by leveraging matrix analysis results and contraction theory, where the benefit of the proposed controller is proven and shown in the closed-form expressions for convergence rate and radius. Extensive numerical simulations not only demonstrate the controller's efficacy based on a two-link and a soft robotic manipulator but also all theoretical results are explicitly analyzed and validated.
Submitted 5 June, 2024;
originally announced June 2024.
-
Computation-Aware Learning for Stable Control with Gaussian Process
Authors:
Wenhan Cao,
Alexandre Capone,
Rishabh Yadav,
Sandra Hirche,
Wei Pan
Abstract:
In Gaussian Process (GP) dynamical model learning for robot control, particularly for systems constrained by computational resources like small quadrotors equipped with low-end processors, analyzing stability and designing a stable controller present significant challenges. This paper distinguishes between two types of uncertainty within the posteriors of GP dynamical models: the well-documented mathematical uncertainty stemming from limited data and computational uncertainty arising from constrained computational capabilities, which has been largely overlooked in prior research. Our work demonstrates that computational uncertainty, quantified through a probabilistic approximation of the inverse covariance matrix in GP dynamical models, is essential for stable control under computational constraints. We show that incorporating computational uncertainty can prevent overestimating the region of attraction, a safe subset of the state space with asymptotic stability, thus improving system safety. Building on these insights, we propose an innovative controller design methodology that integrates computational uncertainty within a second-order cone programming framework. Simulations of canonical stable control tasks and experiments of quadrotor tracking exhibit the effectiveness of our method under computational constraints.
Submitted 4 June, 2024;
originally announced June 2024.
-
Optimal Transmission Power Scheduling for Networked Control System under DoS Attack
Authors:
Siyi Wang,
Yulong Gao,
Sandra Hirche
Abstract:
Designing networked control systems that are reliable and resilient against adversarial threats is essential for ensuring the security of cyber-physical systems. This paper addresses the communication-control co-design problem for networked control systems under denial-of-service (DoS) attacks. In the wireless channel, a transmission power scheduler periodically determines the power level for sensory data transmission. Yet DoS attacks render data packets unavailable by disrupting the communication channel. This paper co-designs the control and power scheduling laws in the presence of DoS attacks and aims to minimize the sum of regulation control performance and transmission power consumption. Both finite- and infinite-horizon discounted cost criteria are addressed. By delving into the information structure between the controller and the power scheduler under attack, the original co-design problem is divided into two subproblems that can be solved individually without compromising optimality. The optimal control is shown to be certainty equivalent, and the optimal transmission power scheduling is solved using a dynamic programming approach. Moreover, in the infinite-horizon scenario, we analyze the performance of the designed scheduling policy and develop an upper bound of the total costs. Finally, a numerical example is provided to demonstrate the theoretical results.
Submitted 1 June, 2024;
originally announced June 2024.
-
Physically Consistent Modeling & Identification of Nonlinear Friction with Dissipative Gaussian Processes
Authors:
Rui Dai,
Giulio Evangelisti,
Sandra Hirche
Abstract:
Friction modeling has always been a challenging problem due to the complexity of real physical systems. Although a few state-of-the-art structured data-driven methods show their efficiency in nonlinear system modeling, deterministic passivity as one of the significant characteristics of friction is rarely considered in these methods. To address this issue, we propose a Gaussian Process based model that preserves the inherent structural properties such as passivity. A matrix-vector physical structure is considered in our approaches to ensure physical consistency, in particular, enabling a guarantee of positive semi-definiteness of the damping matrix. An aircraft benchmark simulation is employed to demonstrate the efficacy of our methodology. Estimation accuracy and data efficiency are increased substantially by considering and enforcing more structured physical knowledge. Also, the fulfillment of the dissipative nature of the aerodynamics is validated numerically.
Submitted 27 May, 2024;
originally announced May 2024.
-
Stable Inverse Reinforcement Learning: Policies from Control Lyapunov Landscapes
Authors:
Samuel Tesfazgi,
Leonhard Sprandl,
Armin Lederer,
Sandra Hirche
Abstract:
Learning from expert demonstrations to flexibly program an autonomous system with complex behaviors or to predict an agent's behavior is a powerful tool, especially in collaborative control settings. A common method to solve this problem is inverse reinforcement learning (IRL), where the observed agent, e.g., a human demonstrator, is assumed to behave according to the optimization of an intrinsic cost function that reflects its intent and informs its control actions. While the framework is expressive, it is also computationally demanding and generally lacks convergence guarantees. We therefore propose a novel, stability-certified IRL approach by reformulating the cost function inference problem to learning control Lyapunov functions (CLF) from demonstration data. By additionally exploiting closed-form expressions for associated control policies, we are able to efficiently search the space of CLFs by observing the attractor landscape of the induced dynamics. For the construction of the inverse optimal CLFs, we use a Sum of Squares and formulate a convex optimization problem. We present a theoretical analysis of the optimality properties provided by the CLF and evaluate our approach using both simulated and real-world data.
Submitted 14 May, 2024;
originally announced May 2024.
-
Data-driven Force Observer for Human-Robot Interaction with Series Elastic Actuators using Gaussian Processes
Authors:
Samuel Tesfazgi,
Markus Keßler,
Emilio Trigili,
Armin Lederer,
Sandra Hirche
Abstract:
Ensuring safety and adapting to the user's behavior are of paramount importance in physical human-robot interaction. Thus, incorporating elastic actuators in the robot's mechanical design has become popular, since it offers intrinsic compliance and additionally provides a coarse estimate for the interaction force by measuring the deformation of the elastic components. While observer-based methods have been shown to improve these estimates, they rely on accurate models of the system, which are challenging to obtain in complex operating environments. In this work, we overcome this issue by learning the unknown dynamics components using Gaussian process (GP) regression. By employing the learned model in a Bayesian filtering framework, we improve the estimation accuracy and additionally obtain an observer that explicitly considers local model uncertainty in the confidence measure of the state estimate. Furthermore, we derive guaranteed estimation error bounds, thus facilitating the use in safety-critical applications. We demonstrate the effectiveness of the proposed approach experimentally in a human-exoskeleton interaction scenario.
Submitted 14 May, 2024;
originally announced May 2024.
-
Nonparametric Control-Koopman Operator Learning: Flexible and Scalable Models for Prediction and Control
Authors:
Petar Bevanda,
Bas Driessen,
Lucian Cristian Iacob,
Roland Toth,
Stefan Sosnowski,
Sandra Hirche
Abstract:
Linearity of Koopman operators and simplicity of their estimators coupled with model-reduction capabilities has led to their great popularity in applications for learning dynamical systems. While nonparametric Koopman operator learning in infinite-dimensional reproducing kernel Hilbert spaces is well understood for autonomous systems, its control system analogues are largely unexplored. Addressing systems with control inputs in a principled manner is crucial for fully data-driven learning of controllers, especially since existing approaches commonly resort to representational heuristics or parametric models of limited expressiveness and scalability. We address the aforementioned challenge by proposing a universal framework via control-affine reproducing kernels that enables direct estimation of a single operator even for control systems. The proposed approach, called control-Koopman operator regression (cKOR), is thus completely analogous to Koopman operator regression of the autonomous case. First in the literature, we present a nonparametric framework for learning Koopman operator representations of nonlinear control-affine systems that does not suffer from the curse of control input dimensionality. This allows for reformulating the infinite-dimensional learning problem in a finite-dimensional space based solely on data without a priori loss of precision due to a restriction to a finite span of functions or inputs as in other approaches. For enabling applications to large-scale control systems, we also enhance the scalability of control-Koopman operator estimators by leveraging random projections (sketching). The efficacy of our novel cKOR approach is demonstrated on both forecasting and control tasks.
Submitted 12 May, 2024;
originally announced May 2024.
-
Risk-averse Learning with Non-Stationary Distributions
Authors:
Siyi Wang,
Zifan Wang,
Xinlei Yi,
Michael M. Zavlanos,
Karl H. Johansson,
Sandra Hirche
Abstract:
Considering non-stationary environments in online optimization enables a decision-maker to effectively adapt to changes and improve its performance over time. In such cases, it is favorable to adopt a strategy that minimizes the negative impact of change to avoid potentially risky situations. In this paper, we investigate risk-averse online optimization where the distribution of the random cost changes over time. We minimize a risk-averse objective function using the Conditional Value at Risk (CVaR) as the risk measure. Due to the difficulty in obtaining the exact CVaR gradient, we employ a zeroth-order optimization approach that queries the cost function values multiple times at each iteration and estimates the CVaR gradient using the sampled values. To facilitate the regret analysis, we use a variation metric based on Wasserstein distance to capture time-varying distributions. Given that the distribution variation is sub-linear in the total number of episodes, we show that our designed learning algorithm achieves sub-linear dynamic regret with high probability for both convex and strongly convex functions. Moreover, theoretical results suggest that increasing the number of samples leads to a reduction in the dynamic regret bounds until the sampling number reaches a specific limit. Finally, we provide numerical experiments of dynamic pricing in a parking lot to illustrate the efficacy of the designed algorithm.
Submitted 3 April, 2024;
originally announced April 2024.
-
Time-Robust Path Planning with Piece-Wise Linear Trajectory for Signal Temporal Logic Specifications
Authors:
Nhan-Khanh Le,
Erfaun Noorani,
Sandra Hirche,
John Baras
Abstract:
Real-world scenarios are characterized by timing uncertainties, e.g., delays, and disturbances. Algorithms with temporal robustness are crucial in guaranteeing the successful execution of tasks and missions in such scenarios. We study time-robust path planning for synthesizing robots' trajectories that adhere to spatial-temporal specifications expressed in Signal Temporal Logic (STL). In contrast to prior approaches that rely on discretized trajectories with fixed time steps, we leverage Piece-Wise Linear (PWL) signals for the synthesis. PWL signals represent a trajectory through a sequence of time-stamped waypoints. This allows us to encode the STL formula into a Mixed-Integer Linear Program (MILP) with fewer variables. This reduction is more pronounced for specifications with a long planning horizon. To that end, we define time-robustness for PWL signals. Subsequently, we propose quantitative semantics for PWL signals according to the recursive syntax of STL and prove their soundness. We then propose an encoding strategy to transform our semantics into a MILP. Our simulations showcase the soundness and the performance of our algorithm.
Submitted 15 March, 2024;
originally announced March 2024.
-
Learning-based Prescribed-Time Safety for Control of Unknown Systems with Control Barrier Functions
Authors:
Tzu-Yuan Huang,
Sihua Zhang,
Xiaobing Dai,
Alexandre Capone,
Velimir Todorovski,
Stefan Sosnowski,
Sandra Hirche
Abstract:
In many control system applications, state constraint satisfaction needs to be guaranteed within a prescribed time. While this issue has been partially addressed for systems with known dynamics, it remains largely unaddressed for systems with unknown dynamics. In this paper, we propose a Gaussian process-based time-varying control method that leverages backstepping and control barrier functions to achieve safety requirements within prescribed time windows for control affine systems. It can be used to keep a system within a safe region or to make it return to a safe region within a limited time window. These properties are cemented by rigorous theoretical results. The effectiveness of the proposed controller is demonstrated in a simulation of a robotic manipulator.
Submitted 13 June, 2024; v1 submitted 12 March, 2024;
originally announced March 2024.
-
Analyzing the Impact of Computation in Adaptive Dynamic Programming for Stochastic LQR Problem
Authors:
Wenhan Cao,
Alexandre Capone,
Sandra Hirche,
Wei Pan
Abstract:
Adaptive dynamic programming (ADP) for stochastic linear quadratic regulation (LQR) demands the precise computation of stochastic integrals during policy iteration (PI). In a fully model-free problem setting, this computation can only be approximated by state samples collected at discrete time points using computational methods such as the canonical Euler-Maruyama method. Our research reveals a critical phenomenon: the sampling period can significantly impact control performance. This impact is due to the fact that computational errors introduced in each step of PI can significantly affect the algorithm's convergence behavior, which in turn influences the resulting control policy. We draw a parallel between PI and Newton's method applied to the Riccati equation to elucidate how the computation impacts control. In this light, the computational error in each PI step manifests itself as an extra error term in each step of Newton's method, with its upper bound proportional to the computational error. Furthermore, we demonstrate that the convergence rate for ADP in stochastic LQR problems using the Euler-Maruyama method is O(h), with h being the sampling period. A sensorimotor control task finally validates these theoretical findings.
Submitted 14 February, 2024;
originally announced February 2024.
-
Infinite-horizon optimal scheduling for feedback control
Authors:
Siyi Wang,
Sandra Hirche
Abstract:
Emerging cyber-physical systems impel the development of communication protocols to efficiently utilize resources. This paper investigates the optimal co-design of control and scheduling in networked control systems. The objective is to co-design the control law and the scheduling mechanism that jointly optimize the tradeoff between regulation performance and communication resource consumption in the long run. The concept of the value of information (VoI) is employed to evaluate the importance of data being transmitted. The optimal solution includes a certainty equivalent control law and a stationary scheduling policy based on the VoI function. The closed-loop system under the designed scheduling policy is shown to be stochastically stable. By analyzing the property of the VoI function, we show that the optimal scheduling policy is symmetric and is a monotone function when the system matrix is diagonal. Moreover, by the diagonal system matrix assumption, the optimal scheduling policy is shown to be of threshold type. Then we provide a simplified yet equivalent form of the threshold-based optimal scheduling policy. The threshold value searching region is also given. Finally, the numerical simulation illustrates the theoretical result of the VoI-based scheduling.
Submitted 3 April, 2024; v1 submitted 13 February, 2024;
originally announced February 2024.
-
Decentralized Event-Triggered Online Learning for Safe Consensus of Multi-Agent Systems with Gaussian Process Regression
Authors:
Xiaobing Dai,
Zewen Yang,
Mengtian Xu,
Fangzhou Liu,
Georges Hattab,
Sandra Hirche
Abstract:
Consensus control in multi-agent systems has received significant attention and practical implementation across various domains. However, managing consensus control under unknown dynamics remains a significant challenge for control design due to system uncertainties and environmental disturbances. This paper presents a novel learning-based distributed control law, augmented by auxiliary dynamics. Gaussian processes are harnessed to compensate for the unknown components of the multi-agent system. For continuous enhancement in predictive performance of the Gaussian process model, a data-efficient online learning strategy with a decentralized event-triggered mechanism is proposed. Furthermore, the control performance of the proposed approach is ensured via the Lyapunov theory, based on a probabilistic guarantee for prediction error bounds. To demonstrate the efficacy of the proposed learning-based controller, a comparative analysis is conducted, contrasting it with both conventional distributed control laws and offline learning methodologies.
Submitted 5 February, 2024;
originally announced February 2024.
-
Cooperative Learning with Gaussian Processes for Euler-Lagrange Systems Tracking Control under Switching Topologies
Authors:
Zewen Yang,
Songbo Dong,
Armin Lederer,
Xiaobing Dai,
Siyu Chen,
Stefan Sosnowski,
Georges Hattab,
Sandra Hirche
Abstract:
This work presents an innovative learning-based approach to tackle the tracking control problem of Euler-Lagrange multi-agent systems with partially unknown dynamics operating under switching communication topologies. The approach leverages a correlation-aware cooperative algorithm framework built upon Gaussian process regression, which adeptly captures inter-agent correlations for uncertainty predictions. A standout feature is its exceptional efficiency in deriving the aggregation weights achieved by circumventing the computationally intensive posterior variance calculations. Through Lyapunov stability analysis, the distributed control law ensures bounded tracking errors with high probability. Simulation experiments validate the protocol's efficacy in effectively managing complex scenarios, establishing it as a promising solution for robust tracking control in multi-agent systems characterized by uncertain dynamics and dynamic communication structures.
Submitted 5 February, 2024;
originally announced February 2024.
-
Innovation-triggered Learning for Data-driven Predictive Control: Deterministic and Stochastic Formulations
Authors:
Kaikai Zheng,
Dawei Shi,
Sandra Hirche,
Yang Shi
Abstract:
Data-driven control has attracted lots of attention in recent years, especially for plants that are difficult to model based on first principles. In particular, a key issue in data-driven approaches is how to make efficient use of data as the abundance of data becomes overwhelming. To address this issue, this work proposes an innovation-triggered learning framework and a corresponding data-driven controller design approach with guaranteed stability. Specifically, we consider a linear time-invariant system with unknown dynamics subject to deterministic/stochastic disturbances, respectively. Two kinds of data selection mechanisms are proposed by online evaluating the innovation contained in the sampled data, wherein the innovation is quantified by its effect of shrinking the set of potential system dynamics that are compatible with the sampled data. Next, after introducing a stability criterion using the set-valued estimation of system dynamics, a robust data-driven predictive controller is designed by minimizing a worst-case cost function. The closed-loop stability of the data-driven predictive controller equipped with the innovation-triggered learning protocol is proved with a high probability framework. Finally, numerical experiments are performed to verify the validity of the proposed approaches, and the characteristics and the selection principle of the learning hyper-parameter are also discussed.
Submitted 28 January, 2024;
originally announced January 2024.
-
H2 suboptimal containment control of homogeneous and heterogeneous multi-agent systems
Authors:
Yuan Gao,
Junjie Jiao,
Zhongkui Li,
Sandra Hirche
Abstract:
This paper deals with the H2 suboptimal state containment control problem for homogeneous linear multi-agent systems and the H2 suboptimal output containment control problem for heterogeneous linear multi-agent systems. For both problems, given multiple autonomous leaders and a number of followers, we introduce suitable performance outputs and an associated H2 cost functional, respectively. The aim is to design a distributed protocol by dynamic output feedback that achieves state/output containment control while the associated H2 cost is smaller than an a priori given upper bound. To this end, we first show that the H2 suboptimal state/output containment control problem can be equivalently transformed into H2 suboptimal control problems for a set of independent systems. Based on this, design methods are then provided to compute such distributed dynamic output feedback protocols. Simulation examples are provided to illustrate the performance of our proposed protocols.
Submitted 19 November, 2023;
originally announced November 2023.
-
Safe Online Dynamics Learning with Initially Unknown Models and Infeasible Safety Certificates
Authors:
Alexandre Capone,
Ryan Cosner,
Aaron Ames,
Sandra Hirche
Abstract:
Safety-critical control tasks with high levels of uncertainty are becoming increasingly common. Typically, techniques that guarantee safety during learning and control utilize constraint-based safety certificates, which can be leveraged to compute safe control inputs. However, excessive model uncertainty can render robust safety certification methods infeasible, meaning no control input satisfies the constraints imposed by the safety certificate. This paper considers a learning-based setting with a robust safety certificate based on a control barrier function (CBF) second-order cone program. If the control barrier function certificate is feasible, our approach leverages it to guarantee safety. Otherwise, our method explores the system dynamics to collect data and recover the feasibility of the control barrier function constraint. To this end, we employ a method inspired by well-established tools from Bayesian optimization. We show that if the sampling frequency is high enough, we recover the feasibility of the robust CBF certificate, guaranteeing safety. Our approach requires no prior model and corresponds, to the best of our knowledge, to the first algorithm that guarantees safety in settings with occasionally infeasible safety certificates without requiring a backup non-learning-based controller.
Submitted 3 November, 2023;
originally announced November 2023.
-
Online Constraint Tightening in Stochastic Model Predictive Control: A Regression Approach
Authors:
Alexandre Capone,
Tim Brüdigam,
Sandra Hirche
Abstract:
Solving chance-constrained stochastic optimal control problems is a significant challenge in control. This is because no analytical solutions exist except for a handful of special cases. A common and computationally efficient approach for tackling chance-constrained stochastic optimal control problems consists of reformulating the chance constraints as hard constraints with a constraint-tightening parameter. However, in such approaches, the choice of constraint-tightening parameter remains challenging, and guarantees can mostly be obtained assuming that the process noise distribution is known a priori. Moreover, the chance constraints are often not tightly satisfied, leading to unnecessarily high costs. This work proposes a data-driven approach for learning the constraint-tightening parameters online during control. To this end, we reformulate the choice of constraint-tightening parameter for the closed loop as a binary regression problem. We then leverage a highly expressive Gaussian process (GP) model for binary regression to approximate the smallest constraint-tightening parameters that satisfy the chance constraints. By tuning the algorithm parameters appropriately, we show that the resulting constraint-tightening parameters satisfy the chance constraints up to an arbitrarily small margin with high probability. Our approach yields constraint-tightening parameters that tightly satisfy the chance constraints in numerical experiments, resulting in a lower average cost than three other state-of-the-art approaches.
Submitted 4 October, 2023;
originally announced October 2023.
-
Risk-Sensitive Inhibitory Control for Safe Reinforcement Learning
Authors:
Armin Lederer,
Erfaun Noorani,
John S. Baras,
Sandra Hirche
Abstract:
Humans have the ability to deviate from their natural behavior when necessary, which is a cognitive process called response inhibition. Similar approaches have independently received increasing attention in recent years for ensuring the safety of control. Realized using control barrier functions or predictive safety filters, these approaches can effectively ensure the satisfaction of state constraints through an online adaptation of nominal control laws, e.g., obtained through reinforcement learning. While the focus of these realizations of inhibitory control has been on risk-neutral formulations, human studies have shown a tight link between response inhibition and risk attitude. Inspired by this insight, we propose a flexible, risk-sensitive method for inhibitory control. Our method is based on a risk-aware condition for value functions, which guarantees the satisfaction of state constraints. We propose a method for learning these value functions using common techniques from reinforcement learning and derive sufficient conditions for its success. By enforcing the derived safety conditions online using the learned value function, risk-sensitive inhibitory control is effectively achieved. The effectiveness of the developed control scheme is demonstrated in simulations.
Submitted 2 October, 2023;
originally announced October 2023.
-
Episodic Gaussian Process-Based Learning Control with Vanishing Tracking Errors
Authors:
Armin Lederer,
Jonas Umlauft,
Sandra Hirche
Abstract:
Due to the increasing complexity of technical systems, accurate first-principles models often cannot be obtained. Supervised machine learning can mitigate this issue by inferring models from measurement data. Gaussian process regression is particularly well suited for this purpose due to its high data-efficiency and its explicit uncertainty representation, which allows the derivation of prediction error bounds. These error bounds have been exploited to show tracking accuracy guarantees for a variety of control approaches, but their direct dependency on the training data is generally unclear. We address this issue by deriving a Bayesian prediction error bound for GP regression, which we show to decay with the growth of a novel, kernel-based measure of data density. Based on the prediction error bound, we prove time-varying tracking accuracy guarantees for learned GP models used as feedback compensation of unknown nonlinearities, and show that the tracking error vanishes with increasing data density. This enables us to develop an episodic approach for learning Gaussian process models, such that an arbitrary tracking accuracy can be guaranteed. The effectiveness of the derived theory is demonstrated in several simulations.
Submitted 10 July, 2023;
originally announced July 2023.
-
Koopman Kernel Regression
Authors:
Petar Bevanda,
Max Beier,
Armin Lederer,
Stefan Sosnowski,
Eyke Hüllermeier,
Sandra Hirche
Abstract:
Many machine learning approaches for decision making, such as reinforcement learning, rely on simulators or predictive models to forecast the time-evolution of quantities of interest, e.g., the state of an agent or the reward of a policy. Forecasts of such complex phenomena are commonly described by highly nonlinear dynamical systems, making their use in optimization-based decision-making challenging. Koopman operator theory offers a beneficial paradigm for addressing this problem by characterizing forecasts via linear time-invariant (LTI) ODEs, turning multi-step forecasts into sparse matrix multiplication. Though there exists a variety of learning approaches, they usually lack crucial learning-theoretic guarantees, making the behavior of the obtained models with increasing data and dimensionality unclear. We address this issue by deriving a universal Koopman-invariant reproducing kernel Hilbert space (RKHS) that solely spans transformations into LTI dynamical systems. The resulting Koopman Kernel Regression (KKR) framework enables the use of statistical learning tools from function approximation for novel convergence results and generalization error bounds under weaker assumptions than existing work. Our experiments demonstrate superior forecasting performance compared to Koopman operator and sequential data predictors in RKHS.
Submitted 16 January, 2024; v1 submitted 25 May, 2023;
originally announced May 2023.
-
Can Learning Deteriorate Control? Analyzing Computational Delays in Gaussian Process-Based Event-Triggered Online Learning
Authors:
Xiaobing Dai,
Armin Lederer,
Zewen Yang,
Sandra Hirche
Abstract:
When the dynamics of systems are unknown, supervised machine learning techniques are commonly employed to infer models from data. Gaussian process (GP) regression is a particularly popular learning method for this purpose due to the existence of prediction error bounds. Moreover, GP models can be efficiently updated online, such that event-triggered online learning strategies can be pursued to ensure specified tracking accuracies. However, existing trigger conditions must be able to be evaluated at arbitrary times, which cannot be achieved in practice due to non-negligible computation times. Therefore, we first derive a delay-aware tracking error bound, which reveals an accuracy-delay trade-off. Based on this result, we propose a novel event trigger for GP-based online learning with computational delays, which we show to offer advantages over offline trained GP models for sufficiently small computation times. Finally, we demonstrate the effectiveness of the proposed event trigger for online learning in simulations.
Submitted 14 May, 2023;
originally announced May 2023.
-
Distributed Coverage Control of Constrained Constant-Speed Unicycle Multi-Agent Systems
Authors:
Qingchen Liu,
Zengjie Zhang,
Nhan Khanh Le,
Jiahu Qin,
Fangzhou Liu,
Sandra Hirche
Abstract:
This paper proposes a novel distributed coverage controller for a multi-agent system with constant-speed unicycle robots (CSUR). The work is motivated by the limitation of the conventional method that does not ensure the satisfaction of hard state- and input-dependent constraints and leads to feasibility issues for multi-CSUR systems. In this paper, we solve these problems by designing a novel coverage cost function and a saturated gradient-search-based control law. Invariant set theory and Lyapunov-based techniques are used to prove the state-dependent confinement and the convergence of the system state to the optimal coverage configuration, respectively. The controller is implemented in a distributed manner based on a novel communication standard among the agents. A series of simulation case studies are conducted to validate the effectiveness of the proposed coverage controller under different initial conditions and with different control parameters. A comparison study in simulation reveals the advantage of the proposed method in terms of avoiding infeasibility. The experimental study verifies the applicability of the method to real robots with uncertainties. The development procedure of the method from theoretical analysis to experimental validation provides a novel framework for multi-agent system coordination control with complex agent dynamics.
Submitted 14 March, 2024; v1 submitted 12 April, 2023;
originally announced April 2023.
-
Cooperative Online Learning for Multi-Agent System Control via Gaussian Processes with Event-Triggered Mechanism: Extended Version
Authors:
Xiaobing Dai,
Zewen Yang,
Sandra Hirche
Abstract:
In the realm of the cooperative control of multi-agent systems (MASs) with unknown dynamics, Gaussian process (GP) regression is widely used to infer the uncertainties due to its modeling flexibility of nonlinear functions and the existence of a theoretical prediction error bound. Online learning, which involves incorporating newly acquired training data into Gaussian process models, promises to improve control performance by enhancing predictions during the operation. Therefore, this paper investigates the online cooperative learning algorithm for MAS control. Moreover, an event-triggered data selection mechanism, inspired by the analysis of a centralized event-trigger, is introduced to reduce the model update frequency and enhance the data efficiency. With the proposed learning-based control, the practical convergence of the MAS is validated with guaranteed tracking performance via Lyapunov theory. Furthermore, the exclusion of the Zeno behavior for individual agents is shown. Finally, the effectiveness of the proposed event-triggered online learning method is demonstrated in simulations.
Submitted 2 January, 2024; v1 submitted 11 April, 2023;
originally announced April 2023.
-
Fast IMU-based Dual Estimation of Human Motion and Kinematic Parameters via Progressive In-Network Computing
Authors:
Xiaobing Dai,
Huanzhuo Wu,
Siyi Wang,
Junjie Jiao,
Giang T. Nguyen,
Frank H. P. Fitzek,
Sandra Hirche
Abstract:
Many applications involve humans in the loop, where continuous and accurate human motion monitoring provides valuable information for safe and intuitive human-machine interaction. Portable devices such as inertial measurement units (IMUs) are applicable to monitor human motions, although in practice only limited computational power is often available locally. The human motion in task space coordinates requires not only the human joint motion but also the nonlinear coordinate transformation, which depends on parameters such as human limb length. In most applications, measuring these kinematic parameters for each individual requires undesirably high effort. Therefore, it is desirable to estimate both the human motion and the kinematic parameters from IMUs. In this work, we propose a novel computational framework for dual estimation in real-time exploiting in-network computational resources. We adopt the concept of field Kalman filtering, where the dual estimation problem is decomposed into a fast state estimation process and a computationally expensive parameter estimation process. In order to further accelerate the convergence, the parameter estimation is progressively computed on multiple networked computational nodes. The superiority of our proposed method is demonstrated by a simulation of a human arm, where the estimation accuracy is shown to converge faster than with conventional approaches.
Submitted 11 April, 2023;
originally announced April 2023.
-
Learning-Based Optimal Control with Performance Guarantees for Unknown Systems with Latent States
Authors:
Robert Lefringhausen,
Supitsana Srithasan,
Armin Lederer,
Sandra Hirche
Abstract:
As control engineering methods are applied to increasingly complex systems, data-driven approaches for system identification appear as a promising alternative to physics-based modeling. While the Bayesian approaches prevalent for safety-critical applications usually rely on the availability of state measurements, the states of a complex system are often not directly measurable. It may then be necessary to jointly estimate the dynamics and the latent state, making the quantification of uncertainties and the design of controllers with formal performance guarantees considerably more challenging. This paper proposes a novel method for the computation of an optimal input trajectory for unknown nonlinear systems with latent states based on a combination of particle Markov chain Monte Carlo methods and scenario theory. Probabilistic performance guarantees are derived for the resulting input trajectory, and an approach to validate the performance of arbitrary control laws is presented. The effectiveness of the proposed method is demonstrated in a numerical simulation.
Submitted 16 April, 2024; v1 submitted 31 March, 2023;
originally announced March 2023.
-
Average Communication Rate for Event-Triggered Stochastic Control Systems
Authors:
Zengjie Zhang,
Qingchen Liu,
Mohammad H. Mamduhi,
Sandra Hirche
Abstract:
Quantifying the average communication rate (ACR) of a networked event-triggered stochastic control system (NET-SCS) with deterministic thresholds is challenging due to the non-stationary nature of the system's stochastic processes. For a NET-SCS, the nonlinear statistics propagation of the network communication status caused by deterministic thresholds makes the precise computation of ACR difficult. Previous work over-simplified the computation by using a Gaussian distribution without incorporating this nonlinearity, sacrificing precision. This paper proposes both analytical and numerical approaches to predict the exact ACR for a NET-SCS using a recursive model. We use theoretical analysis and a numerical study to qualitatively evaluate the deviation gap of the conventional approach that ignores the side information. The accuracy of our proposed method, alongside its comparison with the simplified results of the conventional approach, is validated by experimental studies. Our work promises to benefit the efficient resource planning of networked control systems with limited communication resources by providing accurate ACR computation.
Submitted 12 April, 2024; v1 submitted 13 January, 2023;
originally announced January 2023.
-
Safe Learning-Based Control of Elastic Joint Robots via Control Barrier Functions
Authors:
Armin Lederer,
Azra Begzadić,
Neha Das,
Sandra Hirche
Abstract:
Ensuring safety is of paramount importance in physical human-robot interaction applications. This requires both adherence to safety constraints defined on the system state, as well as guaranteeing compliant behavior of the robot. If the underlying dynamical system is known exactly, the former can be addressed with the help of control barrier functions. The incorporation of elastic actuators in the robot's mechanical design can address the latter requirement. However, this elasticity can increase the complexity of the resulting system, leading to unmodeled dynamics, such that control barrier functions cannot directly ensure safety. In this paper, we mitigate this issue by learning the unknown dynamics using Gaussian process regression. By employing the model in a feedback linearizing control law, the safety conditions resulting from control barrier functions can be robustified to take into account model errors, while remaining feasible. In order to enforce them on-line, we formulate the derived safety conditions in the form of a second-order cone program. We demonstrate our proposed approach with simulations on a two-degree-of-freedom planar robot with elastic joints.
Submitted 14 April, 2023; v1 submitted 1 December, 2022;
originally announced December 2022.
-
Vision-Based Uncertainty-Aware Motion Planning based on Probabilistic Semantic Segmentation
Authors:
Ralf Römer,
Armin Lederer,
Samuel Tesfazgi,
Sandra Hirche
Abstract:
For safe operation, a robot must be able to avoid collisions in uncertain environments. Existing approaches for motion planning under uncertainties often assume parametric obstacle representations and Gaussian uncertainty, which can be inaccurate. While visual perception can deliver a more accurate representation of the environment, its use for safe motion planning is limited by the inherent miscalibration of neural networks and the challenge of obtaining adequate datasets. To address these limitations, we propose to employ ensembles of deep semantic segmentation networks trained with massively augmented datasets to ensure reliable probabilistic occupancy information. To avoid conservatism during motion planning, we directly employ the probabilistic perception in a scenario-based path planning approach. A velocity scheduling scheme is applied to the path to ensure a safe motion despite tracking inaccuracies. We demonstrate the effectiveness of the massive data augmentation in combination with deep ensembles and the proposed scenario-based planning approach in comparisons to state-of-the-art methods and validate our framework in an experiment with a human hand as an obstacle.
Submitted 1 December, 2023; v1 submitted 14 September, 2022;
originally announced September 2022.
-
Safe Reinforcement Learning via Confidence-Based Filters
Authors:
Sebastian Curi,
Armin Lederer,
Sandra Hirche,
Andreas Krause
Abstract:
Ensuring safety is a crucial challenge when deploying reinforcement learning (RL) to real-world systems. We develop confidence-based safety filters, a control-theoretic approach for certifying state safety constraints for nominal policies learned via standard RL techniques, based on probabilistic dynamics models. Our approach is based on a reformulation of state constraints in terms of cost functions, reducing safety verification to a standard RL task. By exploiting the concept of hallucinating inputs, we extend this formulation to determine a "backup" policy that is safe for the unknown system with high probability. Finally, the nominal policy is minimally adjusted at every time step during a roll-out towards the backup policy, such that safe recovery can be guaranteed afterwards. We provide formal safety guarantees, and empirically demonstrate the effectiveness of our approach.
Submitted 4 July, 2022;
originally announced July 2022.
-
Physically Consistent Learning of Conservative Lagrangian Systems with Gaussian Processes
Authors:
Giulio Evangelisti,
Sandra Hirche
Abstract:
This paper proposes a physically consistent Gaussian Process (GP) enabling the identification of uncertain Lagrangian systems. The function space is tailored according to the energy components of the Lagrangian and the differential equation structure, analytically guaranteeing physical and mathematical properties such as energy conservation and quadratic form. The novel formulation of Cholesky decomposed matrix kernels allows the probabilistic preservation of positive definiteness. Only differential input-to-output measurements of the function map are required while Gaussian noise is permitted in torques, velocities, and accelerations. We demonstrate the effectiveness of the approach in numerical simulation.
Submitted 3 February, 2023; v1 submitted 24 June, 2022;
originally announced June 2022.
-
Actuator Scheduling for Linear Systems: A Convex Relaxation Approach
Authors:
Junjie Jiao,
Dipankar Maity,
John S. Baras,
Sandra Hirche
Abstract:
In this letter, we investigate the problem of actuator scheduling for networked control systems. Given a stochastic linear system with a number of actuators, we consider the case that one actuator is activated at each time. This problem is combinatorial in nature and NP-hard to solve. We propose a convex relaxation to the actuator scheduling problem, and use its solution as a reference to design an algorithm for solving the original scheduling problem. Using dynamic programming arguments, we provide a suboptimality bound of our proposed algorithm. Furthermore, we show that our framework can be extended to incorporate the scheduling of multiple actuators at each time as well as actuation costs. A simulation example is provided, which shows that our proposed method outperforms a random selection approach and a greedy selection approach.
Submitted 20 May, 2022; v1 submitted 4 March, 2022;
originally announced March 2022.
-
Networked Online Learning for Control of Safety-Critical Resource-Constrained Systems based on Gaussian Processes
Authors:
Armin Lederer,
Mingmin Zhang,
Samuel Tesfazgi,
Sandra Hirche
Abstract:
Safety-critical technical systems operating in unknown environments require the ability to quickly adapt their behavior, which can be achieved in control by inferring a model online from the data stream generated during operation. Gaussian process-based learning is particularly well suited for safety-critical applications as it ensures bounded prediction errors. While there exist computationally efficient approximations for online inference, these approaches lack guarantees for the prediction error and have high memory requirements, and are therefore not applicable to safety-critical systems with tight memory constraints. In this work, we propose a novel networked online learning approach based on Gaussian process regression, which addresses the issue of limited local resources by employing remote data management in the cloud. Our approach formally guarantees a bounded tracking error with high probability, which is exploited to identify the most relevant data to achieve a certain control performance. We further propose an effective data transmission scheme between the local system and the cloud taking bandwidth limitations and time delay of the transmission channel into account. The effectiveness of the proposed method is successfully demonstrated in a simulation.
Submitted 23 February, 2022;
originally announced February 2022.
-
Towards Data-driven LQR with Koopmanizing Flows
Authors:
Petar Bevanda,
Max Beier,
Shahab Heshmati-Alamdari,
Stefan Sosnowski,
Sandra Hirche
Abstract:
We propose a novel framework for learning linear time-invariant (LTI) models for a class of continuous-time non-autonomous nonlinear dynamics based on a representation of Koopman operators. In general, the operator is infinite-dimensional but, crucially, linear. To utilize it for efficient LTI control design, we learn a finite representation of the Koopman operator that is linear in controls while concurrently learning meaningful lifting coordinates. For the latter, we rely on Koopmanizing Flows - a diffeomorphism-based representation of Koopman operators - and extend it to systems with linear control entry. With such a learned model, we can replace the nonlinear optimal control problem with quadratic cost with that of a linear quadratic regulator (LQR), facilitating efficacious optimal control for nonlinear systems. The superior control performance of the proposed method is demonstrated on simulation examples.
Submitted 23 May, 2022; v1 submitted 27 January, 2022;
originally announced January 2022.
-
Diffeomorphically Learning Stable Koopman Operators
Authors:
Petar Bevanda,
Max Beier,
Sebastian Kerz,
Armin Lederer,
Stefan Sosnowski,
Sandra Hirche
Abstract:
System representations inspired by the infinite-dimensional Koopman operator (generator) are increasingly considered for predictive modeling. Due to the operator's linearity, a range of nonlinear systems admit linear predictor representations - allowing for simplified prediction, analysis and control. However, finding meaningful finite-dimensional representations for prediction is difficult as it involves determining features that are both Koopman-invariant (evolve linearly under the dynamics) as well as relevant (spanning the original state) - a generally unsupervised problem. In this work, we present Koopmanizing Flows - a novel continuous-time framework for supervised learning of linear predictors for a class of nonlinear dynamics. In our model construction a latent diffeomorphically related linear system unfolds into a linear predictor through the composition with a monomial basis. The lifting, its linear dynamics and state reconstruction are learned simultaneously, while an unconstrained parameterization of Hurwitz matrices ensures asymptotic stability regardless of the operator approximation accuracy. The superior efficacy of Koopmanizing Flows is demonstrated in comparison to a state-of-the-art method on the well-known LASA handwriting benchmark.
Submitted 30 May, 2022; v1 submitted 7 December, 2021;
originally announced December 2021.
-
Adaptive Low-Pass Filtering using Sliding Window Gaussian Processes
Authors:
Alejandro J. Ordóñez-Conejo,
Armin Lederer,
Sandra Hirche
Abstract:
When signals are measured through physical sensors, they are perturbed by noise. To reduce noise, low-pass filters are commonly employed in order to attenuate high-frequency components in the incoming signal, regardless of whether they come from noise or the actual signal. Therefore, low-pass filters must be carefully tuned in order to avoid significant deterioration of the signal. This tuning requires prior knowledge about the signal, which is often not available in applications such as reinforcement learning or learning-based control. In order to overcome this limitation, we propose an adaptive low-pass filter based on Gaussian process regression. By considering a constant window of previous observations, updates and predictions fast enough for real-world filtering applications can be realized. Moreover, the online optimization of hyperparameters leads to an adaptation of the low-pass behavior, such that no prior tuning is necessary. We show that the estimation error of the proposed method is uniformly bounded, and demonstrate the flexibility and efficiency of the approach in several simulations.
Submitted 5 November, 2021;
originally announced November 2021.
-
Learning the Koopman Eigendecomposition: A Diffeomorphic Approach
Authors:
Petar Bevanda,
Johannes Kirmayr,
Stefan Sosnowski,
Sandra Hirche
Abstract:
We present a novel data-driven approach for learning linear representations of a class of stable nonlinear systems using Koopman eigenfunctions. By learning the conjugacy map between a nonlinear system and its Jacobian linearization through a Normalizing Flow, one can guarantee the learned function is a diffeomorphism. Using this diffeomorphism, we construct eigenfunctions of the nonlinear system via the spectral equivalence of conjugate systems - allowing the construction of linear predictors for nonlinear systems. The universality of the diffeomorphism learner leads to the universal approximation of the nonlinear system's Koopman eigenfunctions. The developed method is also safe as it guarantees the model is asymptotically stable regardless of the representation accuracy. To the best of our knowledge, this is the first work to close the gap between the operator, system and learning theories. The efficacy of our approach is shown through simulation examples.
Submitted 30 May, 2022; v1 submitted 14 October, 2021;
originally announced October 2021.
-
Gaussian Process Uniform Error Bounds with Unknown Hyperparameters for Safety-Critical Applications
Authors:
Alexandre Capone,
Armin Lederer,
Sandra Hirche
Abstract:
Gaussian processes have become a promising tool for various safety-critical settings, since the posterior variance can be used to directly estimate the model error and quantify risk. However, state-of-the-art techniques for safety-critical settings hinge on the assumption that the kernel hyperparameters are known, which does not apply in general. To mitigate this, we introduce robust Gaussian process uniform error bounds in settings with unknown hyperparameters. Our approach computes a confidence region in the space of hyperparameters, which enables us to obtain a probabilistic upper bound for the model error of a Gaussian process with arbitrary hyperparameters. We do not require a priori knowledge of any bounds on the hyperparameters, an assumption commonly found in related work. Instead, we are able to derive bounds from data in an intuitive fashion. We additionally employ the proposed technique to derive performance guarantees for a class of learning-based control problems. Experiments show that the bound performs significantly better than vanilla and fully Bayesian Gaussian processes.
Submitted 20 July, 2022; v1 submitted 6 September, 2021;
originally announced September 2021.
-
Distributed Value of Information in Feedback Control over Multi-hop Networks
Authors:
Precious Ugo Abara,
Sandra Hirche
Abstract:
Recent works in the domain of networked control systems have demonstrated that the joint design of medium access control strategies and control strategies for the closed-loop system is beneficial. However, several metrics introduced so far fail either to appropriately represent the network requirements or to capture how valuable the data is. In this paper, we propose a distributed value of information (dVoI) metric for the joint design of control and schedulers for medium access in a multi-loop system and multi-hop network. We start by providing conditions under which the certainty equivalent controller is optimal. Then we reformulate the joint control and communication problem as a Bellman-like equation. The corresponding dynamic programming problem is solved in a distributed fashion by the proposed VoI-based scheduling policies for the multi-loop multi-hop networked control system, which outperform the well-known time-triggered periodic sampling policies. Additionally, we show that the dVoI-based scheduling policies are independent of each other, both loop-wise and hop-wise. Finally, we illustrate the results with a numerical example.
Submitted 16 July, 2021;
originally announced July 2021.
-
Inverse Reinforcement Learning: A Control Lyapunov Approach
Authors:
Samuel Tesfazgi,
Armin Lederer,
Sandra Hirche
Abstract:
Inferring the intent of an intelligent agent from demonstrations and subsequently predicting its behavior is a critical task in many collaborative settings. A common approach to solve this problem is the framework of inverse reinforcement learning (IRL), where the observed agent, e.g., a human demonstrator, is assumed to behave according to an intrinsic cost function that reflects its intent and informs its control actions. In this work, we reformulate the IRL inference problem as learning control Lyapunov functions (CLF) from demonstrations by exploiting the inverse optimality property, which states that every CLF is also a meaningful value function. Moreover, the derived CLF formulation directly guarantees stability of inferred control policies. We show the flexibility of our proposed method by learning from goal-directed movement demonstrations in a continuous environment.
Submitted 4 October, 2021; v1 submitted 9 April, 2021;
originally announced April 2021.
-
Value of information in networked control systems subject to delay
Authors:
Siyi Wang,
Qingchen Liu,
Precious Ugo Abara,
John S. Baras,
Sandra Hirche
Abstract:
In this paper, we study the trade-off between the transmission cost and the control performance of the multi-loop networked control system subject to network-induced delay. Within the linear-quadratic-Gaussian (LQG) framework, the joint design of control policy and networking strategy is decomposed into separate optimization problems. Based on the trade-off analysis, a scalable, delay-dependent Value-of-Information (VoI) based scheduling policy is constructed, which quantifies the value of transmitting the data packet and enables the decision-makers embedded in subsystems to determine the transmission policy. The proposed scalable VoI inherits the task criticality of the previous VoI metric while being sensitive to system parameters such as information freshness and network delays. The VoI-based scheduling policy is proved to outperform the periodic triggering policy and the existing Age-of-Information (AoI) based policy for networked control systems under transmission delay. The effectiveness of the constructed VoI with arbitrary network delay is validated through numerical simulations.
Submitted 29 December, 2021; v1 submitted 7 April, 2021;
originally announced April 2021.
-
Safe Online Learning-based Formation Control of Multi-Agent Systems with Gaussian Processes
Authors:
Thomas Beckers,
Sandra Hirche,
Leonardo Colombo
Abstract:
Formation control algorithms for multi-agent systems have gained much attention in recent years due to the increasing number of mobile and aerial robotic swarms. The design of safe controllers for these vehicles is a substantial aspect for an increasing range of application domains. However, parts of the vehicle's dynamics and external disturbances are often unknown or very time-consuming to model. To overcome this issue, we present a safe formation control law for multi-agent systems based on double integrator dynamics by using Gaussian Processes for online learning of the unknown dynamics. The presented approach guarantees a bounded error to desired formations with high probability, where the bound is explicitly given. A numerical example highlights the effectiveness of the learning-based formation control law.
Submitted 31 March, 2021;
originally announced April 2021.
-
Distributed Learning Consensus Control for Unknown Nonlinear Multi-Agent Systems based on Gaussian Processes
Authors:
Zewen Yang,
Stefan Sosnowski,
Qingchen Liu,
Junjie Jiao,
Armin Lederer,
Sandra Hirche
Abstract:
In this paper, a distributed learning leader-follower consensus protocol based on Gaussian process regression for a class of nonlinear multi-agent systems with unknown dynamics is designed. We propose a distributed learning approach to predict the residual dynamics for each agent. The stability of the consensus protocol using the data-driven model of the dynamics is shown via Lyapunov analysis. By applying the proposed control law, the followers ultimately synchronize to the leader with guaranteed error bounds with high probability. The effectiveness and the applicability of the developed protocol are demonstrated by simulation examples.
Submitted 29 March, 2021;
originally announced March 2021.
-
Data-driven output synchronization of heterogeneous leader-follower multi-agent systems
Authors:
Junjie Jiao,
Henk J. van Waarde,
Harry L. Trentelman,
M. Kanat Camlibel,
Sandra Hirche
Abstract:
This paper deals with data-driven output synchronization for heterogeneous leader-follower linear multi-agent systems. Given a multi-agent system that consists of one autonomous leader and a number of heterogeneous followers with external disturbances, we provide necessary and sufficient data-based conditions for output synchronization. We also provide a design method for obtaining such output synchronizing protocols directly from data. The results are then extended to the special case that the followers are disturbance-free. Finally, a simulation example is provided to illustrate our results.
Submitted 23 September, 2021; v1 submitted 22 March, 2021;
originally announced March 2021.
-
Koopman Operator Dynamical Models: Learning, Analysis and Control
Authors:
Petar Bevanda,
Stefan Sosnowski,
Sandra Hirche
Abstract:
The Koopman operator allows for handling nonlinear systems through a (globally) linear representation. In general, the operator is infinite-dimensional - necessitating finite approximations - for which there is no overarching framework. Although there are principled ways of learning such finite approximations, they are in many instances overlooked in favor of often ill-posed and unstructured methods. Also, Koopman operator theory has long-standing connections to known system-theoretic and dynamical system notions that are not universally recognized. Given the former and latter realities, this work aims to bridge the gap between various concepts regarding both theory and tractable realizations. Firstly, we review data-driven representations (both unstructured and structured) for Koopman operator dynamical models, categorizing various existing methodologies and highlighting their differences. Furthermore, we provide concise insight into the paradigm's relation to system-theoretic notions and analyze the prospect of using the paradigm for modeling control systems. Additionally, we outline the current challenges and comment on future perspectives.
Submitted 22 December, 2021; v1 submitted 4 February, 2021;
originally announced February 2021.
-
Uniform Error and Posterior Variance Bounds for Gaussian Process Regression with Application to Safe Control
Authors:
Armin Lederer,
Jonas Umlauft,
Sandra Hirche
Abstract:
In application areas where data generation is expensive, Gaussian processes are a preferred supervised learning model due to their high data-efficiency. Particularly in model-based control, Gaussian processes allow the derivation of performance guarantees using probabilistic model error bounds. To make these approaches applicable in practice, two open challenges must be solved: (i) existing error bounds rely on prior knowledge, which might not be available for many real-world tasks; (ii) the relationship between training data and the posterior variance, which mainly drives the error bound, is not well understood and prevents the asymptotic analysis. This article addresses these issues by presenting a novel uniform error bound using Lipschitz continuity and an analysis of the posterior variance function for a large class of kernels. Additionally, we show how these results can be used to guarantee safe control of an unknown dynamical system and provide illustrative numerical examples.
Submitted 13 January, 2021;
originally announced January 2021.
-
The Impact of Data on the Stability of Learning-Based Control - Extended Version
Authors:
Armin Lederer,
Alexandre Capone,
Thomas Beckers,
Jonas Umlauft,
Sandra Hirche
Abstract:
Despite the existence of formal guarantees for learning-based control approaches, the relationship between data and control performance is still poorly understood. In this paper, we propose a Lyapunov-based measure for quantifying the impact of data on the certifiable control performance. By modeling unknown system dynamics through Gaussian processes, we can determine the interrelation between model uncertainty and satisfaction of stability conditions. This allows us to directly assess the impact of data on the provable stationary control performance, and thereby the value of the data for the closed-loop system performance. Our approach is applicable to a wide variety of unknown nonlinear systems that are to be controlled by a generic learning-based control law, and the results obtained in numerical simulations indicate the efficacy of the proposed measure.
Submitted 30 July, 2021; v1 submitted 20 November, 2020;
originally announced November 2020.
-
Deep Learning based Uncertainty Decomposition for Real-time Control
Authors:
Neha Das,
Jonas Umlauft,
Armin Lederer,
Thomas Beckers,
Sandra Hirche
Abstract:
Data-driven control in unknown environments requires a clear understanding of the involved uncertainties for ensuring safety and efficient exploration. While aleatoric uncertainty that arises from measurement noise can often be explicitly modeled given a parametric description, it can be harder to model epistemic uncertainty, which describes the presence or absence of training data. The latter can be particularly useful for implementing exploratory control strategies when system dynamics are unknown. We propose a novel method for detecting the absence of training data using deep learning, which gives a continuous-valued scalar output between $0$ (indicating low uncertainty) and $1$ (indicating high uncertainty). We utilize this detector as a proxy for epistemic uncertainty and show its advantages over existing approaches on synthetic and real-world datasets. Our approach can be directly combined with aleatoric uncertainty estimates and allows for uncertainty estimation in real time, as the inference is sample-free, unlike existing approaches for uncertainty modeling. We further demonstrate the practicality of this uncertainty estimate in deploying online data-efficient control on a simulated quadcopter acted upon by an unknown disturbance model.
Submitted 12 July, 2023; v1 submitted 6 October, 2020;
originally announced October 2020.
-
Online learning-based trajectory tracking for underactuated vehicles with uncertain dynamics
Authors:
Thomas Beckers,
Leonardo Colombo,
Sandra Hirche,
George J. Pappas
Abstract:
Underactuated vehicles have gained much attention in recent years due to the increasing number of aerial and underwater vehicles as well as nanosatellites. Trajectory tracking control of these vehicles is a substantial aspect for an increasing range of application domains. However, external disturbances and parts of the internal dynamics are often unknown or very time-consuming to model. To overcome this issue, we present a tracking control law for underactuated rigid-body dynamics using an online learning-based oracle for the prediction of the unknown dynamics. We show that Gaussian process models are of particular interest for the role of the oracle. The presented approach guarantees a bounded tracking error with high probability, where the bound is explicitly given. A numerical example highlights the effectiveness of the proposed control law.
Submitted 14 September, 2021; v1 submitted 14 September, 2020;
originally announced September 2020.
-
Anticipating the Long-Term Effect of Online Learning in Control
Authors:
Alexandre Capone,
Sandra Hirche
Abstract:
Control schemes that learn using measurement data collected online are increasingly promising for the control of complex and uncertain systems. However, in most approaches of this kind, learning is viewed as a side effect that passively improves control performance, e.g., by updating a model of the system dynamics. Determining how improvements in control performance due to learning can be actively exploited in the control synthesis is still an open research question. In this paper, we present AntLer, a design algorithm for learning-based control laws that anticipates learning, i.e., that takes the impact of future learning in uncertain dynamic settings explicitly into account. AntLer expresses system uncertainty using a non-parametric probabilistic model. Given a cost function that measures control performance, AntLer chooses the control parameters such that the expected cost of the closed-loop system is minimized approximately. We show that AntLer approximates an optimal solution arbitrarily accurately with probability one. Furthermore, we apply AntLer to a nonlinear system, which yields better results compared to the case where learning is not anticipated.
Submitted 24 July, 2020;
originally announced July 2020.