Search | arXiv e-print repository

Guarding a Target Area from a Heterogeneous Group of Cooperative Attackers

Authors: Yoonjae Lee, Goutam Das, Daigo Shishika, Efstathios Bakolas

Abstract: In this paper, we investigate a multi-agent target guarding problem in which a single defender seeks to capture multiple attackers aiming to reach a high-value target area. In contrast to previous studies, the attackers herein are assumed to be heterogeneous in the sense that they have not only different speeds but also different weights representing their respective degrees of importance (e.g., the amount of allocated resources). The objective of the attacker team is to jointly minimize the weighted sum of their final levels of proximity to the target area, whereas the defender aims to maximize the same value. Using geometric arguments, we construct candidate equilibrium control policies that require the solution of a (possibly nonconvex) optimization problem. Subsequently, we validate the optimality of the candidate control policies using parametric optimization techniques. Lastly, we provide numerical examples to illustrate how cooperative behaviors emerge within the attacker team due to their heterogeneity.

Submitted 30 June, 2024; originally announced July 2024.

Comments: This is the revised version of the paper, with the same title, to be presented at American Control Conference (ACC) 2024

arXiv:2405.15159 [pdf, other]

Leveraging Gated Recurrent Units for Iterative Online Precise Attitude Control for Geodetic Missions

Authors: Vrushabh Zinage, Shrenik Zinage, Srinivas Bettadpur, Efstathios Bakolas

Abstract: In this paper, we consider the problem of precise attitude control for geodetic missions, such as the GRACE Follow-on (GRACE-FO) mission. Traditional and well-established control methods, such as Proportional-Integral-Derivative (PID) controllers, have been the standard in attitude control for most space missions, including the GRACE-FO mission. Instead of significantly modifying (or replacing) the original PID controllers that are being used for these missions, we introduce an iterative modification to the PID controller that ensures improved attitude control precision (i.e., reduction in attitude error). The proposed modification leverages Gated Recurrent Units (GRUs) to learn and predict external disturbance trends derived from incoming attitude measurements from the GRACE satellites. Our analysis has revealed a distinct trend in the external disturbance time-series data, suggesting the potential utility of GRUs to predict future disturbances acting on the system. The learned GRU model compensates for these disturbances within the standard PID control loop in real time via an additive correction term which is updated at regular time intervals. Simulation results show a significant reduction in attitude error, verifying the efficacy of our proposed approach.

Submitted 23 May, 2024; originally announced May 2024.

Comments: 14 pages
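
The additive correction described in this abstract can be illustrated with a toy loop. The following is a minimal sketch, not the authors' implementation: it assumes a one-dimensional double-integrator attitude model, hypothetical PID gains, and a plain Python callable named `predictor` standing in for the learned GRU model. The function name `run_pid_with_correction` and all numeric values are illustrative assumptions.

```python
import math
from typing import Callable


def run_pid_with_correction(
    disturbance: Callable[[int], float],
    predictor: Callable[[int], float],
    steps: int = 2000,
    dt: float = 0.01,
    kp: float = 20.0,
    ki: float = 5.0,
    kd: float = 8.0,
) -> float:
    """Simulate theta'' = u + d(k) and return the RMS attitude error.

    `predictor(k)` is a stand-in for a learned disturbance estimate
    (hypothetical; the paper uses a GRU, which is not reproduced here).
    """
    theta, omega = 0.0, 0.0  # attitude error and rate, reference is zero
    integral = 0.0
    squared_sum = 0.0
    for k in range(steps):
        error = -theta
        integral += error * dt
        u_pid = kp * error + ki * integral - kd * omega
        # Additive correction: subtract the predicted disturbance.
        u = u_pid - predictor(k)
        # Semi-implicit Euler integration of the double integrator.
        omega += (u + disturbance(k)) * dt
        theta += omega * dt
        squared_sum += theta * theta
    return math.sqrt(squared_sum / steps)


def slow_disturbance(k: int) -> float:
    """Hypothetical slowly varying disturbance torque."""
    return 0.5 * math.sin(0.05 * k)


rms_plain = run_pid_with_correction(slow_disturbance, lambda k: 0.0)
rms_corrected = run_pid_with_correction(slow_disturbance, slow_disturbance)
```

With a perfect predictor the disturbance is cancelled exactly in this toy model, so `rms_corrected` is zero while `rms_plain` is not. A learned predictor would only approximate the disturbance, so the improvement in practice depends on prediction quality.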

arXiv:2403.18066 [pdf, ps, other]

Path Integral Control with Rollout Clustering and Dynamic Obstacles

Authors: Steven Patrick, Efstathios Bakolas

Abstract: Model Predictive Path Integral (MPPI) control has proven to be a powerful tool for the control of uncertain systems (such as systems subject to disturbances and systems with unmodeled dynamics). One important limitation of the baseline MPPI algorithm is that it does not utilize simulated trajectories to their fullest extent. For one, it assumes that the average of all trajectories weighted by their performance index will be a safe trajectory. In this paper, multiple examples are shown where this assumption does not hold, and a trajectory clustering technique is presented that reduces the chances of the weighted average crossing into an unsafe region. Secondly, MPPI does not account for dynamic obstacles, so the authors put forward a novel cost function that accounts for dynamic obstacles without adding significant computation time to the overall algorithm. The novel contributions proposed in this paper were evaluated with extensive simulations to demonstrate improvements upon the state-of-the-art MPPI techniques.

Submitted 26 March, 2024; originally announced March 2024.

Comments: 8 pages, 5 figures, extended version of ACC 2024 submission
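
The failure mode mentioned in this abstract, a weighted average of safe rollouts that is itself unsafe, can be reproduced with two symmetric trajectories. The following is a simplified sketch under stated assumptions: it averages waypoints directly and uses exponentiated-cost weights, whereas MPPI implementations typically average sampled control sequences. The obstacle, the rollouts, and the helper names (`mppi_weights`, `weighted_average`, `clearance`) are hypothetical, and the paper's clustering technique is not shown.

```python
import math


def mppi_weights(costs, temperature=1.0):
    """Exponentiated-cost weights, normalized to sum to one."""
    c_min = min(costs)  # subtract the minimum for numerical stability
    raw = [math.exp(-(c - c_min) / temperature) for c in costs]
    total = sum(raw)
    return [r / total for r in raw]


def weighted_average(trajectories, weights):
    """Average 2-D waypoints across rollouts, one time step at a time."""
    horizon = len(trajectories[0])
    return [
        tuple(
            sum(w * traj[t][i] for w, traj in zip(weights, trajectories))
            for i in range(2)
        )
        for t in range(horizon)
    ]


def clearance(trajectory, center):
    """Smallest waypoint distance to the obstacle center."""
    return min(math.dist(p, center) for p in trajectory)


# Hypothetical circular obstacle and two rollouts passing on either side.
obstacle_center, obstacle_radius = (1.0, 0.0), 0.5
rollout_above = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.0)]
rollout_below = [(0.0, 0.0), (1.0, -1.0), (2.0, 0.0)]

weights = mppi_weights([3.0, 3.0])  # equal costs give equal weights
average = weighted_average([rollout_above, rollout_below], weights)
```

Both rollouts keep a clearance of 1.0 from the obstacle center at their waypoints, yet the averaged trajectory passes through the center at its middle waypoint. Averaging within a cluster of similar rollouts, rather than across all of them, avoids this particular case.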

arXiv:2403.13905 [pdf, other]

Motion Prediction of Multi-agent systems with Multi-view clustering

Authors: Anegi James, Efstathios Bakolas

Abstract: This paper presents a method for future motion prediction of multi-agent systems by including group formation information and future intent. Formation of groups depends on a physics-based clustering method that follows the agglomerative hierarchical clustering algorithm. We identify clusters that incorporate the minimum cost-to-go function of a relevant optimal control problem as a metric for clustering between the groups among agents, where groups with similar associated costs are assumed to be likely to move together. The cost metric accounts for proximity to other agents as well as the intended goal of each agent. An unscented Kalman filter based approach is used to update the established clusters as well as add new clusters when new information is obtained. Our approach is verified through non-trivial numerical simulations implementing the proposed algorithm on different datasets pertaining to a variety of scenarios and agents.

Submitted 20 March, 2024; originally announced March 2024.

Comments: 20 pages, 9 figures

arXiv:2312.07345 [pdf, other]

Neural Differentiable Integral Control Barrier Functions for Unknown Nonlinear Systems with Input Constraints

Authors: Vrushabh Zinage, Rohan Chandra, Efstathios Bakolas

Abstract: In this paper, we propose a deep learning based control synthesis framework for fast and online computation of controllers that guarantee the safety of general nonlinear control systems with unknown dynamics in the presence of input constraints. Towards this goal, we propose a framework for simultaneously learning the unknown system dynamics, which can change with time due to external disturbances, and an integral control law for trajectory tracking based on imitation learning. Simultaneously, we learn corresponding safety certificates, which we refer to as Neural Integral Control Barrier Functions (Neural ICBFs), that automatically encode both the state and input constraints into a single scalar-valued function and enable the design of controllers that can guarantee that the state of the unknown system will never leave a safe subset of the state space. Finally, we provide numerical simulations that validate our proposed approach and compare it with classical as well as recent learning based methods from the relevant literature.

Submitted 12 December, 2023; originally announced December 2023.

Comments: 15 pages, 4 figures

arXiv:2311.08500 [pdf, other]

Density Steering of Gaussian Mixture Models for Discrete-Time Linear Systems

Authors: Isin M. Balci, Efstathios Bakolas

Abstract: In this paper, we study the finite-horizon optimal density steering problem for discrete-time stochastic linear dynamical systems. Specifically, we focus on steering probability densities represented as Gaussian mixture models which are known to give good approximations for general smooth probability density functions. We then revisit the covariance steering problem for Gaussian distributions and derive its optimal control policy. Subsequently, we propose a randomized policy to enhance the numerical tractability of the problem and demonstrate that under this policy the state distribution remains a Gaussian mixture. By leveraging these results, we reduce the Gaussian mixture steering problem to a linear program. We also discuss the problem of steering general distributions using Gaussian mixture approximations. Finally, we present results of non-trivial numerical experiments and demonstrate that our approach can be applied to general distribution steering problems.

Submitted 17 December, 2023; v1 submitted 14 November, 2023; originally announced November 2023.

arXiv:2309.16945 [pdf, other]

Disturbance Observer-based Robust Integral Control Barrier Functions for Nonlinear Systems with High Relative Degree

Authors: Vrushabh Zinage, Rohan Chandra, Efstathios Bakolas

Abstract: In this paper, we consider the problem of safe control synthesis of general controlled nonlinear systems in the presence of bounded additive disturbances. Towards this aim, we first construct a governing augmented state space model consisting of the equations of motion of the original system, the integral control law and the nonlinear disturbance observer. Next, we propose the concept of Disturbance Observer based Integral Control Barrier Functions (DO-ICBFs) which we utilize to synthesize safe control inputs. The characterization of the safe controller is obtained after modifying the governing integral control law with an additive auxiliary control input which is computed via the solution of a quadratic problem. In contrast to prior methods in the relevant literature which can be unnecessarily cautious due to their reliance on the worst case disturbance estimates, our DO-ICBF based controller uses the available control effort frugally by leveraging the disturbance estimates computed by the disturbance observer. By construction, the proposed DO-ICBF based controller can ensure state and input constraint satisfaction at all times. Further, we propose Higher Order DO-ICBFs that extend our proposed method to nonlinear systems with higher relative degree with respect to the auxiliary control input. Finally, numerical simulations are provided to validate our proposed approach.

Submitted 28 September, 2023; originally announced September 2023.

Comments: 8 pages and 7 figures

arXiv:2308.10966 [pdf, other]

Deadlock-free, Safe, and Decentralized Multi-Robot Navigation in Social Mini-Games via Discrete-Time Control Barrier Functions

Authors: Rohan Chandra, Vrushabh Zinage, Efstathios Bakolas, Peter Stone, Joydeep Biswas

Abstract: We present an approach to ensure safe and deadlock-free navigation for decentralized multi-robot systems operating in constrained environments, including doorways and intersections. Although many solutions have been proposed that ensure safety and resolve deadlocks, optimally preventing deadlocks in a minimally invasive and decentralized fashion remains an open problem. We first formalize the objective as a non-cooperative, non-communicative, partially observable multi-robot navigation problem in constrained spaces with multiple conflicting agents, which we term social mini-games. Formally, we solve a discrete-time optimal receding horizon control problem leveraging control barrier functions for safe long-horizon planning. Our approach to ensuring liveness rests on the insight that there exist barrier certificates that allow each robot to preemptively perturb its state in a minimally invasive fashion onto liveness sets, i.e., states where robots are deadlock-free. We evaluate our approach in simulation as well as on physical robots using F1/10 robots, a Clearpath Jackal, and a Boston Dynamics Spot in a doorway, hallway, and corridor intersection scenario. Compared to both fully decentralized and centralized approaches with and without deadlock resolution capabilities, we demonstrate that our approach results in safer, more efficient, and smoother navigation, based on a comprehensive set of metrics including success rate, collision rate, stop time, change in velocity, path deviation, time-to-goal, and flow rate.

Submitted 8 February, 2024; v1 submitted 21 August, 2023; originally announced August 2023.

Comments: major update since last revision

arXiv:2305.15573 [pdf, other]

Semi-global Exponential Stability for Dual Quaternion Based Rigid-Body Tracking Control

Authors: Vrushabh Zinage, S P Arjun Ram, Maruthi R. Akella, Efstathios Bakolas

Abstract: Semi-Global Exponential Stability (SGES) is proved for the combined attitude and position rigid body motion tracking problem, which was previously only known to be asymptotically stable. Dual quaternions are used to jointly represent the rotational and translation tracking error dynamics of the rigid body. A novel nonlinear feedback tracking controller is proposed and a Lyapunov based analysis is provided to prove the semi-global exponential stability of the closed-loop dynamics. Our analysis does not place any restrictions on the reference trajectory or the feedback gains. This stronger SGES result aids in further analyzing the robustness of the rigid body system by establishing Input-to-State Stability (ISS) in the presence of time-varying additive and bounded external disturbances. Motivated by the fact that in many aerospace applications, stringent adherence to safety constraints such as approach path and input constraints is critical for overall mission success, we present a framework for safe control of spacecraft that combines the proposed feedback controller with Control Barrier Functions. Numerical simulations are provided to verify the SGES and ISS results and also showcase the efficacy of the proposed nonlinear feedback controller in several non-trivial scenarios including the Mars Cube One (MarCO) mission, Apollo transposition and docking problem, Starship flip maneuver, collision avoidance of spherical robots, and the rendezvous of SpaceX Dragon 2 with the International Space Station.

Submitted 29 December, 2023; v1 submitted 24 May, 2023; originally announced May 2023.

Comments: 25 pages

arXiv:2212.00398 [pdf, other]

Distributed Model Predictive Covariance Steering

Authors: Augustinos D. Saravanos, Isin M. Balci, Efstathios Bakolas, Evangelos A. Theodorou

Abstract: This paper proposes Distributed Model Predictive Covariance Steering (DMPCS), a novel method for safe multi-robot control under uncertainty. The scope of our approach is to blend covariance steering theory, distributed optimization and model predictive control (MPC) into a single methodology that is safe, scalable and decentralized. Initially, we pose a problem formulation that uses the Wasserstein distance to steer the state distributions of a multi-robot team to desired targets, and probabilistic constraints to ensure safety. We then transform this problem into a finite-dimensional optimization one by utilizing a disturbance feedback policy parametrization for covariance steering and a tractable approximation of the safety constraints. To solve the latter problem, we derive a decentralized consensus-based algorithm using the Alternating Direction Method of Multipliers (ADMM). This method is then extended to a receding horizon form, which yields the proposed DMPCS algorithm. Simulation experiments on large-scale problems with up to hundreds of robots successfully demonstrate the effectiveness and scalability of DMPCS. Its superior capability in achieving safety is also highlighted through a comparison against a standard stochastic MPC approach. A video with all simulation experiments is available at https://youtu.be/Hks-0BRozxA.

Submitted 1 December, 2022; originally announced December 2022.

arXiv:2210.01743 [pdf, other]

Covariance Steering of Discrete-Time Linear Systems with Mixed Multiplicative and Additive Noise

Authors: Isin M. Balci, Efstathios Bakolas

Abstract: In this paper, we study the covariance steering (CS) problem for discrete-time linear systems subject to multiplicative and additive noise. Specifically, we consider two variants of the so-called CS problem. The goal of the first problem, which is called the exact CS problem, is to steer the mean and the covariance of the state process to their desired values in finite time. In the second one, which is called the "relaxed" CS problem, the covariance assignment constraint is relaxed into a positive semi-definite constraint. We show that the relaxed CS problem can be cast as an equivalent convex semi-definite program (SDP) after applying suitable variable transformations and constraint relaxations. Furthermore, we propose a two-step solution procedure for the exact CS problem based on the relaxed problem formulation which returns a feasible solution, if there exists one. Finally, results from numerical experiments are provided to show the efficacy of the proposed solution methods.

Submitted 4 October, 2022; originally announced October 2022.

arXiv:2210.01364 [pdf, other]

Two-Player Reconnaissance Game with Half-Planar Target and Retreat Regions

Authors: Yoonjae Lee, Efstathios Bakolas

Abstract: This paper is concerned with the reconnaissance game that involves two mobile agents: the Intruder and the Defender. The Intruder is tasked to reconnoiter a territory of interest (target region) and then return to a safe zone (retreat region), where the two regions are disjoint half-planes, while being chased by the faster Defender. This paper focuses on the scenario where the Defender is not guaranteed to capture the Intruder before the latter agent reaches the retreat region. The goal of the Intruder is to minimize its distance to the target region, whereas the Defender's goal is to maximize the same distance. The game is decomposed into two phases based on the Intruder's myopic goal. The complete solution of the game corresponding to each phase, namely the Value function and state-feedback equilibrium strategies, is developed in closed-form using differential game methods. Numerical simulation results are presented to showcase the efficacy of our solutions.

Submitted 21 March, 2023; v1 submitted 4 October, 2022; originally announced October 2022.

arXiv:2209.07685 [pdf, other]

Neural Koopman Control Barrier Functions for Safety-Critical Control of Unknown Nonlinear Systems

Authors: Vrushabh Zinage, Efstathios Bakolas

Abstract: We consider the problem of synthesis of safe controllers for nonlinear systems with unknown dynamics using Control Barrier Functions (CBF). We utilize Koopman operator theory (KOT) to associate the (unknown) nonlinear system with a higher dimensional bilinear system and propose a data-driven learning framework that uses a learner and a falsifier to simultaneously learn the Koopman operator based bilinear system and a corresponding CBF. We prove that the learned CBF for the latter bilinear system is also a valid CBF for the unknown nonlinear system by characterizing the $\ell^2$-norm error bound between these two systems. We show that this error can be partially tuned by using the Lipschitz constant of the Koopman based observables. The CBF is then used to formulate a quadratic program to compute inputs that guarantee safety of the unknown nonlinear system. Numerical simulations are presented to validate our approach.

Submitted 15 September, 2022; originally announced September 2022.

Comments: 6 pages
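
The quadratic program mentioned at the end of this abstract has a well-known closed form when there is a single affine safety constraint and no input bounds. The following is a minimal sketch of that generic case, assuming the constraint has already been written as a·u + b >= 0 (for a control-affine system, a and b would come from the Lie derivatives of the barrier function plus a class-K term). It does not reproduce the paper's learned Koopman model, and the function name `cbf_filter` and the example system are hypothetical.

```python
def cbf_filter(u_nom, a, b):
    """Solve min ||u - u_nom||^2 subject to a.u + b >= 0 in closed form.

    If the nominal input already satisfies the constraint it is returned
    unchanged; otherwise it is projected onto the constraint boundary.
    """
    slack = sum(ai * ui for ai, ui in zip(a, u_nom)) + b
    if slack >= 0.0:
        return list(u_nom)
    norm_sq = sum(ai * ai for ai in a)
    if norm_sq == 0.0:
        raise ValueError("constraint is infeasible: a is zero and b < 0")
    scale = slack / norm_sq
    return [ui - scale * ai for ui, ai in zip(u_nom, a)]


# Hypothetical example: single integrator x' = u with safe set x <= 1,
# barrier h(x) = 1 - x, and class-K gain 1, evaluated at x = 0.5.
# The condition h' + h >= 0 becomes -u + (1 - x) >= 0, so a = [-1], b = 0.5.
safe_input = cbf_filter([2.0], [-1.0], 0.5)
unchanged_input = cbf_filter([0.2], [-1.0], 0.5)
```

Here the aggressive nominal input 2.0 is reduced to 0.5, the largest value the constraint allows at this state, while the mild input 0.2 passes through unmodified. With several constraints or input bounds, a numerical QP solver would be needed instead of this projection.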

arXiv:2207.01747 [pdf, other]

Vector Field-based Collision Avoidance for Moving Obstacles with Time-Varying Elliptical Shape

Authors: Martin Braquet, Efstathios Bakolas

Abstract: This paper presents an algorithm for local motion planning in environments populated by moving elliptical obstacles whose velocity, shape and size are fully known but may change with time. We base the algorithm on a collision avoidance vector field (CAVF) that aims to steer an agent to a desired final state whose motion is described by a double integrator kinematic model. In addition to handling multiple obstacles, the method is applicable in bounded environments for more realistic applications (e.g., motion planning inside a building). We also incorporate a method to deal with agents whose control input is limited so that they safely navigate around the obstacles. To showcase our approach, extensive simulation results are presented in 2D and 3D scenarios.

Submitted 23 July, 2022; v1 submitted 4 July, 2022; originally announced July 2022.

Comments: 6 pages, 6 figures, conference

arXiv:2205.10740 [pdf, other]

Exact SDP Formulation for Discrete-Time Covariance Steering with Wasserstein Terminal Cost

Authors: Isin M. Balci, Efstathios Bakolas

Abstract: In this paper, we present new results on the covariance steering problem with Wasserstein distance terminal cost. We show that the state history feedback control policy parametrization, which has been used before to solve this class of problems, requires an unnecessarily large number of variables and can be replaced by a randomized state feedback policy which leads to more tractable problem formulations without any performance loss. In particular, we show that under the latter policy, the problem can be equivalently formulated as a semi-definite program (SDP) which is in sharp contrast with our previous results that could only guarantee that the stochastic optimal control problem can be reduced to a difference of convex functions program. Then, we show that the optimal policy that is found by solving the associated SDP corresponds to a deterministic state feedback policy. Finally, we present non-trivial numerical simulations which show the benefits of our proposed randomized state feedback policy derived from the SDP formulation of the problem over existing approaches in the field in terms of computational efficacy and controller performance.

Submitted 22 May, 2022; originally announced May 2022.
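
For readers unfamiliar with the terminal cost named in this abstract, the squared 2-Wasserstein distance between two Gaussians has a closed form. The sketch below covers only the scalar case as an illustration; it is not the paper's SDP formulation, and the function name `gaussian_w2_squared` is hypothetical.

```python
def gaussian_w2_squared(mean_a, std_a, mean_b, std_b):
    """Squared 2-Wasserstein distance between N(mean_a, std_a^2) and
    N(mean_b, std_b^2) in one dimension.

    The scalar closed form is (mean_a - mean_b)^2 + (std_a - std_b)^2.
    In the multivariate case the second term becomes
    trace(S_a + S_b - 2 (S_b^{1/2} S_a S_b^{1/2})^{1/2}).
    """
    if std_a < 0.0 or std_b < 0.0:
        raise ValueError("standard deviations must be non-negative")
    return (mean_a - mean_b) ** 2 + (std_a - std_b) ** 2


# Terminal cost of ending at N(0, 1) when the target is N(3, 5^2).
terminal_cost = gaussian_w2_squared(0.0, 1.0, 3.0, 5.0)
```

The cost penalizes mismatch in the mean and in the spread separately, and it is zero only when both distributions coincide.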

arXiv:2202.09392 [pdf, other]

Smooth time optimal trajectory generation for drones

Authors: Srinath Tankasala, Can Pehlivanturk, Efstathios Bakolas, Mitch Pryor

Abstract: In this paper, we address a minimum-time steering problem for a drone modeled as a point mass with bounded acceleration, across a set of desired waypoints in the presence of gravity. We first provide a method to solve for the minimum-time control input that will steer the point mass between two waypoints based on a continuous-time problem formulation which we address by using Pontryagin's Minimum Principle. Subsequently, we solve for the time-optimal trajectory across the given set of waypoints by discretizing in the time domain and formulating the minimum-time problem as a nonlinear program (NLP). The velocities at each waypoint obtained from solving the NLP in the discretized domain are then used as boundary conditions to extend our two-point solution across those multiple waypoints. We apply this planning methodology to execute a surveying task that minimizes the time taken to completely explore a target area or volume. Numerical simulations and theoretical analyses of this new planning methodology are presented. The results from our approach are also compared to traditional polynomial trajectories like minimum snap planning.

Submitted 18 February, 2022; originally announced February 2022.

arXiv:2201.05098 [pdf, other]

Neural Koopman Lyapunov Control

Authors: Vrushabh Zinage, Efstathios Bakolas

Abstract: Learning and synthesizing stabilizing controllers for unknown nonlinear control systems is a challenging problem for real-world and industrial applications. Koopman operator theory allows one to analyze nonlinear systems through the lens of linear systems and nonlinear control systems through the lens of bilinear control systems. The key idea of these methods lies in the transformation of the coordinates of the nonlinear system into the Koopman observables, which are coordinates that allow the representation of the original system (control system) as a higher dimensional linear (bilinear control) system. However, for nonlinear control systems, the bilinear control model obtained by applying Koopman operator based learning methods is not necessarily stabilizable. Simultaneous identification of stabilizable lifted bilinear control systems as well as the associated Koopman observables is still an open problem. In this paper, we propose a framework to construct these stabilizable bilinear models and identify their associated observables from data by simultaneously learning a bilinear Koopman embedding for the underlying unknown control affine nonlinear system as well as a Control Lyapunov Function (CLF) for the Koopman based bilinear model using a learner and falsifier. Our proposed approach thereby provides provable guarantees of asymptotic stability for the Koopman based representation of the unknown control affine nonlinear control system as a bilinear system. Numerical simulations are provided to validate the efficacy of our proposed class of stabilizing feedback controllers for unknown control-affine nonlinear systems.

Submitted 22 May, 2022; v1 submitted 13 January, 2022; originally announced January 2022.

arXiv:2111.09455 [pdf, other]

Feedback Strategies for Hypersonic Pursuit of a Ground Evader

Authors: Yoonjae Lee, Efstathios Bakolas, Maruthi R. Akella

Abstract: In this paper, we present a game-theoretic feedback terminal guidance law for an autonomous, unpowered hypersonic pursuit vehicle that seeks to intercept an evading ground target whose motion is constrained in a one-dimensional space. We formulate this problem as a pursuit-evasion game whose saddle point solution is in general difficult to compute onboard the hypersonic vehicle due to its highly nonlinear dynamics. To overcome this computational complexity, we linearize the nonlinear hypersonic dynamics around a reference trajectory and subsequently utilize feedback control design techniques from Linear Quadratic Differential Games (LQDGs). In our proposed guidance algorithm, the hypersonic vehicle computes its open-loop optimal state and input trajectories off-line and prior to the commencement of the game. These trajectories are then used to linearize the nonlinear equations of hypersonic motion. Subsequently, using this linearized system model, we formulate an auxiliary two-player zero-sum LQDG which is effective in the neighborhood of the given reference trajectory and derive its feedback saddle point strategy that allows the hypersonic vehicle to modify its trajectory online in response to the target's evasive maneuvers. We provide numerical simulations to showcase the performance of our proposed guidance law.

Submitted 13 January, 2022; v1 submitted 17 November, 2021; originally announced November 2021.

arXiv:2110.07744 [pdf, other]

Constrained Covariance Steering Based Tube-MPPI

Authors: Isin M. Balci, Efstathios Bakolas, Bogdan Vlahov, Evangelos Theodorou

Abstract: In this paper, we present a new trajectory optimization algorithm for stochastic linear systems which combines Model Predictive Path Integral (MPPI) control with Constrained Covariance Steering (CCS) to achieve high performance with safety guarantees (robustness). Although MPPI can be used to solve complex nonlinear trajectory optimization problems, it may not always handle constraints effectively and its performance may degrade in the presence of unmodeled disturbances. By contrast, CCS can handle probabilistic state and/or input constraints (e.g., chance constraints) and also steer the state covariance of the system to a desired positive definite matrix (control of uncertainty), which together imply that CCS can provide robustness against stochastic disturbances. CCS, however, suffers from scalability issues and cannot handle complex cost functions in general. We argue that the combination of the two methods yields a class of trajectory optimization algorithms that can achieve high performance (a feature of MPPI) while ensuring safety with high probability (a feature of CCS). The efficacy of our algorithm is demonstrated in an obstacle avoidance problem and a circular track path generation problem.

Submitted 19 April, 2022; v1 submitted 14 October, 2021; originally announced October 2021.

arXiv:2110.04967 [pdf, other]

Koopman Operator Based Modeling and Control of Rigid Body Motion Represented by Dual Quaternions

Authors: Vrushabh Zinage, Efstathios Bakolas

Abstract: In this paper, we systematically derive a finite set of Koopman based observables to construct a lifted linear state space model that describes the rigid body dynamics based on the dual quaternion representation. In general, the Koopman operator is a linear infinite dimensional operator, which means that the derived linear state space model of the rigid body dynamics will be infinite-dimensional, which is not suitable for modeling and control design purposes. Recently, finite approximations of the operator computed by means of methods like the Extended Dynamic Mode Decomposition (EDMD) have shown promising results for different classes of problems. However, without using an appropriate set of observables in the EDMD approach, there can be no guarantees that the computed approximation of the nonlinear dynamics is sufficiently accurate. The major challenge in using the Koopman operator for constructing a linear state space model is the choice of observables. State-of-the-art methods in the field compute the approximations of the observables by using neural networks, standard radial basis functions (RBFs), polynomials or heuristic approximations of these functions. However, these observables might not provide a sufficiently accurate approximation or representation of the dynamics. In contrast, we first show the pointwise convergence of the derived observable functions to zero, thereby allowing us to choose a finite set of these observables. Next, we use the derived observables in EDMD to compute the lifted linear state and input matrices for the rigid body dynamics. Finally, we show that an LQR type (linear) controller, which is designed based on the truncated linear state space model, can steer the rigid body to a desired state while its performance is commensurate with that of a nonlinear controller. The efficacy of our approach is demonstrated through numerical simulations.

Submitted 18 September, 2022; v1 submitted 10 October, 2021; originally announced October 2021.

arXiv:2109.08781 [pdf, ps, other]

Minimum-fuel Spacecraft Rendezvous based on Sparsity Promoting Optimization

Authors: Vrushabh Zinage, Efstathios Bakolas

Abstract: In this paper, we consider the classical spacecraft rendezvous problem in which the so-called active spacecraft has to approach the target spacecraft which is moving in an elliptical orbit around a planet by using the minimum possible amount of fuel. Instead of using standard convex optimization tools which can be computationally expensive, we use modified versions of the Iteratively Reweighted Least Squares (IRLS) algorithm from compressive sensing to compute sparse optimal control sequences which minimize the fuel consumption for both thrust vectoring and orthogonal vectoring (active) spacecraft. Numerical simulations are performed to verify the efficacy of our approach.

Submitted 17 September, 2021; originally announced September 2021.

arXiv:2109.07075 [pdf, other]

doi 10.1109/LCSYS.2021.3132083

Guarding a Target Set from a Single Attacker in the Euclidean Space

Authors: Yoonjae Lee, Efstathios Bakolas

Abstract: This paper addresses a two-player target defense game in the $n$-dimensional Euclidean space where an attacker attempts to enter a closed convex target set while a defender strives to capture the attacker beforehand. We provide a complete and universal differential game-based solution which not only encompasses recent work associated with similar problems whose target sets have simple, low-dimensional geometric shapes, but can also address problems that involve nontrivial geometric shapes of high-dimensional target sets. The value functions of the game are derived in a semi-analytical form that includes a convex optimization problem. When the latter problem has a closed-form solution, one of the value functions is used to analytically construct the barrier surface that divides the state space of the game into the winning sets of players. For the case where the barrier surface has no analytical expression but the target set has a smooth boundary, the bijective map between the target boundary and the projection of the barrier surface is obtained. By using the Hamilton-Jacobi-Isaacs equation, we verify that the proposed optimal state feedback strategies always constitute the game's unique saddle point whether or not the optimization problem has a closed-form solution. We illustrate our solutions via numerical simulations.

Submitted 15 September, 2021; originally announced September 2021.

arXiv:2107.07117 [pdf, other]

Collision Avoidance Using Spherical Harmonics

Authors: Steven Patrick, Efstathios Bakolas

Abstract: In this paper, we propose a novel optimization-based trajectory planner that utilizes spherical harmonics to estimate the collision-free solution space around an agent. The space is estimated using a constrained over-determined least-squares estimator to determine the parameters that define a spherical harmonic approximation at a given time step. Since spherical harmonics produce star-convex shapes, the planner can consider all paths that are in line-of-sight for the agent within a given radius. This contrasts with other state-of-the-art planners that generate trajectories by estimating obstacle boundaries with rough approximations and using heuristic rules to prune a solution space into one that can be easily explored. Those methods cause the trajectory planner to be overly conservative in environments where an agent must get close to obstacles to accomplish a goal. Our method is shown to perform on par with other path planners and surpass these planners in certain environments. It generates feasible trajectories while still running in real-time and guaranteeing safety when a valid solution exists.

Submitted 15 July, 2021; originally announced July 2021.

Comments: 6 pages, MECC 2021

arXiv:2107.04078 [pdf, other]

Distributed Coverage Control of Multi-Agent Networks with Guaranteed Collision Avoidance in Cluttered Environments

Authors: Alaa Z. Abdulghafoor, Efstathios Bakolas

Abstract: We propose a distributed control algorithm for a multi-agent network whose agents deploy over a cluttered region in accordance with a time-varying coverage density function while avoiding collisions with all obstacles they encounter. Our algorithm is built on a two-level characterization of the network. The first level treats the multi-agent network as a whole based on the distribution of the locations of its agents over the spatial domain. In the second level, the network is described in terms of the individual positions of its agents. The aim of the multi-agent network is to attain a spatial distribution that resembles that of a reference coverage density function (high-level problem) by means of local (microscopic) interactions of its agents (low-level problem). In addition, as the agents deploy, they must avoid collisions with all the obstacles in the region at all times. Our approach utilizes a modified version of Voronoi tessellations which are comprised of what we refer to as Obstacle-Aware Voronoi Cells (OAVC) in order to enable coverage control while ensuring obstacle avoidance. We consider two control problems. The first problem which we refer to as the high-level coverage control problem corresponds to an interpolation problem in the class of Gaussian mixtures (no collision avoidance requirement), which we solve analytically. The second problem which we refer to as the low-level coverage control problem corresponds to a distributed control problem (collision avoidance requirement is now enforced at all times) which is solved by utilizing Lloyd's algorithm together with the modified Voronoi tessellation (OAVC) and a time-varying coverage density function which corresponds to the solution of the high-level coverage control problem. Finally, simulation results for coverage in a cluttered environment are provided to demonstrate the efficacy of the proposed approach.

Submitted 8 July, 2021; originally announced July 2021.

arXiv:2107.00144 [pdf, other]

Greedy Decentralized Auction-based Task Allocation for Multi-Agent Systems

Authors: Martin Braquet, Efstathios Bakolas

Abstract: We propose a decentralized auction-based algorithm for the solution of dynamic task allocation problems for spatially distributed multi-agent systems. In our approach, each member of the multi-agent team is assigned to at most one task from a set of spatially distributed tasks, while several agents can be allocated to the same task. The task assignment is dynamic since it is updated at discrete time stages (iterations) to account for the current states of the agents as the latter move towards the tasks assigned to them at the previous stage. Our proposed methods can find applications in problems of resource allocation by intelligent machines such as the delivery of packages by a fleet of unmanned or semi-autonomous aerial vehicles. In our approach, the task allocation accounts for both the cost incurred by the agents for the completion of their assigned tasks (e.g., energy or fuel consumption) and the rewards earned for their completion (which may reflect, for instance, the agents' satisfaction). We propose a Greedy Coalition Auction Algorithm (GCAA) in which the agents possess bid vectors representing their best evaluations of the task utilities. The agents propose bids, deduce an allocation based on their bid vectors and update them after each iteration. The solution estimate of the proposed task allocation algorithm converges after a finite number of iterations which cannot exceed the number of agents. Finally, we use numerical simulations to illustrate the effectiveness of the proposed task allocation algorithm (in terms of performance and computation time) in several scenarios involving multiple agents and tasks distributed over a spatial 2D domain.

Submitted 30 June, 2021; originally announced July 2021.

Comments: 8 pages, conference

arXiv:2106.13451 [pdf, ps, other]

doi 10.2514/1.G004446

Collision Avoidance for Unmanned Aerial Vehicles in the Presence of Static and Moving Obstacles

Authors: Andrei Marchidan, Efstathios Bakolas

Abstract: This paper presents a new collision avoidance procedure for unmanned aerial vehicles in the presence of static and moving obstacles. The proposed procedure is based on a new form of local parametrized guidance vector fields, called collision avoidance vector fields, that produce smooth and intuitive maneuvers around obstacles. The maneuvers follow nominal collision-free paths which we refer to as streamlines of the collision avoidance vector fields. In the case of multiple obstacles, the proposed procedure determines a mixed vector field that blends the collision avoidance vector field of each obstacle and assumes its form whenever a pre-defined distance threshold is reached. Then, in accordance with the computed guidance vector fields, different collision avoidance controllers that generate collision-free maneuvers are developed. Furthermore, it is shown that any tracking controller with convergence guarantees can be used with the avoidance controllers to track the streamlines of the collision avoidance vector fields. Finally, numerical simulations demonstrate the efficacy of the proposed approach and its ability to avoid collisions with static and moving pop-up threats in three different practical scenarios.

Submitted 25 June, 2021; originally announced June 2021.

Comments: This is a revised version of the original published in the AIAA Journal of Guidance, Control, and Dynamics, Vol. 43, Iss. 1

Journal ref: Journal of Guidance, Control, and Dynamics 43.1 (2020): 96-110

arXiv:2104.00717 [pdf, other]

doi 10.1109/CDC45484.2021.9683090

Optimal Strategies for Guarding a Compact and Convex Target Set: A Differential Game Approach

Authors: Yoonjae Lee, Efstathios Bakolas

Abstract: We revisit the two-player planar target-defense game initially posed by Isaacs where a pursuer (or defender) attempts to guard a target set from an attack by an evader (or attacker). This paper builds on existing analytical solutions to games of defending a simple shape of target area to develop a generalized and extended solution to the same game with a compact convex target set with smooth boundary. Isaacs' method is applied to address the game of kind and games of degree. A geometric solution approach is used to find the barrier surface that demarcates the winning sets of the players. A value function coupled with a set of optimal state feedback strategies in each winning set is derived and proven to correspond to the saddle point solution of the game. The proposed solutions are illustrated by means of numerical simulations.

Submitted 15 September, 2021; v1 submitted 1 April, 2021; originally announced April 2021.

arXiv:2103.14428 [pdf, other]

Covariance Control of Discrete-Time Gaussian Linear Systems Using Affine Disturbance Feedback Control Policies

Authors: Isin M. Balci, Efstathios Bakolas

Abstract: In this paper, we present a new control policy parametrization for the finite-horizon covariance steering problem for discrete-time Gaussian linear systems (DTGLS) which can reduce the latter stochastic optimal control problem to a tractable optimization problem. The covariance steering problem seeks a feedback control policy that will steer the state covariance of a DTGLS to a desired positive definite matrix in finite time. We consider two different formulations of the covariance steering problem, one with hard terminal LMI constraints (hard-constrained covariance steering) and another one with soft terminal constraints in the form of a terminal cost which corresponds to the squared Wasserstein distance between the actual terminal state (Gaussian) distribution and the desired one (soft-constrained covariance steering). We propose a solution approach that relies on the affine disturbance feedback parametrization for both problem formulations. We show that this particular parametrization allows us to reduce the hard-constrained covariance steering problem into a semi-definite program (SDP) and the soft-constrained covariance steering problem into a difference of convex functions program (DCP). Finally, we show the advantages of our approach over other covariance steering algorithms in terms of computational complexity and computation time by means of theoretical analysis and numerical simulations.

Submitted 26 March, 2021; originally announced March 2021.

arXiv:2103.13579 [pdf, other]

On the Convexity of Discrete Time Covariance Steering in Stochastic Linear Systems with Wasserstein Terminal Cost

Authors: Isin M. Balci, Abhishek Halder, Efstathios Bakolas

Abstract: In this work, we analyze the properties of the solution to the covariance steering problem for discrete time Gaussian linear systems with a squared Wasserstein distance terminal cost. In our previous work, we have shown that by utilizing the state feedback control policy parametrization, this stochastic optimal control problem can be associated with a difference of convex functions program. Here, we revisit the same covariance control problem but this time we focus on the analysis of the problem. Specifically, we establish the existence of solutions to the optimization problem and derive the first and second order conditions for optimality. We provide analytic expressions for the gradient and the Hessian of the performance index by utilizing specialized tools from matrix calculus. Subsequently, we prove that the optimization problem always admits a global minimizer, and finally, we provide a sufficient condition for the performance index to be a strictly convex function (under the latter condition, the problem admits a unique global minimizer). In particular, we show that when the terminal state covariance is upper bounded, with respect to the Löwner partial order, by the covariance matrix of the desired terminal normal distribution, then our problem admits a unique global minimizing state feedback gain. The results of this paper set the stage for the development of specialized control design tools that exploit the structure of the solution to the covariance steering problem with a squared Wasserstein distance terminal cost.

Submitted 24 March, 2021; originally announced March 2021.

arXiv:2103.03363 [pdf, ps, other]

Koopman Operator Based Modeling for Quadrotor Control on $SE(3)$

Authors: Vrushabh Zinage, Efstathios Bakolas

Abstract: In this paper, we propose a Koopman operator based approach to describe the nonlinear dynamics of a quadrotor on SE(3) in terms of an infinite-dimensional linear system which evolves in the space of observable functions (lifted space) and which is more appropriate for control design purposes. The major challenge when using the Koopman operator is the characterization of a set of observable functions that can span the lifted space. Recent methods either use tools from machine learning to learn the observable functions or guess a suitable set of observables that best describes the nonlinear dynamics. Instead of guessing or learning the observables, in this work we derive them in a systematic way for the quadrotor dynamics on SE(3). In addition, we prove that the proposed sequence of observable functions converges pointwise to the zero function, which allows us to select only a finite set of observable functions to form (an approximation of) the lifted space. Our theoretical analysis is also confirmed by numerical simulations which demonstrate that by increasing the dimension of the lifted space, the derived linear state space model can approximate the nonlinear quadrotor dynamics more accurately.

Submitted 9 June, 2021; v1 submitted 4 March, 2021; originally announced March 2021.

Comments: 7 pages, 4 figures

arXiv:2011.05394 [pdf, ps, other]

Minimum Variance and Covariance Steering Based on Affine Disturbance Feedback Control Parameterization

Authors: Efstathios Bakolas

Abstract: The goal of this paper is to address finite-horizon minimum variance and covariance steering problems for discrete-time stochastic (Gaussian) linear systems. On the one hand, the minimum variance problem seeks a control policy that will steer the state mean of an uncertain system to a prescribed quantity while minimizing the trace of its terminal state covariance (or variance). On the other hand, the covariance steering problem seeks a control policy that will steer the covariance of the terminal state to a prescribed positive definite matrix. We propose a solution approach that relies on the stochastic version of the affine disturbance feedback control parametrization according to which the control input at each stage can be expressed as an affine function of the history of disturbances that have acted upon the system. Our analysis reveals that this particular parametrization allows one to reduce the stochastic optimal control problems considered herein into tractable convex programs with essentially the same decision variables. This is in contrast with other control policy parametrizations, such as the state feedback parametrization, in which the decision variables of the convex program do not coincide with the controller's parameters of the stochastic optimal control problem. In addition, we propose a variation of the control parametrization which relies on truncated histories of past disturbances. We show that by selecting the length of the truncated sequences appropriately, we can design suboptimal controllers which can strike the desired balance between performance and computational cost.

Submitted 10 November, 2020; originally announced November 2020.

Comments: 12 pages

arXiv:2010.00778 [pdf, other]

Nonlinear Covariance Steering using Variational Gaussian Process Predictive Models

Authors: Alexandros Tsolovikos, Efstathios Bakolas

Abstract: In this work, we consider the problem of steering the first two moments of the uncertain state of an unknown discrete-time stochastic nonlinear system to a given terminal distribution in finite time. Toward that goal, first, a non-parametric predictive model is learned from a set of available training data points using stochastic variational Gaussian process regression: a powerful and scalable machine learning tool for learning distributions over arbitrary nonlinear functions. Second, we formulate a tractable nonlinear covariance steering algorithm that utilizes the Gaussian process predictive model to compute a feedback policy that will drive the distribution of the state of the system close to the goal distribution. In particular, we implement a greedy covariance steering control policy that linearizes at each time step the Gaussian process model around the latest predicted mean and covariance, solves the linear covariance steering control problem, and applies only the first control law. The state uncertainty under the latest feedback control policy is then propagated using the unscented transform with the learned Gaussian process predictive model and the algorithm proceeds to the next time step. Numerical simulations illustrating the main ideas of this paper are also presented.

Submitted 1 April, 2021; v1 submitted 30 September, 2020; originally announced October 2020.

Comments: 7 pages, 3 figures, submitted to the Modeling, Estimation and Control Conference 2021

arXiv:2009.14407 [pdf, other]

doi 10.23919/ACC50511.2021.9482912

Relay Pursuit of an Evader by a Heterogeneous Group of Pursuers Using Potential Games

Authors: Yoonjae Lee, Efstathios Bakolas

Abstract: We propose a decentralized solution for a pursuit-evasion game involving a heterogeneous group of rational (selfish) pursuers and a single evader based on the framework of potential games. In the proposed game, the evader aims to delay (or, if possible, avoid) capture by any of the pursuers whereas each pursuer tries to capture the latter only if this is in his best interest. Our approach resembles in principle the so-called relay pursuit strategy introduced in [1], in which only the pursuer that can capture the evader faster than the others is active. In sharp contrast with the latter approach, the active pursuer herein is not determined by a reactive ad-hoc rule but from the solution of a corresponding potential game. We assume that each pursuer has different capabilities and his decision whether to go after the evader or not is based on the maximization of his individual utility (conditional on the choices and actions of the other pursuers). The pursuers' utilities depend on both the rewards that they will receive by capturing the evader and the time of capture (cost of capturing the evader) so that a pursuer should only seek capture when the incurred cost is relatively small. The determination of the active pursuer-evader assignments (in other words, which pursuers should be active) is done iteratively by having the pursuers exchange information and updating their own actions by executing a learning algorithm for games known as Spatial Adaptive Play (SAP). We illustrate the performance of our algorithm by means of extensive numerical simulations.

Submitted 29 September, 2020; originally announced September 2020.

Comments: 7 pages, 3 figures

arXiv:2009.14252 [pdf, other]

Covariance Steering of Discrete-Time Stochastic Linear Systems Based on Distribution Distance Terminal Costs

Authors: Isin M. Balci, Efstathios Bakolas

Abstract: We consider a class of stochastic optimal control problems for discrete-time stochastic linear systems which seek control policies that will steer the probability distribution of the terminal state of the system close to a desired Gaussian distribution. In our problem formulation, the closeness between the terminal state distribution and the desired (goal) distribution is measured in terms of the squared Wasserstein distance which is associated with a corresponding terminal cost term. We recast the stochastic optimal control problem as a finite-dimensional nonlinear program and we show that its performance index can be expressed as the difference of two convex functions. This representation of the performance index allows us to find local minimizers of the original nonlinear program via the so-called convex-concave procedure [1]. Subsequently, we consider a similar problem but this time we use a terminal cost that corresponds to the KL divergence. Finally, we present non-trivial numerical simulations to demonstrate the proposed techniques and compare them in terms of computation time.

Submitted 29 September, 2020; originally announced September 2020.

arXiv:2009.13787 [pdf, ps, other]

Far-Field Minimum-Fuel Spacecraft Rendezvous using Koopman Operator and $\ell_2/\ell_1$ Optimization

Authors: Vrushabh Zinage, Efstathios Bakolas

Abstract: We propose a method to compute approximate solutions to the minimum-fuel far-field rendezvous problem for thrust-vectoring spacecraft. It is well-known that the use of linearized spacecraft rendezvous equations may not give sufficiently accurate results for far-field rendezvous. In particular, as the distance between the active and the target spacecraft becomes significantly greater than the distance between the target spacecraft and the center of gravity of the planet, the accuracy of linearization-based control design approaches may decline substantially. In this paper, we use a nonlinear state space model which corresponds to a more accurate description of the dynamics than linearized models but at the same time poses the known challenges of nonlinear control design. To overcome these challenges, we utilize a Koopman operator based approach with which the nonlinear spacecraft rendezvous dynamics are lifted into a higher dimensional space over which they can be approximated by a linear system that is more suitable for control design purposes than the original nonlinear model. An Iteratively Reweighted Least Squares (IRLS) algorithm from compressive sensing is then used to solve the minimum fuel control problem based on the lifted linear system. Numerical simulations are performed to show the efficacy of the proposed Koopman operator based approach.

Submitted 29 September, 2020; originally announced September 2020.

arXiv:2009.08628 [pdf, other]

doi 10.23919/ACC50511.2021.9483030

Decentralized Game-Theoretic Control for Dynamic Task Allocation Problems for Multi-Agent Systems

Authors: Efstathios Bakolas, Yoonjae Lee

Abstract: We propose a decentralized game-theoretic framework for dynamic task allocation problems for multi-agent systems. In our problem formulation, the agents' utilities depend on both the rewards and the costs associated with the successful completion of the tasks assigned to them. The rewards reflect how likely it is for the agents to accomplish their assigned tasks whereas the costs reflect the effort needed to complete these tasks (this effort is determined by the solution of corresponding optimal control problems). The task allocation problem considered herein corresponds to a dynamic game whose solution depends on the states of the agents in contrast with classic static (or single-act) game formulations. We propose a greedy solution approach in which the agents negotiate with each other to find a mutually agreeable (or individually rational) task assignment profile based on evaluations of the task utilities that reflect their current states. We illustrate the main ideas of this work by means of extensive numerical simulations.

Submitted 29 September, 2020; v1 submitted 18 September, 2020; originally announced September 2020.

Comments: 8 pages, 3 figures

arXiv:2009.05891 [pdf, other]

MPC-Based Hierarchical Task Space Control of Underactuated and Constrained Robots for Execution of Multiple Tasks

Authors: Jaemin Lee, Seung Hyeon Bang, Efstathios Bakolas, Luis Sentis

Abstract: This paper proposes an MPC-based controller to efficiently execute multiple hierarchical tasks for underactuated and constrained robotic systems. Existing task-space controllers or whole-body controllers solve instantaneous optimization problems given task trajectories and the robot plant dynamics. However, the task-space control method we propose here relies on the prediction of future state trajectories and the corresponding cost-to-go terms over a finite time horizon for computing control commands. We employ acceleration energy error as the performance index for the optimization problem and extend it over the finite time horizon of our MPC. Our approach employs quadratically constrained quadratic programming, which includes quadratic constraints to handle multiple hierarchical tasks, and is computationally more efficient than nonlinear MPC-based approaches that rely on nonlinear programming. We validate our approach using numerical simulations of a new type of robot manipulator system, which contains underactuated and constrained mechanical structures.

Submitted 12 September, 2020; originally announced September 2020.

Comments: 8 pages, 5 figures

arXiv:2005.05392 [pdf, other]

doi 10.1109/TCNS.2020.3002984

Workspace Partitioning and Topology Discovery Algorithms for Heterogeneous Multi-Agent Networks

Authors: Efstathios Bakolas

Abstract: In this paper, we consider a class of workspace partitioning problems that arise in the context of area coverage and spatial load balancing for spatially distributed heterogeneous multi-agent networks. It is assumed that each agent has certain directions of motion or directions for sensing and exploration that are preferable to others. These preferences are measured by means of convex and anisotropic (direction-dependent) quadratic proximity metrics which are, in general, different for each agent. These proximity metrics induce Voronoi-like partitions of the network's workspace that are composed of cells which may not always be convex (or even connected) sets but are necessarily contained in ellipsoids that are known to their corresponding agents. The main contributions of this work are 1) a distributed algorithm for the computation of a Voronoi-like partition of the workspace of a heterogeneous multi-agent network and 2) a systematic process to discover the network topology induced by the latter Voronoi-like partition. Numerical simulations that illustrate the efficacy of the proposed algorithms are also presented.

Submitted 11 May, 2020; originally announced May 2020.

Comments: 20 pages, 8 figures

arXiv:2003.03727 [pdf, other]

Min-Max Q-Learning for Multi-Player Pursuit-Evasion Games

Authors: Jhanani Selvakumar, Efstathios Bakolas

Abstract: In this paper, we address a pursuit-evasion game involving multiple players by utilizing tools and techniques from reinforcement learning and matrix game theory. In particular, we consider the problem of steering an evader to a goal destination while avoiding capture by multiple pursuers, which is a high-dimensional and computationally intractable problem in general. In our proposed approach, we first formulate the multi-agent pursuit-evasion game as a sequence of discrete matrix games. Next, in order to simplify the solution process, we transform the high-dimensional state space into a low-dimensional manifold and the continuous action space into a feature-based space, which is a discrete abstraction of the original space. Based on these transformed state and action spaces, we employ min-max Q-learning to generate the entries of the payoff matrix of the game, and subsequently obtain the optimal action for the evader at each stage. Finally, we present extensive numerical simulations to evaluate the performance of the proposed learning-based evading strategy in terms of the evader's ability to reach the desired target location without being captured, as well as computational efficiency.

Submitted 8 March, 2020; originally announced March 2020.

arXiv:2003.03679 [pdf, ps, other]

Greedy Finite-Horizon Covariance Steering for Discrete-Time Stochastic Nonlinear Systems Based on the Unscented Transform

Authors: Efstathios Bakolas, Alexandros Tsolovikos

Abstract: In this work, we consider the problem of steering the first two moments of the uncertain state of a discrete-time nonlinear stochastic system to prescribed goal quantities at a given final time. In principle, the latter problem can be formulated as a density tracking problem, which seeks a feedback policy that will keep the probability density function of the state of the system close, in terms of an appropriate metric, to the goal density. Solving the latter infinite-dimensional problem can, however, be a complex and computationally expensive task. Instead, we propose a more tractable and intuitive approach which relies on a greedy control policy. The latter control policy is composed of the first elements of the control policies that solve a sequence of corresponding linearized covariance steering problems. Each of these covariance steering problems relies only on information available about the state mean and state covariance at the current stage and can be formulated as a tractable (finite-dimensional) convex program. At each stage, the information on the state statistics is updated by computing approximations of the predicted state mean and covariance of the resulting closed-loop nonlinear system at the next stage by utilizing the (scaled) unscented transform. Numerical simulations that illustrate the key ideas of our approach are also presented.

Submitted 29 September, 2020; v1 submitted 7 March, 2020; originally announced March 2020.

arXiv:1903.11163 [pdf, other]

Efficient Trajectory Generation for Robotic Systems Constrained by Contact Forces

Authors: Jaemin Lee, Efstathios Bakolas, Luis Sentis

Abstract: In this work, we propose a trajectory generation method for robotic systems with contact force constraints based on optimal control and reachability analysis. Normally, the dynamics and constraints of the contact-constrained robot are nonlinear and coupled to each other. Instead of linearizing the model and constraints, we directly solve the optimal control problem to obtain the feasible state trajectory and the control input of the system. A tractable optimal control problem is formulated and addressed by two approaches: sampling-based dynamic programming and rigorous reachability analysis. The sampling-based method and Partially Observable Markov Decision Process (POMDP) are used to break down the end-to-end trajectory generation problem via sample-wise optimization in terms of given conditions. The result generates sequential pairs of subregions to be passed to reach the final goal. The reachability analysis ensures that we will find at least one trajectory starting from a given initial state and going through a sequence of subregions. The distinctive contribution of our method is that it enables handling of the intricate contact constraints coupled with the system's dynamics, owing to the reduced computational complexity of the algorithm. We validate our method using extensive numerical simulations with a legged robot.

Submitted 26 March, 2019; originally announced March 2019.

Comments: 12 pages, 5 figures

arXiv:1809.10598 [pdf, other]

Trajectory Generation for Robotic Systems with Contact Force Constraints

Authors: Jaemin Lee, Efstathios Bakolas, Luis Sentis

Abstract: This paper presents a trajectory generation method for contact-constrained robotic systems such as manipulators and legged robots. Contact-constrained systems are affected by the interaction forces between the robot and the environment. In turn, these forces determine and constrain state reachability of the robot parts or end effectors. Our study subdivides the trajectory generation problem and the supporting reachability analysis into tractable subproblems consisting of a sampling problem, a convex optimization problem, and a nonlinear programming problem. Our method leads to a significant reduction of computational cost. The proposed approach is validated using a realistic simulated contact-constrained robotic system.

Submitted 27 September, 2018; originally announced September 2018.

Comments: 8 pages, 6 figures

Showing 1–42 of 42 results for author: Bakolas, E