Search | arXiv e-print repository

Second-Order Algorithms for Finding Local Nash Equilibria in Zero-Sum Games

Authors: Kushagra Gupta, Xinjie Liu, Ufuk Topcu, David Fridovich-Keil

Abstract: Zero-sum games arise in a wide variety of problems, including robust optimization and adversarial learning. However, algorithms deployed for finding a local Nash equilibrium in these games often converge to non-Nash stationary points. This highlights a key challenge: for any algorithm, the stability properties of its underlying dynamical system can cause non-Nash points to be potential attractors.… ▽ More Zero-sum games arise in a wide variety of problems, including robust optimization and adversarial learning. However, algorithms deployed for finding a local Nash equilibrium in these games often converge to non-Nash stationary points. This highlights a key challenge: for any algorithm, the stability properties of its underlying dynamical system can cause non-Nash points to be potential attractors. To overcome this challenge, algorithms must account for subtleties involving the curvatures of players' costs. To this end, we leverage dynamical system theory and develop a second-order algorithm for finding a local Nash equilibrium in the smooth, possibly nonconvex-nonconcave, zero-sum game setting. First, we prove that this novel method guarantees convergence to only local Nash equilibria with a local linear convergence rate. We then interpret a version of this method as a modified Gauss-Newton algorithm with local superlinear convergence to the neighborhood of a point that satisfies first-order local Nash equilibrium conditions. In comparison, current related state-of-the-art methods do not offer convergence rate guarantees. Furthermore, we show that this approach naturally generalizes to settings with convex and potentially coupled constraints while retaining earlier guarantees of convergence to only local (generalized) Nash equilibria. △ Less

Submitted 5 June, 2024; originally announced June 2024.

arXiv:2404.00733 [pdf, other]

Smooth Information Gathering in Two-Player Noncooperative Games

Authors: Fernando Palafox, Jesse Milzman, Dong Ho Lee, Ryan Park, David Fridovich-Keil

Abstract: We present a mathematical framework for modeling two-player noncooperative games in which one player (the defender) is uncertain of the costs of the game and the second player's (the attacker's) intention but can preemptively allocate information-gathering resources to reduce this uncertainty. We obtain the defender's decisions by solving a two-stage problem. In Stage 1, the defender allocates inf… ▽ More We present a mathematical framework for modeling two-player noncooperative games in which one player (the defender) is uncertain of the costs of the game and the second player's (the attacker's) intention but can preemptively allocate information-gathering resources to reduce this uncertainty. We obtain the defender's decisions by solving a two-stage problem. In Stage 1, the defender allocates information-gathering resources, and in Stage 2, the information-gathering resources output a signal that informs the defender about the costs of the game and the attacker's intent, and then both players play a noncooperative game. We provide a gradient-based algorithm to solve the two-stage game and apply this framework to a tower-defense game which can be interpreted as a variant of a Colonel Blotto game with smooth payoff functions and uncertainty over battlefield valuations. Finally, we analyze how optimal decisions shift with changes in information-gathering allocations and perturbations in the cost functions. △ Less

Submitted 31 March, 2024; originally announced April 2024.

Comments: https://github.com/CLeARoboticsLab/GamesVoI.jl

arXiv:2403.12210 [pdf, other]

Decomposing Control Lyapunov Functions for Efficient Reinforcement Learning

Authors: Antonio Lopez, David Fridovich-Keil

Abstract: Recent methods using Reinforcement Learning (RL) have proven to be successful for training intelligent agents in unknown environments. However, RL has not been applied widely in real-world robotics scenarios. This is because current state-of-the-art RL methods require large amounts of data to learn a specific task, leading to unreasonable costs when deploying the agent to collect data in real-worl… ▽ More Recent methods using Reinforcement Learning (RL) have proven to be successful for training intelligent agents in unknown environments. However, RL has not been applied widely in real-world robotics scenarios. This is because current state-of-the-art RL methods require large amounts of data to learn a specific task, leading to unreasonable costs when deploying the agent to collect data in real-world applications. In this paper, we build from existing work that reshapes the reward function in RL by introducing a Control Lyapunov Function (CLF), which is demonstrated to reduce the sample complexity. Still, this formulation requires knowing a CLF of the system, but due to the lack of a general method, it is often a challenge to identify a suitable CLF. Existing work can compute low-dimensional CLFs via a Hamilton-Jacobi reachability procedure. However, this class of methods becomes intractable on high-dimensional systems, a problem that we address by using a system decomposition technique to compute what we call Decomposed Control Lyapunov Functions (DCLFs). We use the computed DCLF for reward sha**, which we show improves RL performance. Through multiple examples, we demonstrate the effectiveness of this approach, where our method finds a policy to successfully land a quadcopter in less than half the amount of real-world data required by the state-of-the-art Soft-Actor Critic algorithm. △ Less

Submitted 18 March, 2024; originally announced March 2024.

arXiv:2403.10384 [pdf, other]

Coordination in Noncooperative Multiplayer Matrix Games via Reduced Rank Correlated Equilibria

Authors: Jaehan Im, Yue Yu, David Fridovich-Keil, Ufuk Topcu

Abstract: Coordination in multiplayer games enables players to avoid the lose-lose outcome that often arises at Nash equilibria. However, designing a coordination mechanism typically requires the consideration of the joint actions of all players, which becomes intractable in large-scale games. We develop a novel coordination mechanism, termed reduced rank correlated equilibria, which reduces the number of j… ▽ More Coordination in multiplayer games enables players to avoid the lose-lose outcome that often arises at Nash equilibria. However, designing a coordination mechanism typically requires the consideration of the joint actions of all players, which becomes intractable in large-scale games. We develop a novel coordination mechanism, termed reduced rank correlated equilibria, which reduces the number of joint actions to be considered and thereby mitigates computational complexity. The idea is to approximate the set of all joint actions with the actions used in a set of pre-computed Nash equilibria via a convex hull operation. In a game with n players and each player having m actions, the proposed mechanism reduces the number of joint actions considered from O(m^n) to O(mn). We demonstrate the application of the proposed mechanism to an air traffic queue management problem. Compared with the correlated equilibrium-a popular benchmark coordination mechanism-the proposed approach is capable of solving a problem involving four thousand times more joint actions while yielding similar or better performance in terms of a fairness indicator and showing a maximum optimality gap of 0.066% in terms of the average delay cost. In the meantime, it yields a solution that shows up to 99.5% improvement in a fairness indicator and up to 50.4% reduction in average delay cost compared to the Nash solution, which does not involve coordination. △ Less

Submitted 12 June, 2024; v1 submitted 15 March, 2024; originally announced March 2024.

arXiv:2402.08902 [pdf, other]

Auto-Encoding Bayesian Inverse Games

Authors: Xinjie Liu, Lasse Peters, Javier Alonso-Mora, Ufuk Topcu, David Fridovich-Keil

Abstract: When multiple agents interact in a common environment, each agent's actions impact others' future decisions, and noncooperative dynamic games naturally capture this coupling. In interactive motion planning, however, agents typically do not have access to a complete model of the game, e.g., due to unknown objectives of other players. Therefore, we consider the inverse game problem, in which some pr… ▽ More When multiple agents interact in a common environment, each agent's actions impact others' future decisions, and noncooperative dynamic games naturally capture this coupling. In interactive motion planning, however, agents typically do not have access to a complete model of the game, e.g., due to unknown objectives of other players. Therefore, we consider the inverse game problem, in which some properties of the game are unknown a priori and must be inferred from observations. Existing maximum likelihood estimation (MLE) approaches to solve inverse games provide only point estimates of unknown parameters without quantifying uncertainty, and perform poorly when many parameter values explain the observed behavior. To address these limitations, we take a Bayesian perspective and construct posterior distributions of game parameters. To render inference tractable, we employ a variational autoencoder (VAE) with an embedded differentiable game solver. This structured VAE can be trained from an unlabeled dataset of observed interactions, naturally handles continuous, multi-modal distributions, and supports efficient sampling from the inferred posteriors without computing game solutions at runtime. Extensive evaluations in simulated driving scenarios demonstrate that the proposed approach successfully learns the prior and posterior game parameter distributions, provides more accurate objective estimates than MLE baselines, and facilitates safer and more efficient game-theoretic motion planning. △ Less

Submitted 15 June, 2024; v1 submitted 13 February, 2024; originally announced February 2024.

arXiv:2401.15745 [pdf, other]

The computation of approximate feedback Stackelberg equilibria in multi-player nonlinear constrained dynamic games

Authors: **gqi Li, Somayeh Sojoudi, Claire Tomlin, David Fridovich-Keil

Abstract: Solving feedback Stackelberg games with nonlinear dynamics and coupled constraints, a common scenario in practice, presents significant challenges. This work introduces an efficient method for computing local feedback Stackelberg policies in multi-player general-sum dynamic games, with continuous state and action spaces. Different from existing (approximate) dynamic programming solutions that are… ▽ More Solving feedback Stackelberg games with nonlinear dynamics and coupled constraints, a common scenario in practice, presents significant challenges. This work introduces an efficient method for computing local feedback Stackelberg policies in multi-player general-sum dynamic games, with continuous state and action spaces. Different from existing (approximate) dynamic programming solutions that are primarily designed for unconstrained problems, our approach involves reformulating a feedback Stackelberg dynamic game into a sequence of nested optimization problems, enabling the derivation of Karush-Kuhn-Tucker (KKT) conditions and the establishment of a second-order sufficient condition for local feedback Stackelberg policies. We propose a Newton-style primal-dual interior point method for solving constrained linear quadratic (LQ) feedback Stackelberg games, offering provable convergence guarantees. Our method is further extended to compute local feedback Stackelberg policies for more general nonlinear games by iteratively approximating them using LQ games, ensuring that their KKT conditions are locally aligned with those of the original nonlinear games. We prove the exponential convergence of our algorithm in constrained nonlinear games. In a feedback Stackelberg game with nonlinear dynamics and (nonconvex) coupled costs and constraints, our experimental results reveal the algorithm's ability to handle infeasible initial conditions and achieve exponential convergence towards an approximate local feedback Stackelberg equilibrium. △ Less

Submitted 15 February, 2024; v1 submitted 28 January, 2024; originally announced January 2024.

Comments: This manuscript is currently under review by SIAM Journal on Optimization

arXiv:2309.11076 [pdf, other]

Symbolic Regression on Sparse and Noisy Data with Gaussian Processes

Authors: Junette Hsin, Shubhankar Agarwal, Adam Thorpe, Luis Sentis, David Fridovich-Keil

Abstract: In this paper, we address the challenge of deriving dynamical models from sparse and noisy data. High-quality data is crucial for symbolic regression algorithms; limited and noisy data can present modeling challenges. To overcome this, we combine Gaussian process regression with a sparse identification of nonlinear dynamics (SINDy) method to denoise the data and identify nonlinear dynamical equati… ▽ More In this paper, we address the challenge of deriving dynamical models from sparse and noisy data. High-quality data is crucial for symbolic regression algorithms; limited and noisy data can present modeling challenges. To overcome this, we combine Gaussian process regression with a sparse identification of nonlinear dynamics (SINDy) method to denoise the data and identify nonlinear dynamical equations. Our simple approach offers improved robustness with sparse, noisy data compared to SINDy alone. We demonstrate its effectiveness on a Lotka-Volterra model, a unicycle dynamic model in simulation, and hardware data from an NVIDIA JetRacer system. We show superior performance over baselines including 20.78% improvement over SINDy and 61.92% improvement over SSR in predicting future trajectories from discovered dynamics. △ Less

Submitted 27 March, 2024; v1 submitted 20 September, 2023; originally announced September 2023.

Comments: Submitted to CDC 2024

arXiv:2308.11546 [pdf, other]

Risk-Minimizing Two-Player Zero-Sum Stochastic Differential Game via Path Integral Control

Authors: Apurva Patil, Yu**g Zhou, David Fridovich-Keil, Takashi Tanaka

Abstract: This paper addresses a continuous-time risk-minimizing two-player zero-sum stochastic differential game (SDG), in which each player aims to minimize its probability of failure. Failure occurs in the event when the state of the game enters into predefined undesirable domains, and one player's failure is the other's success. We derive a sufficient condition for this game to have a saddle-point equil… ▽ More This paper addresses a continuous-time risk-minimizing two-player zero-sum stochastic differential game (SDG), in which each player aims to minimize its probability of failure. Failure occurs in the event when the state of the game enters into predefined undesirable domains, and one player's failure is the other's success. We derive a sufficient condition for this game to have a saddle-point equilibrium and show that it can be solved via a Hamilton-Jacobi-Isaacs (HJI) partial differential equation (PDE) with Dirichlet boundary condition. Under certain assumptions on the system dynamics and cost function, we establish the existence and uniqueness of the saddle-point of the game. We provide explicit expressions for the saddle-point policies which can be numerically evaluated using path integral control. This allows us to solve the game online via Monte Carlo sampling of system trajectories. We implement our control synthesis framework on two classes of risk-minimizing zero-sum SDGs: a disturbance attenuation problem and a pursuit-evasion game. Simulation studies are presented to validate the proposed control synthesis framework. △ Less

Submitted 22 August, 2023; originally announced August 2023.

Comments: 8 pages, 4 figures, CDC 2023

arXiv:2308.08017 [pdf, other]

Active Inverse Learning in Stackelberg Trajectory Games

Authors: Yue Yu, Jacob Levy, Negar Mehr, David Fridovich-Keil, Ufuk Topcu

Abstract: Game-theoretic inverse learning is the problem of inferring the players' objectives from their actions. We formulate an inverse learning problem in a Stackelberg game between a leader and a follower, where each player's action is the trajectory of a dynamical system. We propose an active inverse learning method for the leader to infer which hypothesis among a finite set of candidates describes the… ▽ More Game-theoretic inverse learning is the problem of inferring the players' objectives from their actions. We formulate an inverse learning problem in a Stackelberg game between a leader and a follower, where each player's action is the trajectory of a dynamical system. We propose an active inverse learning method for the leader to infer which hypothesis among a finite set of candidates describes the follower's objective function. Instead of using passively observed trajectories like existing methods, the proposed method actively maximizes the differences in the follower's trajectories under different hypotheses to accelerate the leader's inference. We demonstrate the proposed method in a receding-horizon repeated trajectory game. Compared with uniformly random inputs, the leader inputs provided by the proposed method accelerate the convergence of the probability of different hypotheses conditioned on the follower's trajectory by orders of magnitude. △ Less

Submitted 15 August, 2023; originally announced August 2023.

arXiv:2304.01945 [pdf, other]

Scenario-Game ADMM: A Parallelized Scenario-Based Solver for Stochastic Noncooperative Games

Authors: **gqi Li, Chih-Yuan Chiu, Lasse Peters, Fernando Palafox, Mustafa Karabag, Javier Alonso-Mora, Somayeh Sojoudi, Claire Tomlin, David Fridovich-Keil

Abstract: Decision-making in multi-player games can be extremely challenging, particularly under uncertainty. In this work, we propose a new sample-based approximation to a class of stochastic, general-sum, pure Nash games, where each player has an expected-value objective and a set of chance constraints. This new approximation scheme inherits the accuracy of objective approximation from the established sam… ▽ More Decision-making in multi-player games can be extremely challenging, particularly under uncertainty. In this work, we propose a new sample-based approximation to a class of stochastic, general-sum, pure Nash games, where each player has an expected-value objective and a set of chance constraints. This new approximation scheme inherits the accuracy of objective approximation from the established sample average approximation (SAA) method and enjoys a feasibility guarantee derived from the scenario optimization literature. We characterize the sample complexity of this new game-theoretic approximation scheme, and observe that high accuracy usually requires a large number of samples, which results in a large number of sampled constraints. To accommodate this, we decompose the approximated game into a set of smaller games with few constraints for each sampled scenario, and propose a decentralized, consensus-based ADMM algorithm to efficiently compute a generalized Nash equilibrium (GNE) of the approximated game. We prove the convergence of our algorithm to a GNE and empirically demonstrate superior performance relative to a recent baseline algorithm based on ADMM and interior point method. △ Less

Submitted 13 September, 2023; v1 submitted 4 April, 2023; originally announced April 2023.

arXiv:2301.07822 [pdf, other]

GrAVITree: Graph-based Approximate Value Function In a Tree

Authors: Patrick H. Washington, David Fridovich-Keil, Mac Schwager

Abstract: In this paper, we introduce GrAVITree, a tree- and sampling-based algorithm to compute a near-optimal value function and corresponding feedback policy for indefinite time-horizon, terminal state-constrained nonlinear optimal control problems. Our algorithm is suitable for arbitrary nonlinear control systems with both state and input constraints. The algorithm works by sampling feasible control inp… ▽ More In this paper, we introduce GrAVITree, a tree- and sampling-based algorithm to compute a near-optimal value function and corresponding feedback policy for indefinite time-horizon, terminal state-constrained nonlinear optimal control problems. Our algorithm is suitable for arbitrary nonlinear control systems with both state and input constraints. The algorithm works by sampling feasible control inputs and branching backwards in time from the terminal state to build the tree, thereby associating each vertex in the tree with a feasible control sequence to reach the terminal state. Additionally, we embed this stochastic tree within a larger graph structure, rewiring of which enables rapid adaptation to changes in problem structure due to, e.g., newly detected obstacles. Because our method reasons about global problem structure without relying on (potentially imprecise) derivative information, it is particularly well suited to controlling a system based on an imperfect deep neural network model of its dynamics. We demonstrate this capability in the context of an inverted pendulum, where we use a learned model of the pendulum with actuator limits and achieve robust stabilization in settings where competing graph-based and derivative-based techniques fail. △ Less

Submitted 18 January, 2023; originally announced January 2023.

Comments: 6 pages. 8 figures. Submitted to the 2023 American Control Conference

arXiv:2301.01398 [pdf, other]

Cost Inference for Feedback Dynamic Games from Noisy Partial State Observations and Incomplete Trajectories

Authors: **gqi Li, Chih-Yuan Chiu, Lasse Peters, Somayeh Sojoudi, Claire Tomlin, David Fridovich-Keil

Abstract: In multi-agent dynamic games, the Nash equilibrium state trajectory of each agent is determined by its cost function and the information pattern of the game. However, the cost and trajectory of each agent may be unavailable to the other agents. Prior work on using partial observations to infer the costs in dynamic games assumes an open-loop information pattern. In this work, we demonstrate that th… ▽ More In multi-agent dynamic games, the Nash equilibrium state trajectory of each agent is determined by its cost function and the information pattern of the game. However, the cost and trajectory of each agent may be unavailable to the other agents. Prior work on using partial observations to infer the costs in dynamic games assumes an open-loop information pattern. In this work, we demonstrate that the feedback Nash equilibrium concept is more expressive and encodes more complex behavior. It is desirable to develop specific tools for inferring players' objectives in feedback games. Therefore, we consider the dynamic game cost inference problem under the feedback information pattern, using only partial state observations and incomplete trajectory data. To this end, we first propose an inverse feedback game loss function, whose minimizer yields a feedback Nash equilibrium state trajectory closest to the observation data. We characterize the landscape and differentiability of the loss function. Given the difficulty of obtaining the exact gradient, our main contribution is an efficient gradient approximator, which enables a novel inverse feedback game solver that minimizes the loss using first-order optimization. In thorough empirical evaluations, we demonstrate that our algorithm converges reliably and has better robustness and generalization performance than the open-loop baseline method when the observation data reflects a group of players acting in a feedback Nash game. △ Less

Submitted 3 January, 2023; originally announced January 2023.

Comments: Accepted by AAMAS 2023. This is a preprint version

arXiv:2207.06392 [pdf, other]

Relationship Design for Socially-Aware Behavior in Static Games

Authors: Shenghui Chen, Yigit E. Bayiz, David Fridovich-Keil, Ufuk Topcu

Abstract: Autonomous agents can adopt socially-aware behaviors to reduce social costs, mimicking the way animals interact in nature and humans in society. We present a new approach to model socially-aware decision-making that includes two key elements: bounded rationality and inter-agent relationships. We capture the interagent relationships by introducing a novel model called a relationship game and encode… ▽ More Autonomous agents can adopt socially-aware behaviors to reduce social costs, mimicking the way animals interact in nature and humans in society. We present a new approach to model socially-aware decision-making that includes two key elements: bounded rationality and inter-agent relationships. We capture the interagent relationships by introducing a novel model called a relationship game and encode agents' bounded rationality using quantal response equilibria. For each relationship game, we define a social cost function and formulate a mechanism design problem to optimize weights for relationships that minimize social cost at the equilibrium. We address the multiplicity of equilibria by presenting the problem in two forms: Min-Max and Min-Min, aimed respectively at minimization of the highest and lowest social costs in the equilibria. We compute the quantal response equilibrium by solving a least-squares problem defined with its Karush-Kuhn-Tucker conditions, and propose two projected gradient descent algorithms to solve the mechanism design problems. Numerical results, including two-lane congestion and congestion with an ambulance, confirm that these algorithms consistently reach the equilibrium with the intended social costs. △ Less

Submitted 25 January, 2024; v1 submitted 13 July, 2022; originally announced July 2022.

arXiv:2205.00291 [pdf, other]

Learning Mixed Strategies in Trajectory Games

Authors: Lasse Peters, David Fridovich-Keil, Laura Ferranti, Cyrill Stachniss, Javier Alonso-Mora, Forrest Laine

Abstract: In multi-agent settings, game theory is a natural framework for describing the strategic interactions of agents whose objectives depend upon one another's behavior. Trajectory games capture these complex effects by design. In competitive settings, this makes them a more faithful interaction model than traditional "predict then plan" approaches. However, current game-theoretic planning methods have… ▽ More In multi-agent settings, game theory is a natural framework for describing the strategic interactions of agents whose objectives depend upon one another's behavior. Trajectory games capture these complex effects by design. In competitive settings, this makes them a more faithful interaction model than traditional "predict then plan" approaches. However, current game-theoretic planning methods have important limitations. In this work, we propose two main contributions. First, we introduce an offline training phase which reduces the online computational burden of solving trajectory games. Second, we formulate a lifted game which allows players to optimize multiple candidate trajectories in unison and thereby construct more competitive "mixed" strategies. We validate our approach on a number of experiments using the pursuit-evasion game "tag." △ Less

Submitted 3 May, 2022; v1 submitted 30 April, 2022; originally announced May 2022.

arXiv:2203.16690 [pdf, other]

GTP-SLAM: Game-Theoretic Priors for Simultaneous Localization and Map** in Multi-Agent Scenarios

Authors: Chih-Yuan Chiu, David Fridovich-Keil

Abstract: Robots operating in multi-player settings must simultaneously model the environment and the behavior of human or robotic agents who share that environment. This modeling is often approached using Simultaneous Localization and Map** (SLAM); however, SLAM algorithms usually neglect multi-player interactions. In contrast, the motion planning literature often uses dynamic game theory to explicitly m… ▽ More Robots operating in multi-player settings must simultaneously model the environment and the behavior of human or robotic agents who share that environment. This modeling is often approached using Simultaneous Localization and Map** (SLAM); however, SLAM algorithms usually neglect multi-player interactions. In contrast, the motion planning literature often uses dynamic game theory to explicitly model noncooperative interactions of multiple agents in a known environment with perfect localization. Here, we present GTP-SLAM, a novel, iterative best response-based SLAM algorithm that accurately performs state localization and map reconstruction, while using game theoretic priors to capture the inherent non-cooperative interactions among multiple agents in an uncharted scene. By formulating the underlying SLAM problem as a potential game, we inherit a strong convergence guarantee. Empirical results indicate that, when deployed in a realistic traffic simulation, our approach performs localization and map** more accurately than a standard bundle adjustment algorithm across a wide range of noise levels. △ Less

Submitted 8 August, 2022; v1 submitted 30 March, 2022; originally announced March 2022.

Comments: 6 pages, 3 figures

arXiv:2109.07673 [pdf, other]

Back to the Future: Efficient, Time-Consistent Solutions in Reach-Avoid Games

Authors: Dennis R. Anthony, Duy P. Nguyen, David Fridovich-Keil, Jaime F. Fisac

Abstract: We study the class of reach-avoid dynamic games in which multiple agents interact noncooperatively, and each wishes to satisfy a distinct target criterion while avoiding a failure criterion. Reach-avoid games are commonly used to express safety-critical optimal control problems found in mobile robot motion planning. Here, we focus on finding time-consistent solutions, in which future motion plans… ▽ More We study the class of reach-avoid dynamic games in which multiple agents interact noncooperatively, and each wishes to satisfy a distinct target criterion while avoiding a failure criterion. Reach-avoid games are commonly used to express safety-critical optimal control problems found in mobile robot motion planning. Here, we focus on finding time-consistent solutions, in which future motion plans remain optimal even when a robot diverges from the plan early on due to, e.g., intrinsic dynamic uncertainty or extrinsic environment disturbances. Our main contribution is a computationally-efficient algorithm for multi-agent reach-avoid games which renders time-consistent solutions for all players. We demonstrate our approach in two- and three-player simulated driving scenarios, in which our method provides safe control strategies for all agents. △ Less

Submitted 2 March, 2022; v1 submitted 15 September, 2021; originally announced September 2021.

Comments: accepted to ICRA 2022

arXiv:2011.04815 [pdf, other]

Encoding Defensive Driving as a Dynamic Nash Game

Authors: Chih-Yuan Chiu, David Fridovich-Keil, Claire J. Tomlin

Abstract: Robots deployed in real-world environments should operate safely in a robust manner. In scenarios where an "ego" agent navigates in an environment with multiple other "non-ego" agents, two modes of safety are commonly proposed -- adversarial robustness and probabilistic constraint satisfaction. However, while the former is generally computationally intractable and leads to overconservative solutio… ▽ More Robots deployed in real-world environments should operate safely in a robust manner. In scenarios where an "ego" agent navigates in an environment with multiple other "non-ego" agents, two modes of safety are commonly proposed -- adversarial robustness and probabilistic constraint satisfaction. However, while the former is generally computationally intractable and leads to overconservative solutions, the latter typically relies on strong distributional assumptions and ignores strategic coupling between agents. To avoid these drawbacks, we present a novel formulation of robustness within the framework of general-sum dynamic game theory, modeled on defensive driving. More precisely, we prepend an adversarial phase to the ego agent's cost function. That is, we prepend a time interval during which other agents are assumed to be temporarily distracted, in order to render the ego agent's equilibrium trajectory robust against other agents' potentially dangerous behavior during this time. We demonstrate the effectiveness of our new formulation in encoding safety via multiple traffic scenarios. △ Less

Submitted 30 March, 2021; v1 submitted 9 November, 2020; originally announced November 2020.

Comments: Accepted to ICRA 2021

arXiv:2011.00601 [pdf, other]

Approximate Solutions to a Class of Reachability Games

Authors: David Fridovich-Keil, Claire J. Tomlin

Abstract: In this paper, we present a method for finding approximate Nash equilibria in a broad class of reachability games. These games are often used to formulate both collision avoidance and goal satisfaction. Our method is computationally efficient, running in real-time for scenarios involving multiple players and more than ten state dimensions. The proposed approach forms a family of increasingly exact… ▽ More In this paper, we present a method for finding approximate Nash equilibria in a broad class of reachability games. These games are often used to formulate both collision avoidance and goal satisfaction. Our method is computationally efficient, running in real-time for scenarios involving multiple players and more than ten state dimensions. The proposed approach forms a family of increasingly exact approximations to the original game. Our results characterize the quality of these approximations and show operation in a receding horizon, minimally-invasive control context. Additionally, as a special case, our method reduces to local gradient-based optimization in the single-player (optimal control) setting, for which a wide variety of efficient algorithms exist. △ Less

Submitted 20 March, 2021; v1 submitted 1 November, 2020; originally announced November 2020.

Comments: Conference paper at ICRA 2021

arXiv:2002.04354 [pdf, other]

Inference-Based Strategy Alignment for General-Sum Differential Games

Authors: Lasse Peters, David Fridovich-Keil, Claire J. Tomlin, Zachary N. Sunberg

Abstract: In many settings where multiple agents interact, the optimal choices for each agent depend heavily on the choices of the others. These coupled interactions are well-described by a general-sum differential game, in which players have differing objectives, the state evolves in continuous time, and optimal play may be characterized by one of many equilibrium concepts, e.g., a Nash equilibrium. Often,… ▽ More In many settings where multiple agents interact, the optimal choices for each agent depend heavily on the choices of the others. These coupled interactions are well-described by a general-sum differential game, in which players have differing objectives, the state evolves in continuous time, and optimal play may be characterized by one of many equilibrium concepts, e.g., a Nash equilibrium. Often, problems admit multiple equilibria. From the perspective of a single agent in such a game, this multiplicity of solutions can introduce uncertainty about how other agents will behave. This paper proposes a general framework for resolving ambiguity between equilibria by reasoning about the equilibrium other agents are aiming for. We demonstrate this framework in simulations of a multi-player human-robot navigation problem that yields two main conclusions: First, by inferring which equilibrium humans are operating at, the robot is able to predict trajectories more accurately, and second, by discovering and aligning itself to this equilibrium the robot is able to reduce the cost for all players. △ Less

Submitted 6 May, 2020; v1 submitted 11 February, 2020; originally announced February 2020.

arXiv:1910.13272 [pdf, other]

Feedback Linearization for Unknown Systems via Reinforcement Learning

Authors: Tyler Westenbroek, David Fridovich-Keil, Eric Mazumdar, Shreyas Arora, Valmik Prabhu, S. Shankar Sastry, Claire J. Tomlin

Abstract: We present a novel approach to control design for nonlinear systems which leverages model-free policy optimization techniques to learn a linearizing controller for a physical plant with unknown dynamics. Feedback linearization is a technique from nonlinear control which renders the input-output dynamics of a nonlinear plant \emph{linear} under application of an appropriate feedback controller. Onc… ▽ More We present a novel approach to control design for nonlinear systems which leverages model-free policy optimization techniques to learn a linearizing controller for a physical plant with unknown dynamics. Feedback linearization is a technique from nonlinear control which renders the input-output dynamics of a nonlinear plant \emph{linear} under application of an appropriate feedback controller. Once a linearizing controller has been constructed, desired output trajectories for the nonlinear plant can be tracked using a variety of linear control techniques. However, the calculation of a linearizing controller requires a precise dynamics model for the system. As a result, model-based approaches for learning exact linearizing controllers generally require a simple, highly structured model of the system with easily identifiable parameters. In contrast, the model-free approach presented in this paper is able to approximate the linearizing controller for the plant using general function approximation architectures. Specifically, we formulate a continuous-time optimization problem over the parameters of a learned linearizing controller whose optima are the set of parameters which best linearize the plant. We derive conditions under which the learning problem is (strongly) convex and provide guarantees which ensure the true linearizing controller for the plant is recovered. We then discuss how model-free policy optimization algorithms can be used to solve a discrete-time approximation to the problem using data collected from the real-world plant. The utility of the framework is demonstrated in simulation and on a real-world robotic platform. △ Less

Submitted 21 April, 2020; v1 submitted 29 October, 2019; originally announced October 2019.

arXiv:1910.00681 [pdf, other]

An Iterative Quadratic Method for General-Sum Differential Games with Feedback Linearizable Dynamics

Authors: David Fridovich-Keil, Vicenc Rubies-Royo, Claire J. Tomlin

Abstract: Iterative linear-quadratic (ILQ) methods are widely used in the nonlinear optimal control community. Recent work has applied similar methodology in the setting of multiplayer general-sum differential games. Here, ILQ methods are capable of finding local equilibria in interactive motion planning problems in real-time. As in most iterative procedures, however, this approach can be sensitive to initi… ▽ More Iterative linear-quadratic (ILQ) methods are widely used in the nonlinear optimal control community. Recent work has applied similar methodology in the setting of multiplayer general-sum differential games. Here, ILQ methods are capable of finding local equilibria in interactive motion planning problems in real-time. As in most iterative procedures, however, this approach can be sensitive to initial conditions and hyperparameter choices, which can result in poor computational performance or even unsafe trajectories. In this paper, we focus our attention on a broad class of dynamical systems which are feedback linearizable, and exploit this structure to improve both algorithmic reliability and runtime. We showcase our new algorithm in three distinct traffic scenarios, and observe that in practice our method converges significantly more often and more quickly than was possible without exploiting the feedback linearizable structure. △ Less

Submitted 19 March, 2020; v1 submitted 1 October, 2019; originally announced October 2019.

Comments: 7 pages, 5 figures, accepted to IEEE International Conference on Robotics and Automation (2020)

arXiv:1909.04694 [pdf, other]

Efficient Iterative Linear-Quadratic Approximations for Nonlinear Multi-Player General-Sum Differential Games

Authors: David Fridovich-Keil, Ellis Ratner, Lasse Peters, Anca D. Dragan, Claire J. Tomlin

Abstract: Many problems in robotics involve multiple decision making agents. To operate efficiently in such settings, a robot must reason about the impact of its decisions on the behavior of other agents. Differential games offer an expressive theoretical framework for formulating these types of multi-agent problems. Unfortunately, most numerical solution techniques scale poorly with state dimension and are… ▽ More Many problems in robotics involve multiple decision making agents. To operate efficiently in such settings, a robot must reason about the impact of its decisions on the behavior of other agents. Differential games offer an expressive theoretical framework for formulating these types of multi-agent problems. Unfortunately, most numerical solution techniques scale poorly with state dimension and are rarely used in real-time applications. For this reason, it is common to predict the future decisions of other agents and solve the resulting decoupled, i.e., single-agent, optimal control problem. This decoupling neglects the underlying interactive nature of the problem; however, efficient solution techniques do exist for broad classes of optimal control problems. We take inspiration from one such technique, the iterative linear-quadratic regulator (ILQR), which solves repeated approximations with linear dynamics and quadratic costs. Similarly, our proposed algorithm solves repeated linear-quadratic games. We experimentally benchmark our algorithm in several examples with a variety of initial conditions and show that the resulting strategies exhibit complex interactive behavior. Our results indicate that our algorithm converges reliably and runs in real-time. In a three-player, 14-state simulated intersection problem, our algorithm initially converges in < 0.25s. Receding horizon invocations converge in < 50 ms in a hardware collision-avoidance test. △ Less

Submitted 18 March, 2020; v1 submitted 10 September, 2019; originally announced September 2019.

Comments: 8 pages, 4 figures, accepted to the IEEE International Conference on Robotics and Automation

arXiv:1811.07834 [pdf, other]

Safely Probabilistically Complete Real-Time Planning and Exploration in Unknown Environments

Authors: David Fridovich-Keil, Jaime F. Fisac, Claire J. Tomlin

Abstract: We present a new framework for motion planning that wraps around existing kinodynamic planners and guarantees recursive feasibility when operating in a priori unknown, static environments. Our approach makes strong guarantees about overall safety and collision avoidance by utilizing a robust controller derived from reachability analysis. We ensure that motion plans never exit the safe backward rea… ▽ More We present a new framework for motion planning that wraps around existing kinodynamic planners and guarantees recursive feasibility when operating in a priori unknown, static environments. Our approach makes strong guarantees about overall safety and collision avoidance by utilizing a robust controller derived from reachability analysis. We ensure that motion plans never exit the safe backward reachable set of the initial state, while safely exploring the space. This preserves the safety of the initial state, and guarantees that that we will eventually find the goal if it is possible to do so while exploring safely. We implement our framework in the Robot Operating System (ROS) software environment and demonstrate it in a real-time simulation. △ Less

Submitted 6 March, 2019; v1 submitted 19 November, 2018; originally announced November 2018.

Comments: 7 pages, accepted to ICRA 2019

arXiv:1806.06790 [pdf, other]

Towards Distributed Energy Services: Decentralizing Optimal Power Flow with Machine Learning

Authors: Roel Dobbe, Oscar Sondermeijer, David Fridovich-Keil, Daniel Arnold, Duncan Callaway, Claire Tomlin

Abstract: The implementation of optimal power flow (OPF) methods to perform voltage and power flow regulation in electric networks is generally believed to require extensive communication. We consider distribution systems with multiple controllable Distributed Energy Resources (DERs) and present a data-driven approach to learn control policies for each DER to reconstruct and mimic the solution to a centrali… ▽ More The implementation of optimal power flow (OPF) methods to perform voltage and power flow regulation in electric networks is generally believed to require extensive communication. We consider distribution systems with multiple controllable Distributed Energy Resources (DERs) and present a data-driven approach to learn control policies for each DER to reconstruct and mimic the solution to a centralized OPF problem from solely locally available information. Collectively, all local controllers closely match the centralized OPF solution, providing near optimal performance and satisfaction of system constraints. A rate distortion framework enables the analysis of how well the resulting fully decentralized control policies are able to reconstruct the OPF solution. The methodology provides a natural extension to decide what nodes a DER should communicate with to improve the reconstruction of its individual policy. The method is applied on both single- and three-phase test feeder networks using data from real loads and distributed generators, focusing on DERs that do not exhibit inter-temporal dependencies. It provides a framework for Distribution System Operators to efficiently plan and operate the contributions of DERs to achieve Distributed Energy Services in distribution networks. △ Less

Submitted 13 August, 2019; v1 submitted 14 June, 2018; originally announced June 2018.

Comments: Accepted for publication. To appear in the IEEE Transactions on Smart Grid

arXiv:1710.04731 [pdf, other]

Planning, Fast and Slow: A Framework for Adaptive Real-Time Safe Trajectory Planning

Authors: David Fridovich-Keil, Sylvia L. Herbert, Jaime F. Fisac, Sampada Deglurkar, Claire J. Tomlin

Abstract: Motion planning is an extremely well-studied problem in the robotics community, yet existing work largely falls into one of two categories: computationally efficient but with few if any safety guarantees, or able to give stronger guarantees but at high computational cost. This work builds on a recent development called FaSTrack in which a slow offline computation provides a modular safety guarante… ▽ More Motion planning is an extremely well-studied problem in the robotics community, yet existing work largely falls into one of two categories: computationally efficient but with few if any safety guarantees, or able to give stronger guarantees but at high computational cost. This work builds on a recent development called FaSTrack in which a slow offline computation provides a modular safety guarantee for a faster online planner. We introduce the notion of "meta-planning" in which a refined offline computation enables safe switching between different online planners. This provides autonomous systems with the ability to adapt motion plans to a priori unknown environments in real-time as sensor measurements detect new obstacles, and the flexibility to maneuver differently in the presence of obstacles than they would in free space, all while maintaining a strict safety guarantee. We demonstrate the meta-planning algorithm both in simulation and in hardware using a small Crazyflie 2.0 quadrotor. △ Less

Submitted 6 March, 2018; v1 submitted 12 October, 2017; originally announced October 2017.

Comments: ICRA, International Conference on Robotics and Automation, ICRA 2018, 8 pages, 9 figures

arXiv:1707.06334 [pdf, other]

Fully Decentralized Policies for Multi-Agent Systems: An Information Theoretic Approach

Authors: Roel Dobbe, David Fridovich-Keil, Claire Tomlin

Abstract: Learning cooperative policies for multi-agent systems is often challenged by partial observability and a lack of coordination. In some settings, the structure of a problem allows a distributed solution with limited communication. Here, we consider a scenario where no communication is available, and instead we learn local policies for all agents that collectively mimic the solution to a centralized… ▽ More Learning cooperative policies for multi-agent systems is often challenged by partial observability and a lack of coordination. In some settings, the structure of a problem allows a distributed solution with limited communication. Here, we consider a scenario where no communication is available, and instead we learn local policies for all agents that collectively mimic the solution to a centralized multi-agent static optimization problem. Our main contribution is an information theoretic framework based on rate distortion theory which facilitates analysis of how well the resulting fully decentralized policies are able to reconstruct the optimal solution. Moreover, this framework provides a natural extension that addresses which nodes an agent should communicate with to improve the performance of its individual policy. △ Less

Submitted 29 July, 2017; v1 submitted 19 July, 2017; originally announced July 2017.

MSC Class: 68

Showing 1–26 of 26 results for author: Fridovich-Keil, D