-
Layered Graph Security Games
Authors:
Jakub Černý,
Chun Kai Ling,
Christian Kroer,
Garud Iyengar
Abstract:
Security games model strategic interactions in adversarial real-world applications. Such applications often involve extremely large but highly structured strategy sets (e.g., selecting a distribution over all patrol routes in a given graph). In this paper, we represent each player's strategy space using a layered graph whose paths represent an exponentially large strategy space. Our formulation captures not only classic pursuit-evasion games, but also other security games, such as those modeling anti-terrorism and logistical interdiction. We study two-player zero-sum games under two distinct utility models: linear and binary utilities. We show that under linear utilities, a Nash equilibrium can be computed in polynomial time, while binary utilities may lead to situations where even computing a best response is computationally intractable. To this end, we propose a practical algorithm based on incremental strategy generation and mixed integer linear programs. We show through extensive experiments that our algorithm efficiently computes $ε$-equilibria for many games of interest. We find that target values and graph structure often influence running times more than the size of the graph per se.
Submitted 9 May, 2024; v1 submitted 5 May, 2024;
originally announced May 2024.
-
Grasper: A Generalist Pursuer for Pursuit-Evasion Problems
Authors:
Pengdeng Li,
Shuxin Li,
Xinrun Wang,
Jakub Cerny,
Youzhi Zhang,
Stephen McAleer,
Hau Chan,
Bo An
Abstract:
Pursuit-evasion games (PEGs) model interactions between a team of pursuers and an evader in graph-based environments such as urban street networks. Recent advancements have demonstrated the effectiveness of the pre-training and fine-tuning paradigm in PSRO to improve scalability in solving large-scale PEGs. However, these methods primarily focus on specific PEGs with fixed initial conditions that may vary substantially in real-world scenarios, which significantly hinders the applicability of the traditional methods. To address this issue, we introduce Grasper, a GeneRAlist purSuer for Pursuit-Evasion pRoblems, capable of efficiently generating pursuer policies tailored to specific PEGs. Our contributions are threefold: First, we present a novel architecture that offers high-quality solutions for diverse PEGs, comprising critical components such as (i) a graph neural network (GNN) to encode PEGs into hidden vectors, and (ii) a hypernetwork to generate pursuer policies based on these hidden vectors. As a second contribution, we develop an efficient three-stage training method involving (i) a pre-pretraining stage for learning robust PEG representations through self-supervised graph learning techniques like GraphMAE, (ii) a pre-training stage utilizing heuristic-guided multi-task pre-training (HMP) where heuristic-derived reference policies (e.g., through Dijkstra's algorithm) regularize pursuer policies, and (iii) a fine-tuning stage that employs PSRO to generate pursuer policies on designated PEGs. Finally, we perform extensive experiments on synthetic and real-world maps, showcasing Grasper's significant superiority over baselines in terms of solution quality and generalizability. We demonstrate that Grasper provides a versatile approach for solving pursuit-evasion problems across a broad range of scenarios, enabling practical deployment in real-world situations.
Submitted 19 April, 2024;
originally announced April 2024.
-
Reducing Optimism Bias in Incomplete Cooperative Games
Authors:
Filip Úradník,
David Sychrovský,
Jakub Černý,
Martin Černý
Abstract:
Cooperative game theory has diverse applications in contemporary artificial intelligence, including domains like interpretable machine learning, resource allocation, and collaborative decision-making. However, specifying a cooperative game entails assigning values to exponentially many coalitions, and obtaining even a single value can be resource-intensive in practice. Yet simply leaving certain coalition values undisclosed introduces ambiguity regarding individual contributions to the collective grand coalition. This ambiguity often leads to players holding overly optimistic expectations, stemming from either inherent biases or strategic considerations, frequently resulting in collective claims exceeding the actual grand coalition value. In this paper, we present a framework aimed at optimizing the sequence for revealing coalition values, with the overarching goal of efficiently closing the gap between players' expectations and achievable outcomes in cooperative games. Our contributions are threefold: (i) we study the individual players' optimistic completions of games with missing coalition values along with the arising gap, and investigate its analytical characteristics that facilitate more efficient optimization; (ii) we develop methods to minimize this gap over classes of games with a known prior by disclosing values of additional coalitions in both offline and online fashion; and (iii) we empirically demonstrate the algorithms' performance in practical scenarios, together with an investigation into the typical order of revealing coalition values.
Submitted 19 February, 2024; v1 submitted 2 February, 2024;
originally announced February 2024.
-
Price of Anarchy in a Double-Sided Critical Distribution System
Authors:
David Sychrovský,
Jakub Černý,
Sylvain Lichau,
Martin Loebl
Abstract:
Measures of allocation optimality differ significantly between distributing standard tradable goods in peaceful times and distributing scarce resources in crises. While realistic markets offer asymptotic efficiency, they do not necessarily guarantee the fair allocation that is desirable when distributing critical resources. To achieve fairness, mechanisms often rely on a central authority, which may act inefficiently in times of need when swiftness and good organization are crucial. In this work, we study a hybrid trading system called Crisdis, introduced by Jedličková et al., which combines fair allocation of buying rights with a market, leveraging the best of both worlds. The frustration of a buyer in Crisdis is defined as the difference between the amount of goods they are entitled to according to the assigned buying rights and the amount of goods they are able to acquire by trading. We define a Price of Anarchy (PoA) in this system as a conceptual analogue of the original definition in the context of frustration. Our main contribution is a study of PoA in realistic complex double-sided market mechanisms for Crisdis. The performed empirical analysis suggests that, in contrast to a market free of governmental interventions, the PoA in our system decreases.
Submitted 20 February, 2023;
originally announced February 2023.
-
Offline Equilibrium Finding
Authors:
Shuxin Li,
Xinrun Wang,
Youzhi Zhang,
Jakub Cerny,
Pengdeng Li,
Hau Chan,
Bo An
Abstract:
Offline reinforcement learning (offline RL) is an emerging field that has recently begun gaining attention across various application domains due to its ability to learn strategies from previously collected datasets. Offline RL has proved very successful, paving a path to solving previously intractable real-world problems, and we aim to generalize this paradigm to a multiplayer-game setting. To this end, we introduce the problem of offline equilibrium finding (OEF) and construct multiple types of datasets across a wide range of games using several established methods. To solve the OEF problem, we design a model-based framework that can directly apply any online equilibrium finding algorithm to the OEF setting while making minimal changes. The three most prominent contemporary online equilibrium finding algorithms are adapted to the context of OEF, creating three model-based variants: OEF-PSRO and OEF-CFR, which generalize the widely used algorithms PSRO and Deep CFR to compute Nash equilibria (NEs), and OEF-JPSRO, which generalizes JPSRO to calculate (Coarse) Correlated equilibria ((C)CEs). We also combine the behavior cloning policy with the model-based policy to further improve the performance and provide a theoretical guarantee of the solution quality. Extensive experimental results demonstrate the superiority of our approach over offline RL algorithms and the importance of using model-based methods for OEF problems. We hope our work will contribute to advancing research in large-scale equilibrium finding.
Submitted 3 February, 2023; v1 submitted 11 July, 2022;
originally announced July 2022.
-
A Unified Perspective on Deep Equilibrium Finding
Authors:
Xinrun Wang,
Jakub Cerny,
Shuxin Li,
Chang Yang,
Zhuyun Yin,
Hau Chan,
Bo An
Abstract:
Extensive-form games provide a versatile framework for modeling interactions of multiple agents subjected to imperfect observations and stochastic events. In recent years, two paradigms, policy space response oracles (PSRO) and counterfactual regret minimization (CFR), showed that extensive-form games may indeed be solved efficiently. Both of them are capable of leveraging deep neural networks to tackle the scalability issues inherent to extensive-form games, and we refer to them as deep equilibrium-finding algorithms. Even though PSRO and CFR share some similarities, they are often regarded as distinct, and the question of which of the two is superior remains unresolved. Instead of answering this question directly, in this work we propose a unified perspective on deep equilibrium finding that generalizes both PSRO and CFR. Our four main contributions include: i) a novel response oracle (RO) which computes Q values as well as reaching probability values and baseline values; ii) two transform modules -- a pre-transform and a post-transform -- represented by neural networks transforming the outputs of RO to a latent additive space (LAS), and then the LAS to action probabilities for execution; iii) two average oracles -- local average oracle (LAO) and global average oracle (GAO) -- where LAO operates on LAS and GAO is used for evaluation only; and iv) a novel method inspired by fictitious play that optimizes the transform modules and average oracles, and automatically selects the optimal combination of components of the two frameworks. Experiments on the Leduc poker game demonstrate that our approach can outperform both frameworks.
Submitted 11 April, 2022;
originally announced April 2022.
-
Complexity and Algorithms for Exploiting Quantal Opponents in Large Two-Player Games
Authors:
David Milec,
Jakub Černý,
Viliam Lisý,
Bo An
Abstract:
Solution concepts of traditional game theory assume entirely rational players; therefore, their ability to exploit subrational opponents is limited. One type of subrationality that describes human behavior well is the quantal response. While there exist algorithms for computing solutions against quantal opponents, they either do not scale or may provide strategies that are even worse than the entirely-rational Nash strategies. This paper aims to analyze and propose scalable algorithms for computing effective and robust strategies against a quantal opponent in normal-form and extensive-form games. Our contributions are: (1) we define two different solution concepts related to exploiting quantal opponents and analyze their properties; (2) we prove that computing these solutions is computationally hard; (3) therefore, we evaluate several heuristic approximations based on scalable counterfactual regret minimization (CFR); and (4) we identify a CFR variant that exploits the bounded opponents better than the previously used variants while being less exploitable by the worst-case perfectly-rational opponent.
Submitted 16 December, 2020; v1 submitted 30 September, 2020;
originally announced September 2020.
-
Computing Ex Ante Coordinated Team-Maxmin Equilibria in Zero-Sum Multiplayer Extensive-Form Games
Authors:
Youzhi Zhang,
Bo An,
Jakub Černý
Abstract:
Computational game theory has many applications in the modern world in both adversarial situations and the optimization of social good. While there exist many algorithms for computing solutions in two-player interactions, finding optimal strategies in multiplayer interactions efficiently remains an open challenge. This paper focuses on computing the multiplayer Team-Maxmin Equilibrium with Coordination device (TMECor) in zero-sum extensive-form games. TMECor models scenarios in which a team of players coordinates ex ante against an adversary. Such situations can be found in card games (e.g., in Bridge and Poker), when a team works together to beat a target player but communication is prohibited; and also in the real world, e.g., in forest-protection operations, when coordinated groups have limited contact while interdicting illegal loggers. The existing algorithms struggle to find a TMECor efficiently because of their high computational costs. To compute a TMECor in larger games, we make the following key contributions: (1) we propose a hybrid-form strategy representation for the team, which preserves the set of equilibria; (2) we introduce a column-generation algorithm with guaranteed finite-time convergence in the infinite strategy space based on a novel best-response oracle; (3) we develop an associated-representation technique for the exact representation of the multilinear terms in the best-response oracle; and (4) we experimentally show that our algorithm is several orders of magnitude faster than prior state-of-the-art algorithms in large games.
Submitted 18 July, 2021; v1 submitted 26 September, 2020;
originally announced September 2020.
-
Improvement on the decay of crossing numbers
Authors:
Jakub Černý,
Jan Kynčl,
Géza Tóth
Abstract:
We prove that the crossing number of a graph decays in a continuous fashion in the following sense. For any $ε>0$ there is a $δ>0$ such that, for sufficiently large $n$, every graph $G$ with $n$ vertices and $m > n^{1+ε}$ edges has a subgraph $G'$ with at most $(1-δ)m$ edges and crossing number at least $(1-ε)\mathrm{cr}(G)$. This generalizes the result of J. Fox and Cs. Toth.
Submitted 25 January, 2012;
originally announced January 2012.