Search | arXiv e-print repository

Last-iterate Convergence Separation between Extra-gradient and Optimism in Constrained Periodic Games

Authors: Yi Feng, ** Li, Ioannis Panageas, Xiao Wang

Abstract: Last-iterate behaviors of learning algorithms in repeated two-player zero-sum games have been extensively studied due to their wide applications in machine learning and related tasks. Typical algorithms that exhibit the last-iterate convergence property include optimistic and extra-gradient methods. However, most existing results establish these properties under the assumption that the game is time-independent. Recently, Feng et al. (2023) studied the last-iterate behaviors of optimistic and extra-gradient methods in games with a time-varying payoff matrix, and proved that in an unconstrained periodic game, the extra-gradient method converges to the equilibrium while the optimistic method diverges. This finding challenges the conventional wisdom that these two methods are expected to behave similarly, as they do in time-independent games. However, compared to unconstrained games, games with constraints are more common in both practical and theoretical studies. In this paper, we investigate the last-iterate behaviors of optimistic and extra-gradient methods in constrained periodic games, demonstrating that similar separation results for last-iterate convergence also hold in this setting.

Submitted 15 June, 2024; originally announced June 2024.

Comments: Accepted for UAI 2024

arXiv:2402.07797 [pdf, other]

Computing Nash Equilibria in Potential Games with Private Uncoupled Constraints

Authors: Nikolas Patris, Stelios Stavroulakis, Fivos Kalogiannis, Rose Zhang, Ioannis Panageas

Abstract: We consider the problem of computing Nash equilibria in potential games where each player's strategy set is subject to private uncoupled constraints. This scenario is frequently encountered in real-world applications like road network congestion games where individual drivers adhere to personal budget and fuel limitations. Despite the plethora of algorithms that efficiently compute Nash equilibria (NE) in potential games, the domain of constrained potential games remains largely unexplored. We introduce an algorithm that leverages the Lagrangian formulation of NE. The algorithm is implemented independently by each player and runs in polynomial time with respect to the approximation error, the sum of the size of the action-spaces, and the game's inherent parameters.

Submitted 12 February, 2024; originally announced February 2024.

Comments: Accepted to appear in AAAI 2024

arXiv:2401.09628 [pdf, ps, other]

Polynomial Convergence of Bandit No-Regret Dynamics in Congestion Games

Authors: Leello Dadi, Ioannis Panageas, Stratis Skoulakis, Luca Viano, Volkan Cevher

Abstract: We introduce an online learning algorithm in the bandit feedback model that, once adopted by all agents of a congestion game, results in game-dynamics that converge to an $ε$-approximate Nash Equilibrium in a polynomial number of rounds with respect to $1/ε$, the number of players and the number of available resources. The proposed algorithm also guarantees sublinear regret to any agent adopting it. As a result, our work answers an open question from arXiv:2206.01880 and extends the recent results of arXiv:2306.15543 to the bandit feedback model. We additionally establish that our online learning algorithm can be implemented in polynomial time for the important special case of Network Congestion Games on Directed Acyclic Graphs (DAG) by constructing an exact $1$-barycentric spanner for DAGs.

Submitted 17 January, 2024; originally announced January 2024.

arXiv:2312.12067 [pdf, other]

Optimistic Policy Gradient in Multi-Player Markov Games with a Single Controller: Convergence Beyond the Minty Property

Authors: Ioannis Anagnostides, Ioannis Panageas, Gabriele Farina, Tuomas Sandholm

Abstract: Policy gradient methods enjoy strong practical performance in numerous tasks in reinforcement learning. Their theoretical understanding in multiagent settings, however, remains limited, especially beyond two-player competitive and potential Markov games. In this paper, we develop a new framework to characterize optimistic policy gradient methods in multi-player Markov games with a single controller. Specifically, under the further assumption that the game exhibits an equilibrium collapse, in that the marginals of coarse correlated equilibria (CCE) induce Nash equilibria (NE), we show convergence to stationary $ε$-NE in $O(1/ε^2)$ iterations, where $O(\cdot)$ suppresses polynomial factors in the natural parameters of the game. Such an equilibrium collapse is well-known to manifest itself in two-player zero-sum Markov games, but also occurs even in a class of multi-player Markov games with separable interactions, as established by recent work. As a result, we bypass known complexity barriers for computing stationary NE when either of our assumptions fails. Our approach relies on a natural generalization of the classical Minty property that we introduce, which we anticipate to have further applications beyond Markov games.

Submitted 21 December, 2023; v1 submitted 19 December, 2023; originally announced December 2023.

Comments: To appear at AAAI 2024

arXiv:2310.02604 [pdf, other]

On the Last-iterate Convergence in Time-varying Zero-sum Games: Extra Gradient Succeeds where Optimism Fails

Authors: Yi Feng, Hu Fu, Qun Hu, ** Li, Ioannis Panageas, Bo Peng, Xiao Wang

Abstract: Last-iterate convergence has received extensive study in two-player zero-sum games, starting from bilinear and convex-concave games up to settings that satisfy the MVI condition. Typical methods that exhibit last-iterate convergence for the aforementioned games include extra-gradient (EG) and optimistic gradient descent ascent (OGDA). However, all the established last-iterate convergence results hold for the restrictive setting where the underlying repeated game does not change over time. Recently, a line of research has focused on regret analysis of OGDA in time-varying games, i.e., games where payoffs evolve with time; the last-iterate behavior of OGDA and EG in time-varying environments remains unclear though. In this paper, we study the last-iterate behavior of various algorithms in two types of unconstrained, time-varying, bilinear zero-sum games: periodic and convergent perturbed games. These models expand upon the usual repeated game formulation and incorporate external environmental factors, such as the seasonal effects on species competition and vanishing external noise. In periodic games, we prove that EG will converge while OGDA and the momentum method will diverge. This is quite surprising, as to the best of our knowledge, it is the first result that indicates EG and OGDA have qualitatively different last-iterate behaviors. In convergent perturbed games, we prove all these algorithms converge as long as the game itself stabilizes at a rate faster than $1/t$.

Submitted 4 October, 2023; originally announced October 2023.

Comments: 44 pages, accepted for NeurIPS 2023
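
As a rough illustration of the two update rules compared in this abstract, the following Python sketch runs extra-gradient (EG) and optimistic gradient descent ascent (OGDA) on the scalar bilinear game $f_t(x, y) = a_t x y$, whose equilibrium is $(0, 0)$. The function names, the step size, the zero initialization of OGDA's past gradients, and the `payoff` callback are illustrative assumptions, not the authors' code. The hypothetical `periodic_payoff` is included only so the time-varying setting can be explored; no outcome is claimed for it here.

```python
import math


def extra_gradient(payoff, x, y, eta, steps):
    """EG on min_x max_y a_t * x * y; payoff(t) returns the scalar a_t."""
    for t in range(steps):
        a = payoff(t)
        # Extrapolation (half) step using gradients at the current point.
        x_half = x - eta * a * y
        y_half = y + eta * a * x
        # Update step using gradients evaluated at the half point.
        x, y = x - eta * a * y_half, y + eta * a * x_half
    return x, y


def optimistic_gda(payoff, x, y, eta, steps):
    """OGDA: a doubled gradient step corrected by the previous gradient."""
    gx_prev, gy_prev = 0.0, 0.0  # assumed initialization of past gradients
    for t in range(steps):
        a = payoff(t)
        gx, gy = a * y, a * x  # df/dx and df/dy at the current point
        x, y = x - eta * (2 * gx - gx_prev), y + eta * (2 * gy - gy_prev)
        gx_prev, gy_prev = gx, gy
    return x, y


def static_payoff(t):
    """Time-independent game: a_t = 1 for all t."""
    return 1.0


def periodic_payoff(t, period=2):
    """Hypothetical periodic payoff, for experimentation only."""
    return 1.0 if t % period == 0 else -1.0


if __name__ == "__main__":
    for name, method in [("EG", extra_gradient), ("OGDA", optimistic_gda)]:
        x, y = method(static_payoff, 1.0, 1.0, 0.1, 2000)
        print(name, "distance to equilibrium:", math.hypot(x, y))
```

In the static game both iterates shrink toward the equilibrium for a small step size. Passing `periodic_payoff` instead of `static_payoff` is one way to experiment with a time-varying game.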

arXiv:2310.02387 [pdf, other]

Exponential Lower Bounds for Fictitious Play in Potential Games

Authors: Ioannis Panageas, Nikolas Patris, Stratis Skoulakis, Volkan Cevher

Abstract: Fictitious Play (FP) is a simple and natural dynamic for repeated play with many applications in game theory and multi-agent reinforcement learning. It was introduced by Brown (1949, 1951) and its convergence properties for two-player zero-sum games were established later by Robinson (1951). Potential games (Monderer and Shapley, 1996b) are another class of games which exhibit the FP property (Monderer and Shapley, 1996a), i.e., FP dynamics converge to a Nash equilibrium if all agents follow them. Nevertheless, except for two-player zero-sum games and for specific instances of payoff matrices (Abernethy et al., 2021) or for adversarial tie-breaking rules (Daskalakis and Pan, 2014), the convergence rate of FP is unknown. In this work, we focus on the rate of convergence of FP when applied to potential games and more specifically identical payoff games. We prove that FP can take exponential time (in the number of strategies) to reach a Nash equilibrium, even if the game is restricted to two agents and for arbitrary tie-breaking rules. To prove this, we recursively construct a two-player coordination game with a unique Nash equilibrium. Moreover, every approximate Nash equilibrium in the constructed game must be close to the pure Nash equilibrium in $\ell_1$-distance.

Submitted 3 October, 2023; originally announced October 2023.

Comments: Accepted to appear in NeurIPS 2023

arXiv:2306.15543 [pdf, other]

Semi Bandit Dynamics in Congestion Games: Convergence to Nash Equilibrium and No-Regret Guarantees

Authors: Ioannis Panageas, Stratis Skoulakis, Luca Viano, Xiao Wang, Volkan Cevher

Abstract: In this work, we introduce a new variant of online gradient descent, which provably converges to Nash Equilibria and simultaneously attains sublinear regret for the class of congestion games in the semi-bandit feedback setting. Our proposed method admits convergence rates depending only polynomially on the number of players and the number of facilities, but not on the size of the action set, which can be exponentially large in terms of the number of facilities. Moreover, the running time of our method has polynomial-time dependence on the implicit description of the game. As a result, our work answers an open question from (Du et al., 2022).

Submitted 27 June, 2023; originally announced June 2023.

Comments: ICML 2023

arXiv:2305.14329 [pdf, ps, other]

Zero-sum Polymatrix Markov Games: Equilibrium Collapse and Efficient Computation of Nash Equilibria

Authors: Fivos Kalogiannis, Ioannis Panageas

Abstract: The works of (Daskalakis et al., 2009, 2022; ** et al., 2022; Deng et al., 2023) indicate that computing Nash equilibria in multi-player Markov games is a computationally hard task. This fact raises the question of whether or not computational intractability can be circumvented if one focuses on specific classes of Markov games. One such example is two-player zero-sum Markov games, in which efficient ways to compute a Nash equilibrium are known. Inspired by zero-sum polymatrix normal-form games (Cai et al., 2016), we define a class of zero-sum multi-agent Markov games in which there are only pairwise interactions described by a graph that changes per state. For this class of Markov games, we show that an $ε$-approximate Nash equilibrium can be found efficiently. To do so, we generalize the techniques of (Cai et al., 2016), by showing that the set of coarse-correlated equilibria collapses to the set of Nash equilibria. Afterwards, it is possible to use any algorithm in the literature that computes approximate coarse-correlated equilibria Markovian policies to get an approximate Nash equilibrium.

Submitted 29 May, 2023; v1 submitted 23 May, 2023; originally announced May 2023.

Comments: Added missing proofs for the infinite-horizon

arXiv:2301.11241 [pdf, other]

On the Convergence of No-Regret Learning Dynamics in Time-Varying Games

Authors: Ioannis Anagnostides, Ioannis Panageas, Gabriele Farina, Tuomas Sandholm

Abstract: Most of the literature on learning in games has focused on the restrictive setting where the underlying repeated game does not change over time. Much less is known about the convergence of no-regret learning algorithms in dynamic multiagent settings. In this paper, we characterize the convergence of optimistic gradient descent (OGD) in time-varying games. Our framework yields sharp convergence bounds for the equilibrium gap of OGD in zero-sum games parameterized on natural variation measures of the sequence of games, subsuming known results for static games. Furthermore, we establish improved second-order variation bounds under strong convexity-concavity, as long as each game is repeated multiple times. Our results also apply to time-varying general-sum multi-player games via a bilinear formulation of correlated equilibria, which has novel implications for meta-learning and for obtaining refined variation-dependent regret bounds, addressing questions left open in prior papers. Finally, we leverage our framework to also provide new insights on dynamic regret guarantees in static games.

Submitted 18 October, 2023; v1 submitted 26 January, 2023; originally announced January 2023.

Comments: To appear at NeurIPS 2023; V3 incorporates reviewers' feedback and minor corrections

arXiv:2301.02129 [pdf, ps, other]

Algorithms and Complexity for Computing Nash Equilibria in Adversarial Team Games

Authors: Ioannis Anagnostides, Fivos Kalogiannis, Ioannis Panageas, Emmanouil-Vasileios Vlatakis-Gkaragkounis, Stephen McAleer

Abstract: Adversarial team games model multiplayer strategic interactions in which a team of identically-interested players is competing against an adversarial player in a zero-sum game. Such games capture many well-studied settings in game theory, such as congestion games, but go well beyond to environments wherein the cooperation of one team -- in the absence of explicit communication -- is obstructed by competing entities; the latter setting remains poorly understood despite its numerous applications. Since the seminal work of Von Stengel and Koller (GEB '97), different solution concepts have received attention from an algorithmic standpoint. Yet, the complexity of the standard Nash equilibrium has remained open. In this paper, we settle this question by showing that computing a Nash equilibrium in adversarial team games belongs to the class continuous local search (CLS), thereby establishing CLS-completeness by virtue of the recent CLS-hardness result of Rubinstein and Babichenko (STOC '21) in potential games. To do so, we leverage linear programming duality to prove that any $ε$-approximate stationary strategy for the team can be extended in polynomial time to an $O(ε)$-approximate Nash equilibrium, where the $O(\cdot)$ notation suppresses polynomial factors in the description of the game. As a consequence, we show that the Moreau envelope of a suitable best response function acts as a potential under certain natural gradient-based dynamics.

Submitted 30 May, 2023; v1 submitted 5 January, 2023; originally announced January 2023.

Comments: To appear at the conference on Economics and Computation (EC) 2023

arXiv:2210.05212 [pdf, other]

On Scrambling Phenomena for Randomly Initialized Recurrent Networks

Authors: Vaggos Chatziafratis, Ioannis Panageas, Clayton Sanford, Stelios Andrew Stavroulakis

Abstract: Recurrent Neural Networks (RNNs) frequently exhibit complicated dynamics, and their sensitivity to the initialization process often renders them notoriously hard to train. Recent works have shed light on such phenomena analyzing when exploding or vanishing gradients may occur, either of which is detrimental for training dynamics. In this paper, we point to a formal connection between RNNs and chaotic dynamical systems and prove a qualitatively stronger phenomenon about RNNs than what exploding gradients seem to suggest. Our main result proves that under standard initialization (e.g., He, Xavier etc.), RNNs will exhibit \textit{Li-Yorke chaos} with \textit{constant} probability \textit{independent} of the network's width. This explains the experimentally observed phenomenon of \textit{scrambling}, under which trajectories of nearby points may appear to be arbitrarily close during some timesteps, yet will be far away in future timesteps. In stark contrast to their feedforward counterparts, we show that chaotic behavior in RNNs is preserved under small perturbations and that their expressive power remains exponential in the number of feedback iterations. Our technical arguments rely on viewing RNNs as random walks under non-linear activations, and studying the existence of certain types of higher-order fixed points called \textit{periodic points} that lead to phase transitions from order to chaos.

Submitted 11 October, 2022; originally announced October 2022.

Comments: Accepted for publication, NeurIPS 2022

arXiv:2208.02204 [pdf, ps, other]

Efficiently Computing Nash Equilibria in Adversarial Team Markov Games

Authors: Fivos Kalogiannis, Ioannis Anagnostides, Ioannis Panageas, Emmanouil-Vasileios Vlatakis-Gkaragkounis, Vaggos Chatziafratis, Stelios Stavroulakis

Abstract: Computing Nash equilibrium policies is a central problem in multi-agent reinforcement learning that has received extensive attention both in theory and in practice. However, provable guarantees have been thus far either limited to fully competitive or cooperative scenarios or impose strong assumptions that are difficult to meet in most practical applications. In this work, we depart from those prior results by investigating infinite-horizon \emph{adversarial team Markov games}, a natural and well-motivated class of games in which a team of identically-interested players -- in the absence of any explicit coordination or communication -- is competing against an adversarial player. This setting allows for a unifying treatment of zero-sum Markov games and Markov potential games, and serves as a step to model more realistic strategic interactions that feature both competing and cooperative interests. Our main contribution is the first algorithm for computing stationary $ε$-approximate Nash equilibria in adversarial team Markov games with computational complexity that is polynomial in all the natural parameters of the game, as well as $1/ε$. The proposed algorithm is particularly natural and practical, and it is based on performing independent policy gradient steps for each player in the team, in tandem with best responses from the side of the adversary; in turn, the policy for the adversary is then obtained by solving a carefully constructed linear program. Our analysis leverages non-standard techniques to establish the KKT optimality conditions for a nonlinear program with nonconvex constraints, thereby leading to a natural interpretation of the induced Lagrange multipliers. Along the way, we significantly extend an important characterization of optimal policies in adversarial (normal-form) team games due to Von Stengel and Koller (GEB '97).

Submitted 3 August, 2022; originally announced August 2022.

arXiv:2204.11407 [pdf, other]

Accelerated Multiplicative Weights Update Avoids Saddle Points almost always

Authors: Yi Feng, Ioannis Panageas, Xiao Wang

Abstract: We consider non-convex optimization problems whose constraint set is a product of simplices. A commonly used algorithm for solving this type of problem is the Multiplicative Weights Update (MWU), an algorithm that is widely used in game theory, machine learning and multi-agent systems. Although it is known that MWU avoids saddle points, one question remains unaddressed: "Is there an accelerated version of MWU that provably avoids saddle points?" In this paper we provide a positive answer to the above question. We provide an accelerated MWU based on Riemannian Accelerated Gradient Descent, and prove that Riemannian Accelerated Gradient Descent, and thus the accelerated MWU, almost always avoids saddle points.

Submitted 24 April, 2022; originally announced April 2022.

Comments: 21 pages, 12 figures
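
For readers unfamiliar with the baseline algorithm named in this abstract, the following Python sketch shows a plain (non-accelerated) MWU iteration on a single probability simplex. It uses the exponential form of the update, which is an assumption here since the abstract does not specify the variant. It does not implement the accelerated Riemannian method of the paper. The names `mwu_step`, `run_mwu`, and the toy linear objective are illustrative only.

```python
import math


def mwu_step(x, grad, eta):
    """One exponential MWU step for minimization on the probability simplex."""
    # Reweight each coordinate by exp(-eta * partial derivative) ...
    weights = [xi * math.exp(-eta * gi) for xi, gi in zip(x, grad)]
    # ... then renormalize so the iterate stays on the simplex.
    total = sum(weights)
    return [w / total for w in weights]


def run_mwu(grad_fn, x0, eta, steps):
    """Iterate MWU from x0; grad_fn(x) returns the gradient of f at x."""
    x = list(x0)
    for _ in range(steps):
        x = mwu_step(x, grad_fn(x), eta)
    return x


# Toy objective (an assumption for illustration): f(x) = <COSTS, x>.
COSTS = [3.0, 1.0, 2.0]


def linear_grad(x):
    """Gradient of the toy linear objective; constant in x."""
    return COSTS


if __name__ == "__main__":
    x = run_mwu(linear_grad, [1 / 3, 1 / 3, 1 / 3], eta=0.5, steps=100)
    print(x)  # mass concentrates on the cheapest coordinate (index 1)
```

For a product of simplices, the same step would be applied independently to each block of coordinates.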

arXiv:2203.12074 [pdf, other]

Optimistic Mirror Descent Either Converges to Nash or to Strong Coarse Correlated Equilibria in Bimatrix Games

Authors: Ioannis Anagnostides, Gabriele Farina, Ioannis Panageas, Tuomas Sandholm

Abstract: We show that, for any sufficiently small fixed $ε > 0$, when both players in a general-sum two-player (bimatrix) game employ optimistic mirror descent (OMD) with smooth regularization, learning rate $η = O(ε^2)$ and $T = Ω(\text{poly}(1/ε))$ repetitions, either the dynamics reach an $ε$-approximate Nash equilibrium (NE), or the average correlated distribution of play is an $Ω(\text{poly}(ε))$-strong coarse correlated equilibrium (CCE): any possible unilateral deviation not only leaves the player worse off, but decreases its utility by $Ω(\text{poly}(ε))$. As an immediate consequence, when the iterates of OMD are bounded away from being Nash equilibria in a bimatrix game, we guarantee convergence to an exact CCE after only $O(1)$ iterations. Our results reveal that uncoupled no-regret learning algorithms can converge to CCE in general-sum games remarkably faster than to NE in, for example, zero-sum games. To establish this, we show that when OMD does not reach arbitrarily close to a NE, the (cumulative) regret of both players is not only negative, but decays linearly with time. Given that regret is the canonical measure of performance in online learning, our results suggest that cycling behavior of no-regret learning algorithms in games can be justified in terms of efficiency.

Submitted 6 October, 2022; v1 submitted 22 March, 2022; originally announced March 2022.

Comments: To appear at NeurIPS 2022. V2 incorporates reviewers' feedback

arXiv:2203.12056 [pdf, other]

On Last-Iterate Convergence Beyond Zero-Sum Games

Authors: Ioannis Anagnostides, Ioannis Panageas, Gabriele Farina, Tuomas Sandholm

Abstract: Most existing results about \emph{last-iterate convergence} of learning dynamics are limited to two-player zero-sum games, and only apply under rigid assumptions about what dynamics the players follow. In this paper we provide new results and techniques that apply to broader families of games and learning dynamics. First, we use a regret-based analysis to show that in a class of games that includes constant-sum polymatrix and strategically zero-sum games, dynamics such as \emph{optimistic mirror descent (OMD)} have \emph{bounded second-order path lengths}, a property which holds even when players employ different algorithms and prediction mechanisms. This enables us to obtain $O(1/\sqrt{T})$ rates and optimal $O(1)$ regret bounds. Our analysis also reveals a surprising property: OMD either reaches arbitrarily close to a Nash equilibrium, or it outperforms the \emph{robust price of anarchy} in efficiency. Moreover, for potential games we establish convergence to an $ε$-equilibrium after $O(1/ε^2)$ iterations for mirror descent under a broad class of regularizers, as well as optimal $O(1)$ regret bounds for OMD variants. Our framework also extends to near-potential games, and unifies known analyses for distributed learning in Fisher's market model. Finally, we analyze the convergence, efficiency, and robustness of \emph{optimistic gradient descent (OGD)} in general-sum continuous games.

Submitted 22 March, 2022; originally announced March 2022.

arXiv:2111.04178 [pdf, other]

Towards convergence to Nash equilibria in two-team zero-sum games

Authors: Fivos Kalogiannis, Ioannis Panageas, Emmanouil-Vasileios Vlatakis-Gkaragkounis

Abstract: Contemporary applications of machine learning in two-team e-sports and the superior expressivity of multi-agent generative adversarial networks raise important and overlooked theoretical questions regarding optimization in two-team games. Formally, two-team zero-sum games are defined as multi-player games where players are split into two competing sets of agents, each experiencing a utility identical to that of their teammates and opposite to that of the opposing team. We focus on the solution concept of Nash equilibria (NE). We first show that computing NE for this class of games is $\textit{hard}$ for the complexity class ${\mathrm{CLS}}$. To further examine the capabilities of online learning algorithms in games with full-information feedback, we propose a benchmark of a simple -- yet nontrivial -- family of such games. These games do not enjoy the properties used to prove convergence for relevant algorithms. In particular, we use a dynamical systems perspective to demonstrate that gradient descent-ascent, its optimistic variant, optimistic multiplicative weights update, and extra gradient fail to converge (even locally) to a Nash equilibrium. On a brighter note, we propose a first-order method that leverages control theory techniques and under some conditions enjoys last-iterate local convergence to a Nash equilibrium. We also believe our proposed method is of independent interest for general min-max optimization.

Submitted 16 April, 2023; v1 submitted 7 November, 2021; originally announced November 2021.

Comments: Paper accepted in ICLR 2023

arXiv:2110.10614 [pdf, other]

Independent Natural Policy Gradient Always Converges in Markov Potential Games

Authors: Roy Fox, Stephen McAleer, Will Overman, Ioannis Panageas

Abstract: Multi-agent reinforcement learning has been successfully applied to fully-cooperative and fully-competitive environments, but little is currently known about mixed cooperative/competitive environments. In this paper, we focus on a particular class of multi-agent mixed cooperative/competitive stochastic games called Markov Potential Games (MPGs), which include cooperative games as a special case. Recent results have shown that independent policy gradient converges in MPGs but it was not known whether Independent Natural Policy Gradient converges in MPGs as well. We prove that Independent Natural Policy Gradient always converges in the last iterate using constant learning rates. The proof deviates from the existing approaches and the main challenge lies in the fact that Markov Potential Games do not have unique optimal values (as single-agent settings exhibit) so different initializations can lead to different limit point values. We complement our theoretical results with experiments that indicate that Natural Policy Gradient outperforms Policy Gradient in routing games and congestion games.

Submitted 20 October, 2021; originally announced October 2021.

Comments: 24 pages

arXiv:2106.02024 [pdf, other]

Combinatorial Algorithms for Matching Markets via Nash Bargaining: One-Sided, Two-Sided and Non-Bipartite

Authors: Ioannis Panageas, Thorben Tröbst, Vijay V. Vazirani

Abstract: In the area of matching-based market design, existing models using cardinal utilities suffer from two deficiencies, which restrict applicability: First, the Hylland-Zeckhauser (HZ) mechanism, which has remained a classic in economics for one-sided matching markets, is intractable; computation of even an approximate equilibrium is PPAD-complete [Vazirani, Yannakakis 2021], [Chen et al 2022]. Second, there is an extreme paucity of such models. This led [Hosseini and Vazirani 2021] to define a rich collection of Nash-bargaining-based models for one-sided and two-sided matching markets, in both Fisher and Arrow-Debreu settings, together with implementations using available solvers and very encouraging experimental results. [Hosseini and Vazirani 2021] raised the question of finding efficient combinatorial algorithms, with proven running times, for these models. In this paper, we address this question by giving algorithms based on the techniques of multiplicative weights update (MWU) and conditional gradient descent (CGD). Additionally, we make the following conceptual contributions to the proposal of [Hosseini and Vazirani 2021] in order to set it on a more firm foundation: 1) We establish a connection between HZ and Nash-bargaining-based models via the celebrated Eisenberg-Gale convex program, thereby providing a theoretical ratification. 2) Whereas HZ satisfies envy-freeness, due to the presence of demand constraints, the Nash-bargaining-based models do not. We rectify this to the extent possible by showing that these models satisfy approximate equal-share fairness notions. 3) We define, for the first time, a model for non-bipartite matching markets under cardinal utilities. It is also Nash-bargaining-based and we solve it using CGD.

Submitted 3 August, 2022; v1 submitted 3 June, 2021; originally announced June 2021.

Comments: 53 pages. arXiv admin note: text overlap with arXiv:2105.10704

ACM Class: F.2

arXiv:2106.01969 [pdf, other]

Global Convergence of Multi-Agent Policy Gradient in Markov Potential Games

Authors: Stefanos Leonardos, Will Overman, Ioannis Panageas, Georgios Piliouras

Abstract: Potential games are arguably one of the most important and widely studied classes of normal form games. They define the archetypal setting of multi-agent coordination as all agent utilities are perfectly aligned with each other via a common potential function. Can this intuitive framework be transplanted in the setting of Markov Games? What are the similarities and differences between multi-agent coordination with and without state dependence? We present a novel definition of Markov Potential Games (MPG) that generalizes prior attempts at capturing complex stateful multi-agent coordination. Counter-intuitively, insights from normal-form potential games do not carry over as MPGs can consist of settings where state-games can be zero-sum games. In the opposite direction, Markov games where every state-game is a potential game are not necessarily MPGs. Nevertheless, MPGs showcase standard desirable properties such as the existence of deterministic Nash policies. In our main technical result, we prove fast convergence of independent policy gradient to Nash policies by adapting recent gradient dominance property arguments developed for single agent MDPs to multi-agent learning settings.

Submitted 28 September, 2021; v1 submitted 3 June, 2021; originally announced June 2021.

Comments: Fixed typos and a minor error in Proposition 3.2, condition C2

arXiv:2010.05263 [pdf, other]

Fast Convergence of Langevin Dynamics on Manifold: Geodesics meet Log-Sobolev

Authors: Xiao Wang, Qi Lei, Ioannis Panageas

Abstract: Sampling is a fundamental and arguably very important task with numerous applications in Machine Learning. One approach to sample from a high dimensional distribution $e^{-f}$ for some function $f$ is the Langevin Algorithm (LA). Recently, there has been a lot of progress in showing fast convergence of LA even in cases where $f$ is non-convex, notably [53], [39] in which the former paper focuses on functions $f$ defined in $\mathbb{R}^n$ and the latter paper focuses on functions with symmetries (like matrix completion type objectives) with manifold structure. Our work generalizes the results of [53] where $f$ is defined on a manifold $M$ rather than $\mathbb{R}^n$. From a technical point of view, we show that KL decreases at a geometric rate whenever the distribution $e^{-f}$ satisfies a log-Sobolev inequality on $M$.

Submitted 6 December, 2020; v1 submitted 11 October, 2020; originally announced October 2020.

arXiv:2006.09735 [pdf, other]

Efficient Statistics for Sparse Graphical Models from Truncated Samples

Authors: Arnab Bhattacharyya, Rathin Desai, Sai Ganesh Nagarajan, Ioannis Panageas

Abstract: In this paper, we study high-dimensional estimation from truncated samples. We focus on two fundamental and classical problems: (i) inference of sparse Gaussian graphical models and (ii) support recovery of sparse linear models. (i) For Gaussian graphical models, suppose $d$-dimensional samples ${\bf x}$ are generated from a Gaussian $N(μ,Σ)$ and observed only if they belong to a subset $S \subseteq \mathbb{R}^d$. We show that $μ$ and $Σ$ can be estimated with error $ε$ in the Frobenius norm, using $\tilde{O}\left(\frac{\textrm{nz}(Σ^{-1})}{ε^2}\right)$ samples from a truncated $\mathcal{N}(μ,Σ)$ and having access to a membership oracle for $S$. The set $S$ is assumed to have non-trivial measure under the unknown distribution but is otherwise arbitrary. (ii) For sparse linear regression, suppose samples $({\bf x},y)$ are generated where $y = {\bf x}^\top{Ω^*} + \mathcal{N}(0,1)$ and $({\bf x}, y)$ is seen only if $y$ belongs to a truncation set $S \subseteq \mathbb{R}$. We consider the case that $Ω^*$ is sparse with a support set of size $k$. Our main result is to establish precise conditions on the problem dimension $d$, the support size $k$, the number of observations $n$, and properties of the samples and the truncation that are sufficient to recover the support of $Ω^*$. Specifically, we show that under some mild assumptions, only $O(k^2 \log d)$ samples are needed to estimate $Ω^*$ in the $\ell_\infty$-norm up to a bounded error. For both problems, our estimator minimizes the sum of the finite population negative log-likelihood function and an $\ell_1$-regularization term.

Submitted 17 June, 2020; originally announced June 2020.

arXiv:2003.08259 [pdf, ps, other]

Logistic-Regression with peer-group effects via inference in higher order Ising models

Authors: Constantinos Daskalakis, Nishanth Dikkala, Ioannis Panageas

Abstract: Spin glass models, such as the Sherrington-Kirkpatrick, Hopfield and Ising models, are all well-studied members of the exponential family of discrete distributions, and have been influential in a number of application domains where they are used to model correlation phenomena on networks. Conventionally these models have quadratic sufficient statistics and consequently capture correlations arising from pairwise interactions. In this work we study extensions of these to models with higher-order sufficient statistics, modeling behavior on a social network with peer-group effects. In particular, we model binary outcomes on a network as a higher-order spin glass, where the behavior of an individual depends on a linear function of their own vector of covariates and some polynomial function of the behavior of others, capturing peer-group effects. Using a {\em single}, high-dimensional sample from such a model, our goal is to recover the coefficients of the linear function as well as the strength of the peer-group effects. The heart of our result is a novel approach for showing strong concavity of the log pseudo-likelihood of the model, implying a statistical error rate of $\sqrt{d/n}$ for the Maximum Pseudo-Likelihood Estimator (MPLE), where $d$ is the dimensionality of the covariate vectors and $n$ is the size of the network (number of nodes). Our model generalizes vanilla logistic regression as well as the peer-effect models studied in recent works, and our results extend these results to accommodate higher-order interactions.

Submitted 18 March, 2020; originally announced March 2020.

Comments: 16 pages

arXiv:2003.00777 [pdf, other]

Better Depth-Width Trade-offs for Neural Networks through the lens of Dynamical Systems

Authors: Vaggos Chatziafratis, Sai Ganesh Nagarajan, Ioannis Panageas

Abstract: The expressivity of neural networks as a function of their depth, width and type of activation units has been an important question in deep learning theory. Recently, depth separation results for ReLU networks were obtained via a new connection with dynamical systems, using a generalized notion of fixed points of a continuous map $f$, called periodic points. In this work, we strengthen the connection with dynamical systems and we improve the existing width lower bounds along several aspects. Our first main result is period-specific width lower bounds that hold under the stronger notion of $L^1$-approximation error, instead of the weaker classification error. Our second contribution is that we provide sharper width lower bounds, still yielding meaningful exponential depth-width separations, in regimes where previous results wouldn't apply. A byproduct of our results is that there exists a universal constant characterizing the depth-width trade-offs, as long as $f$ has odd periods. Technically, our results follow by unveiling a tighter connection between the following three quantities of a given function: its period, its Lipschitz constant and the growth rate of the number of oscillations arising under compositions of the function $f$ with itself.

Submitted 20 July, 2020; v1 submitted 2 March, 2020; originally announced March 2020.

Comments: Appeared in ICML 2020

arXiv:2002.11323 [pdf, other]

Convergence to Second-Order Stationarity for Non-negative Matrix Factorization: Provably and Concurrently

Authors: Ioannis Panageas, Stratis Skoulakis, Antonios Varvitsiotis, Xiao Wang

Abstract: Non-negative matrix factorization (NMF) is a fundamental non-convex optimization problem with numerous applications in Machine Learning (music analysis, document clustering, speech-source separation, etc.). Despite having received extensive study, it is poorly understood whether or not there exist natural algorithms that can provably converge to a local minimum. Part of the reason is that the objective is heavily symmetric and its gradient is not Lipschitz. In this paper we define a multiplicative weight update type dynamics (modification of the seminal Lee-Seung algorithm) that runs concurrently and provably avoids saddle points (first order stationary points that are not second order). Our techniques combine tools from dynamical systems such as stability and exploit the geometry of the NMF objective by reducing the standard NMF formulation over the non-negative orthant to a new formulation over (a scaled) simplex. An important advantage of our method is the use of concurrent updates, which permits implementations in parallel computing environments.

Submitted 19 March, 2020; v1 submitted 26 February, 2020; originally announced February 2020.

arXiv:2002.06768 [pdf, other]

Last iterate convergence in no-regret learning: constrained min-max optimization for convex-concave landscapes

Authors: Qi Lei, Sai Ganesh Nagarajan, Ioannis Panageas, Xiao Wang

Abstract: In a recent series of papers it has been established that variants of Gradient Descent/Ascent and Mirror Descent exhibit last iterate convergence in convex-concave zero-sum games. Specifically, \cite{DISZ17, LiangS18} show last iterate convergence of the so-called "Optimistic Gradient Descent/Ascent" for the case of \textit{unconstrained} min-max optimization. Moreover, in \cite{Metal} the authors show that Mirror Descent with an extra gradient step displays last iterate convergence for convex-concave problems (both constrained and unconstrained), though their algorithm does not follow the online learning framework; it uses extra information rather than \textit{only} the history to compute the next iteration. In this work, we show that "Optimistic Multiplicative-Weights Update (OMWU)", which follows the no-regret online learning framework, exhibits last iterate convergence locally for convex-concave games, generalizing the results of \cite{DP19} where last iterate convergence of OMWU was shown only for the \textit{bilinear case}. We complement our results with experiments that indicate fast convergence of the method.

Submitted 21 February, 2020; v1 submitted 16 February, 2020; originally announced February 2020.

arXiv:1912.04378 [pdf, ps, other]

Depth-Width Trade-offs for ReLU Networks via Sharkovsky's Theorem

Authors: Vaggos Chatziafratis, Sai Ganesh Nagarajan, Ioannis Panageas, Xiao Wang

Abstract: Understanding the representational power of Deep Neural Networks (DNNs) and how their structural properties (e.g., depth, width, type of activation unit) affect the functions they can compute, has been an important yet challenging question in deep learning and approximation theory. In a seminal paper, Telgarsky highlighted the benefits of depth by presenting a family of functions (based on simple triangular waves) for which DNNs achieve zero classification error, whereas shallow networks with fewer than exponentially many nodes incur constant error. Even though Telgarsky's work reveals the limitations of shallow neural networks, it does not inform us on why these functions are difficult to represent and in fact he states it as a tantalizing open question to characterize those functions that cannot be well-approximated by smaller depths. In this work, we point to a new connection between DNN expressivity and Sharkovsky's Theorem from dynamical systems, that enables us to characterize the depth-width trade-offs of ReLU networks for representing functions based on the presence of a generalized notion of fixed points, called periodic points (a fixed point is a point of period 1). Motivated by our observation that the triangle waves used in Telgarsky's work contain points of period 3 - a period that is special in that it implies chaotic behavior based on the celebrated result by Li-Yorke - we proceed to give general lower bounds for the width needed to represent periodic functions as a function of the depth. Technically, the crux of our approach is based on an eigenvalue analysis of the dynamical system associated with such functions.

Submitted 9 December, 2019; originally announced December 2019.
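
The period-3 observation in the abstract above can be checked by hand on a small example. The sketch below is only an illustration and is not taken from the paper: it assumes the triangle wave is the standard tent map on [0, 1], and the helper name `tent` and the starting point 2/7 are choices made here. Exact rational arithmetic avoids floating-point drift.

```python
from fractions import Fraction

def tent(x):
    """Tent map on [0, 1] (assumed form of the triangle wave): rises to 1 at x = 1/2, then falls."""
    half = Fraction(1, 2)
    return 2 * x if x <= half else 2 * (1 - x)

# Follow the orbit of 2/7 for three applications of the map.
orbit = [Fraction(2, 7)]
for _ in range(3):
    orbit.append(tent(orbit[-1]))

# The orbit visits 2/7, 4/7, 6/7 and then returns to 2/7,
# so 2/7 is a point of period 3 for this map.
print([str(v) for v in orbit])
```

The three visited points are distinct, so the starting point is not a fixed point of the map or of its second iterate.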

arXiv:1905.03353 [pdf, ps, other]

Regression from Dependent Observations

Authors: Constantinos Daskalakis, Nishanth Dikkala, Ioannis Panageas

Abstract: The standard linear and logistic regression models assume that the response variables are independent, but share the same linear relationship to their corresponding vectors of covariates. The assumption that the response variables are independent is, however, too strong. In many applications, these responses are collected on nodes of a network, or some spatial or temporal domain, and are dependent. Examples abound in financial and meteorological applications, and dependencies naturally arise in social networks through peer effects. Regression with dependent responses has thus received a lot of attention in the Statistics and Economics literature, but there are no strong consistency results unless multiple independent samples of the vectors of dependent responses can be collected from these models. We present computationally and statistically efficient methods for linear and logistic regression models when the response variables are dependent on a network. Given one sample from a networked linear or logistic regression model and under mild assumptions, we prove strong consistency results for recovering the vector of coefficients and the strength of the dependencies, recovering the rates of standard regression under independent observations. We use projected gradient descent on the negative log-likelihood, or negative log-pseudolikelihood, and establish their strong convexity and consistency using concentration of measure for dependent random variables.

Submitted 8 October, 2019; v1 submitted 8 May, 2019; originally announced May 2019.

Comments: 33 pages, in proceedings of STOC 2019

arXiv:1902.06958 [pdf, other]

On the Analysis of EM for truncated mixtures of two Gaussians

Authors: Sai Ganesh Nagarajan, Ioannis Panageas

Abstract: Motivated by a recent result of Daskalakis et al. 2018, we analyze the population version of Expectation-Maximization (EM) algorithm for the case of \textit{truncated} mixtures of two Gaussians. Truncated samples from a $d$-dimensional mixture of two Gaussians $\frac{1}{2} \mathcal{N}(\vecμ, \vecΣ)+ \frac{1}{2} \mathcal{N}(-\vecμ, \vecΣ)$ means that a sample is only revealed if it falls in some subset $S \subset \mathbb{R}^d$ of positive (Lebesgue) measure. We show that for $d=1$, EM converges almost surely (under random initialization) to the true mean (variance $σ^2$ is known) for any measurable set $S$. Moreover, for $d>1$ we show EM almost surely converges to the true mean for any measurable set $S$ when the map of EM has only three fixed points, namely $-\vecμ, \vec{0}, \vecμ$ (covariance matrix $\vecΣ$ is known), and prove local convergence if there are more than three fixed points. We also provide convergence rates of our findings. Our techniques deviate from those of Daskalakis et al. 2017, which heavily depend on symmetry that the untruncated problem exhibits. For example, for an arbitrary measurable set $S$, it is impossible to compute a closed form of the update rule of EM. Moreover, arbitrarily truncating the mixture induces further correlations among the variables. We circumvent these challenges by using techniques from dynamical systems, probability and statistics: the implicit function theorem, stability analysis around the fixed points of the update rule of EM, and correlation inequalities (FKG).

Submitted 9 May, 2020; v1 submitted 19 February, 2019; originally announced February 2019.

Comments: Appeared in ALT 2020. Last version fixes statement about rates for single dimensional case

arXiv:1807.04252 [pdf, other]

Last-Iterate Convergence: Zero-Sum Games and Constrained Min-Max Optimization

Authors: Constantinos Daskalakis, Ioannis Panageas

Abstract: Motivated by applications in Game Theory, Optimization, and Generative Adversarial Networks, recent work of Daskalakis et al \cite{DISZ17} and follow-up work of Liang and Stokes \cite{LiangS18} have established that a variant of the widely used Gradient Descent/Ascent procedure, called "Optimistic Gradient Descent/Ascent (OGDA)", exhibits last-iterate convergence to saddle points in {\em unconstrained} convex-concave min-max optimization problems. We show that the same holds true in the more general problem of {\em constrained} min-max optimization under a variant of the no-regret Multiplicative-Weights-Update method called "Optimistic Multiplicative-Weights Update (OMWU)". This answers an open question of Syrgkanis et al \cite{SALS15}. The proof of our result requires fundamentally different techniques from those that exist in no-regret learning literature and the aforementioned papers. We show that OMWU monotonically improves the Kullback-Leibler divergence of the current iterate to the (appropriately normalized) min-max solution until it enters a neighborhood of the solution. Inside that neighborhood we show that OMWU is locally (asymptotically) stable converging to the exact solution. We believe that our techniques will be useful in the analysis of the last iterate of other learning algorithms.

Submitted 2 December, 2020; v1 submitted 11 July, 2018; originally announced July 2018.

Comments: Appeared in ITCS 2019

arXiv:1807.03907 [pdf, other]

The Limit Points of (Optimistic) Gradient Descent in Min-Max Optimization

Authors: Constantinos Daskalakis, Ioannis Panageas

Abstract: Motivated by applications in Optimization, Game Theory, and the training of Generative Adversarial Networks, the convergence properties of first order methods in min-max problems have received extensive study. It has been recognized that they may cycle, and there is no good understanding of their limit points when they do not. When they converge, do they converge to local min-max solutions? We characterize the limit points of two basic first order methods, namely Gradient Descent/Ascent (GDA) and Optimistic Gradient Descent Ascent (OGDA). We show that both dynamics avoid unstable critical points for almost all initializations. Moreover, for small step sizes and under mild assumptions, the set of OGDA-stable critical points is a superset of GDA-stable critical points, which is a superset of local min-max solutions (strict in some cases). The connecting thread is that the behavior of these dynamics can be studied from a dynamical systems perspective.

Submitted 10 July, 2018; originally announced July 2018.
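
The contrast between GDA and OGDA mentioned in the abstract above can be seen on a toy problem. The sketch below is an illustration under assumptions made here, not the paper's setup: the objective is the bilinear function f(x, y) = x·y (minimized over x, maximized over y), the step size 0.1 and the starting point (1, 1) are arbitrary, and the function names are hypothetical. OGDA is written in its common form, which uses twice the current gradient minus the previous one.

```python
import math

def gda(x, y, eta, steps):
    """Simultaneous gradient descent/ascent on f(x, y) = x * y."""
    for _ in range(steps):
        # grad_x f = y (descent step), grad_y f = x (ascent step)
        x, y = x - eta * y, y + eta * x
    return x, y

def ogda(x, y, eta, steps):
    """Optimistic GDA on f(x, y) = x * y: step along 2 * current gradient - previous gradient."""
    x_prev, y_prev = x, y  # treat the previous iterate as equal to the start
    for _ in range(steps):
        x_next = x - 2 * eta * y + eta * y_prev
        y_next = y + 2 * eta * x - eta * x_prev
        x_prev, y_prev, x, y = x, y, x_next, y_next
    return x, y

gda_norm = math.hypot(*gda(1.0, 1.0, eta=0.1, steps=5000))
ogda_norm = math.hypot(*ogda(1.0, 1.0, eta=0.1, steps=5000))
print(f"GDA distance from (0, 0):  {gda_norm:.3e}")
print(f"OGDA distance from (0, 0): {ogda_norm:.3e}")
```

On this example each GDA step multiplies the distance to the saddle point (0, 0) by sqrt(1 + eta^2), so the iterates spiral outward, while the OGDA iterates contract toward (0, 0) for this step size.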

arXiv:1710.11249 [pdf, other]

Rock-Paper-Scissors, Differential Games and Biological Diversity

Authors: Tung Mai, Ioannis Panageas, Will Ratcliff, Vijay V. Vazirani, Peter Yunker

Abstract: We model a situation in which a collection of species derive their fitnesses via a rock-paper-scissors-type game; however, the precise payoffs are a function of the environment. The new aspect of our model lies in adding a feedback loop: the environment changes according to the relative fitnesses of the species; in particular, it gives a boost to the species having small populations. We cast our model in the setting of a differential game and we show that for a certain setting of parameters, this dynamics cycles. Our model is a natural one, since depletion of resources used by more frequent species will shift the payoff matrix towards favoring less frequent ones. Since the dynamics cycles, no species goes extinct and diversity is maintained.

Submitted 30 October, 2017; originally announced October 2017.

arXiv:1710.07406 [pdf, ps, other]

First-order Methods Almost Always Avoid Saddle Points

Authors: Jason D. Lee, Ioannis Panageas, Georgios Piliouras, Max Simchowitz, Michael I. Jordan, Benjamin Recht

Abstract: We establish that first-order methods avoid saddle points for almost all initializations. Our results apply to a wide variety of first-order methods, including gradient descent, block coordinate descent, mirror descent and variants thereof. The connecting thread is that such algorithms can be studied from a dynamical systems perspective in which appropriate instantiations of the Stable Manifold Theorem allow for a global stability analysis. Thus, neither access to second-order derivative information nor randomness beyond initialization is necessary to provably avoid saddle points.

Submitted 19 October, 2017; originally announced October 2017.

arXiv:1703.01138 [pdf, other]

Multiplicative Weights Update with Constant Step-Size in Congestion Games: Convergence, Limit Cycles and Chaos

Authors: Gerasimos Palaiopanos, Ioannis Panageas, Georgios Piliouras

Abstract: The Multiplicative Weights Update (MWU) method is a ubiquitous meta-algorithm that works as follows: A distribution is maintained on a certain set, and at each step the probability assigned to element $γ$ is multiplied by $(1 -εC(γ))>0$ where $C(γ)$ is the "cost" of element $γ$ and then rescaled to ensure that the new values form a distribution. We analyze MWU in congestion games where agents use \textit{arbitrary admissible constants} as learning rates $ε$ and prove convergence to \textit{exact Nash equilibria}. Our proof leverages a novel connection between MWU and the Baum-Welch algorithm, the standard instantiation of the Expectation-Maximization (EM) algorithm for hidden Markov models (HMM). Interestingly, this convergence result does not carry over to the nearly homologous MWU variant where at each step the probability assigned to element $γ$ is multiplied by $(1 -ε)^{C(γ)}$ even for the most innocuous case of two-agent, two-strategy load balancing games, where such dynamics can provably lead to limit cycles or even chaotic behavior.

Submitted 3 March, 2017; originally announced March 2017.

Comments: 17 pages, 9 figures
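
The MWU step described in the abstract above (multiply each probability by $1 - εC(γ)$, then rescale) can be sketched for a single agent facing a fixed cost vector. This is a minimal illustration, not the congestion-game dynamics analyzed in the paper: the function name `mwu_linear_step`, the cost values, and the step size are assumptions made here.

```python
import numpy as np

def mwu_linear_step(p, cost, eps):
    """One linear-variant MWU step: scale p[i] by (1 - eps * cost[i]), then renormalize."""
    factors = 1.0 - eps * cost
    if np.any(factors <= 0):
        # The update requires 1 - eps * C > 0 for every element.
        raise ValueError("eps is too large for these costs")
    weights = p * factors
    return weights / weights.sum()

# Hypothetical example: three elements with fixed costs 1.0, 0.5 and 0.0.
p = np.array([0.25, 0.25, 0.5])
cost = np.array([1.0, 0.5, 0.0])
for _ in range(50):
    p = mwu_linear_step(p, cost, eps=0.1)

print(p)  # probability mass shifts toward the zero-cost element
```

The other variant mentioned in the abstract would replace `factors` with `(1.0 - eps) ** cost`; the abstract reports that this seemingly small change can lead to limit cycles or chaos in games.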

arXiv:1607.03881 [pdf, other]

Opinion Dynamics in Networks: Convergence, Stability and Lack of Explosion

Authors: Tung Mai, Ioannis Panageas, Vijay V. Vazirani

Abstract: Inspired by the work of [Kempe, Kleinberg, Oren, Slivkins, EC13] we introduce and analyze a model on opinion formation; the update rule of our dynamics is a simplified version of that of Kempe et al. We assume that the population is partitioned into types whose interaction pattern is specified by a graph. Interaction leads to population mass moving from types of smaller mass to those of bigger mass. We show that starting uniformly at random over all population vectors on the simplex, our dynamics converges point-wise with probability one to an independent set. This settles an open problem of Kempe et al., as applicable to our dynamics. We believe that our techniques can be used to settle the open problem for the Kempe et al. dynamics as well. Next, we extend the model of Kempe et al. by introducing the notion of birth and death of types, with the interaction graph evolving appropriately. Birth of types is determined by a Bernoulli process and types die when their population mass is less than a parameter $ε$. We show that if the births are infrequent, then there are long periods of "stability" in which there is no population mass that moves. Finally we show that even if births are frequent and "stability" is not attained, the total number of types does not explode: it remains logarithmic in $1/ε$.

Submitted 27 April, 2017; v1 submitted 13 July, 2016; originally announced July 2016.

Comments: 24 pages, 3 figures. Preliminary version in ICALP 2017

arXiv:1605.00405 [pdf, other]

Gradient Descent Only Converges to Minimizers: Non-Isolated Critical Points and Invariant Regions

Authors: Ioannis Panageas, Georgios Piliouras

Abstract: Given a non-convex twice differentiable cost function f, we prove that the set of initial conditions so that gradient descent converges to saddle points where \nabla^2 f has at least one strictly negative eigenvalue has (Lebesgue) measure zero, even for cost functions f with non-isolated critical points, answering an open question in [Lee, Simchowitz, Jordan, Recht, COLT2016]. Moreover, this result extends to forward-invariant convex subspaces, allowing for weak (non-globally Lipschitz) smoothness assumptions. Finally, we produce an upper bound on the allowable step-size.

Submitted 7 June, 2016; v1 submitted 2 May, 2016; originally announced May 2016.

Comments: 2 figures
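
A small numerical illustration of the saddle-avoidance statement in the abstract above, offered as a sketch rather than anything taken from the paper. The test function f(x, y) = x^2/2 + y^4/4 - y^2/2, the step size 0.1, and the names `grad_f` and `gradient_descent` are all assumptions chosen for the example. This f has a saddle at (0, 0), where the Hessian is diag(1, -1), and minimizers at (0, 1) and (0, -1).

```python
def grad_f(x, y):
    """Gradient of the assumed test function f(x, y) = x**2/2 + y**4/4 - y**2/2."""
    return x, y ** 3 - y


def gradient_descent(x, y, step=0.1, iters=500):
    """Plain gradient descent with a fixed step size (assumed small enough here)."""
    for _ in range(iters):
        gx, gy = grad_f(x, y)
        x, y = x - step * gx, y - step * gy
    return x, y


if __name__ == "__main__":
    # A tiny perturbation off the line y = 0 is enough to escape the saddle.
    print(gradient_descent(0.5, 1e-3))
    # Starting exactly on the line y = 0 (a measure-zero set) leads to the saddle.
    print(gradient_descent(0.5, 0.0))
```

In this toy example the set of initial points attracted to the saddle is the line y = 0, which has Lebesgue measure zero in the plane; every other starting point tried here moves to one of the two minimizers.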

arXiv:1511.01409 [pdf, other]

Mutation, Sexual Reproduction and Survival in Dynamic Environments

Authors: Ruta Mehta, Ioannis Panageas, Georgios Piliouras, Prasad Tetali, Vijay V. Vazirani

Abstract: A new approach to understanding evolution [Val09], namely viewing it through the lens of computation, has already started yielding new insights, e.g., natural selection under sexual reproduction can be interpreted as the Multiplicative Weight Update (MWU) Algorithm in coordination games played among genes [CLPV14]. Using this machinery, we study the role of mutation in changing environments in the presence of sexual reproduction. Following [WVA05], we model changing environments via a Markov chain, with the states representing environments, each with its own fitness matrix. In this setting, we show that in the absence of mutation, the population goes extinct, but in the presence of mutation, the population survives with positive probability. On the way to proving the above theorem, we need to establish some facts about dynamics in games. We provide the first, to our knowledge, polynomial convergence bound for noisy MWU in a coordination game. Finally, we also show that in static environments, sexual evolution with mutation converges, for any level of mutation.

Submitted 22 April, 2016; v1 submitted 4 November, 2015; originally announced November 2015.

Comments: 28 pages, 3 figures. Change introduction and abstract. Restructuring the paper

arXiv:1411.6322 [pdf, other]

The Complexity of Genetic Diversity

Authors: Ruta Mehta, Ioannis Panageas, Georgios Piliouras, Sadra Yazdanbod

Abstract: A key question in biological systems is whether genetic diversity persists in the long run under evolutionary competition or whether a single dominant genotype emerges. Classic work by Kalmus in 1945 has established that even in simple diploid species (species with two chromosomes) diversity can be guaranteed as long as the heterozygote individuals enjoy a selective advantage. Despite the classic nature of the problem, as we move towards increasingly polymorphic traits (e.g. human blood types) predicting diversity and understanding its implications is still not fully understood. Our key contribution is to establish complexity theoretic hardness results implying that even in the textbook case of single locus diploid models predicting whether diversity survives or not given its fitness landscape is algorithmically intractable. We complement our results by establishing that under randomly chosen fitness landscapes diversity survives with significant probability. Our results are structurally robust along several dimensions (e.g., choice of parameter distribution, different definitions of stability/persistence, restriction to typical subclasses of fitness landscapes). Technically, our results exploit connections between game theory, nonlinear dynamical systems, complexity theory and biology and establish hardness results for predicting the evolution of a deterministic variant of the well known multiplicative weights update algorithm in symmetric coordination games which could be of independent interest.

Submitted 22 October, 2015; v1 submitted 23 November, 2014; originally announced November 2014.

Comments: 24 pages, 2 figures

arXiv:1408.6270 [pdf, other]

Natural Selection as an Inhibitor of Genetic Diversity: Multiplicative Weights Updates Algorithm and a Conjecture of Haploid Genetics

Authors: Ruta Mehta, Ioannis Panageas, Georgios Piliouras

Abstract: In a recent series of papers a surprisingly strong connection was discovered between standard models of evolution in mathematical biology and Multiplicative Weights Updates Algorithm, a ubiquitous model of online learning and optimization. These papers establish that mathematical models of biological evolution are tantamount to applying discrete Multiplicative Weights Updates Algorithm, a close variant of MWUA, on coordination games. This connection allows for introducing insights from the study of game theoretic dynamics into the field of mathematical biology. Using these results as a stepping stone, we show that mathematical models of haploid evolution imply the extinction of genetic diversity in the long term limit, a widely believed conjecture in genetics. In game theoretic terms we show that in the case of coordination games, under minimal genericity assumptions, discrete MWUA converges to pure Nash equilibria for all but a zero measure of initial conditions. This result holds despite the fact that mixed Nash equilibria can be exponentially (or even uncountably) many, completely dominating in number the set of pure Nash equilibria. Thus, in haploid organisms the long term preservation of genetic diversity needs to be safeguarded by other evolutionary mechanisms such as mutations and speciation.

Submitted 7 October, 2014; v1 submitted 26 August, 2014; originally announced August 2014.

Comments: 18 pages, 1 figure

arXiv:1403.3885 [pdf, other]

Average Case Performance of Replicator Dynamics in Potential Games via Computing Regions of Attraction

Authors: Ioannis Panageas, Georgios Piliouras

Abstract: What does it mean to fully understand the behavior of a network of adaptive agents? The gold standard typically is the behavior of learning dynamics in potential games, where many evolutionary dynamics, e.g., replicator, are known to converge to sets of equilibria. Even in such classic settings many critical questions remain unanswered. We examine issues such as: Point-wise convergence: Does the system actually equilibrate even in the presence of continuums of equilibria? Computing regions of attraction: Given point-wise convergence can we compute the region of asymptotic stability of each equilibrium (e.g., estimate its volume, geometry)? System invariants: Invariant functions remain constant along every system trajectory. This notion is orthogonal to the game theoretic concept of a potential function, which always strictly increases/decreases along system trajectories. Do dynamics in potential games exhibit invariant functions? If so, how many? What do these functions look like? Based on these geometric characterizations, we propose a novel quantitative framework for analyzing the efficiency of potential games with many equilibria. The predictions of different equilibria are weighted by their probability to arise under evolutionary dynamics given uniformly random initial conditions. This average case analysis is shown to offer novel insights in classic game theoretic challenges, including quantifying the risk dominance in stag-hunt games and allowing for more nuanced performance analysis in networked coordination and congestion games with large gaps between price of stability and price of anarchy.

Submitted 2 October, 2016; v1 submitted 16 March, 2014; originally announced March 2014.

Comments: 33 pages, 3 figures
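
The idea of weighting equilibria by the size of their regions of attraction, described in the abstract above, can be illustrated with a deliberately simplified sketch. Everything specific here is an assumption made for the example and is not taken from the paper: a one-population (symmetric) replicator dynamic, a hypothetical stag-hunt payoff matrix, an Euler discretization, and the names `replicator_step` and `basin_fraction_stag`. With these payoffs the interior equilibrium sits at a stag share of 2/3, so initial conditions above that threshold flow to all-stag and those below flow to all-hare.

```python
# Hypothetical stag-hunt payoffs (row player's payoff): rows/cols = (Stag, Hare).
PAYOFF = [[5.0, 0.0],
          [4.0, 2.0]]


def replicator_step(x, dt=0.05):
    """One Euler step of the one-population replicator dynamic; x = share playing Stag."""
    fit_stag = PAYOFF[0][0] * x + PAYOFF[0][1] * (1.0 - x)
    fit_hare = PAYOFF[1][0] * x + PAYOFF[1][1] * (1.0 - x)
    return x + dt * x * (1.0 - x) * (fit_stag - fit_hare)


def basin_fraction_stag(n_points=300, steps=1000):
    """Fraction of evenly spaced initial shares in (0, 1) that end up near all-Stag."""
    reached_stag = 0
    for i in range(n_points):
        x = (i + 0.5) / n_points   # deterministic grid standing in for uniform sampling
        for _ in range(steps):
            x = replicator_step(x)
        if x > 0.5:
            reached_stag += 1
    return reached_stag / n_points


if __name__ == "__main__":
    print(basin_fraction_stag())
```

In this toy setting the payoff-dominant all-stag equilibrium attracts about one third of the initial conditions, while the risk-dominant all-hare equilibrium attracts the rest, which is the kind of quantity an average-case analysis would weight equilibria by.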

Showing 1–39 of 39 results for author: Panageas, I