-
Learning, Diversity and Adaptation in Changing Environments: The Role of Weak Links
Authors:
Daron Acemoglu,
Asuman Ozdaglar,
Sarath Pattathil
Abstract:
Adaptation to dynamic conditions requires a certain degree of diversity. If all agents take the best current action, learning that the underlying state has changed and behavior should adapt will be slower. Diversity is harder to maintain when there is fast communication between agents, because they tend to find out and pursue the best action rapidly. We explore these issues using a model of (Bayesian) learning over a social network. Agents learn rapidly from and may also have incentives to coordinate with others to whom they are connected via strong links. We show, however, that when the underlying environment changes sufficiently rapidly, any network consisting of just strong links will do only a little better than random choice in the long run. In contrast, networks combining strong and weak links, whereby the latter type of links transmit information only slowly, can achieve much higher long-run average payoffs. The best social networks are those that combine a large fraction of agents into a strongly-connected component, while still maintaining a sufficient number of smaller communities that make diverse choices and communicate with this component via weak links.
Submitted 30 April, 2023;
originally announced May 2023.
-
Autobidders with Budget and ROI Constraints: Efficiency, Regret, and Pacing Dynamics
Authors:
Brendan Lucier,
Sarath Pattathil,
Aleksandrs Slivkins,
Mengxiao Zhang
Abstract:
We study a game between autobidding algorithms that compete in an online advertising platform. Each autobidder is tasked with maximizing its advertiser's total value over multiple rounds of a repeated auction, subject to budget and return-on-investment constraints. We propose a gradient-based learning algorithm that is guaranteed to satisfy all constraints and achieves vanishing individual regret. Our algorithm uses only bandit feedback and can be used with the first- or second-price auction, as well as with any "intermediate" auction format. Our main result is that when these autobidders play against each other, the resulting expected liquid welfare over all rounds is at least half of the expected optimal liquid welfare achieved by any allocation. This holds whether or not the bidding dynamics converges to an equilibrium.
Submitted 11 June, 2024; v1 submitted 30 January, 2023;
originally announced January 2023.
-
Revisiting the Linear-Programming Framework for Offline RL with General Function Approximation
Authors:
Asuman Ozdaglar,
Sarath Pattathil,
Jiawei Zhang,
Kaiqing Zhang
Abstract:
Offline reinforcement learning (RL) aims to find an optimal policy for sequential decision-making using a pre-collected dataset, without further interaction with the environment. Recent theoretical progress has focused on developing sample-efficient offline RL algorithms with various relaxed assumptions on data coverage and function approximators, especially to handle the case with excessively large state-action spaces. Among them, the framework based on the linear-programming (LP) reformulation of Markov decision processes has shown promise: it enables sample-efficient offline RL with function approximation, under only partial data coverage and realizability assumptions on the function classes, with favorable computational tractability. In this work, we revisit the LP framework for offline RL, and provide a new reformulation that advances the existing results in several aspects, relaxing certain assumptions and achieving optimal statistical rates in terms of sample size. Our key enabler is to introduce proper constraints in the reformulation, instead of using any regularization as in the literature, also with careful choices of the function classes and initial state distributions. We hope our insights bring into light the use of LP formulations and the induced primal-dual minimax optimization, in offline RL.
Submitted 8 February, 2023; v1 submitted 28 December, 2022;
originally announced December 2022.
-
Symmetric (Optimistic) Natural Policy Gradient for Multi-agent Learning with Parameter Convergence
Authors:
Sarath Pattathil,
Kaiqing Zhang,
Asuman Ozdaglar
Abstract:
Multi-agent interactions are increasingly important in the context of reinforcement learning, and the theoretical foundations of policy gradient methods have attracted surging research interest. We investigate the global convergence of natural policy gradient (NPG) algorithms in multi-agent learning. We first show that vanilla NPG may not have parameter convergence, i.e., the convergence of the vector that parameterizes the policy, even when the costs are regularized (which enabled strong convergence guarantees in the policy space in the literature). This non-convergence of parameters leads to stability issues in learning, which becomes especially relevant in the function approximation setting, where we can only operate on low-dimensional parameters, instead of the high-dimensional policy. We then propose variants of the NPG algorithm, for several standard multi-agent learning scenarios: two-player zero-sum matrix and Markov games, and multi-player monotone games, with global last-iterate parameter convergence guarantees. We also generalize the results to certain function approximation settings. Note that in our algorithms, the agents take symmetric roles. Our results might also be of independent interest for solving nonconvex-nonconcave minimax optimization problems with certain structures. Simulations are also provided to corroborate our theoretical findings.
Submitted 20 March, 2023; v1 submitted 23 October, 2022;
originally announced October 2022.
-
What is a Good Metric to Study Generalization of Minimax Learners?
Authors:
Asuman Ozdaglar,
Sarath Pattathil,
Jiawei Zhang,
Kaiqing Zhang
Abstract:
Minimax optimization has served as the backbone of many machine learning (ML) problems. Although the convergence behavior of optimization algorithms has been extensively studied in the minimax settings, their generalization guarantees in stochastic minimax optimization problems, i.e., how the solution trained on empirical data performs on unseen testing data, have been relatively underexplored. A fundamental question remains elusive: What is a good metric to study generalization of minimax learners? In this paper, we aim to answer this question by first showing that primal risk, a universal metric to study generalization in minimization problems, which has also been adopted recently to study generalization in minimax ones, fails in simple examples. We thus propose a new metric to study generalization of minimax learners: the primal gap, defined as the difference between the primal risk and its minimum over all models, to circumvent the issues. Next, we derive generalization error bounds for the primal gap in nonconvex-concave settings. As byproducts of our analysis, we also solve two open questions: establishing generalization error bounds for primal risk and primal-dual risk, another existing metric that is only well-defined when the global saddle-point exists, in the strong sense, i.e., without strong concavity or assuming that the maximization and expectation can be interchanged, while either of these assumptions was needed in the literature. Finally, we leverage this new metric to compare the generalization behavior of two popular algorithms -- gradient descent-ascent (GDA) and gradient descent-max (GDMax) in stochastic minimax optimization.
Submitted 20 June, 2022; v1 submitted 9 June, 2022;
originally announced June 2022.
-
Tight last-iterate convergence rates for no-regret learning in multi-player games
Authors:
Noah Golowich,
Sarath Pattathil,
Constantinos Daskalakis
Abstract:
We study the question of obtaining last-iterate convergence rates for no-regret learning algorithms in multi-player games. We show that the optimistic gradient (OG) algorithm with a constant step-size, which is no-regret, achieves a last-iterate rate of $O(1/\sqrt{T})$ with respect to the gap function in smooth monotone games. This result addresses a question of Mertikopoulos & Zhou (2018), who asked whether extra-gradient approaches (such as OG) can be applied to achieve improved guarantees in the multi-agent learning setting. The proof of our upper bound uses a new technique centered around an adaptive choice of potential function at each iteration. We also show that the $O(1/\sqrt{T})$ rate is tight for all $p$-SCLI algorithms, which includes OG as a special case. As a byproduct of our lower bound analysis we additionally present a proof of a conjecture of Arjevani et al. (2015) which is more direct than previous approaches.
Submitted 26 October, 2020;
originally announced October 2020.
-
An Optimal Multistage Stochastic Gradient Method for Minimax Problems
Authors:
Alireza Fallah,
Asuman Ozdaglar,
Sarath Pattathil
Abstract:
In this paper, we study the minimax optimization problem in the smooth and strongly convex-strongly concave setting when we have access to noisy estimates of gradients. In particular, we first analyze the stochastic Gradient Descent Ascent (GDA) method with constant stepsize, and show that it converges to a neighborhood of the solution of the minimax problem. We further provide tight bounds on the convergence rate and the size of this neighborhood. Next, we propose a multistage variant of stochastic GDA (M-GDA) that runs in multiple stages with a particular learning rate decay schedule and converges to the exact solution of the minimax problem. We show M-GDA achieves the lower bounds in terms of noise dependence without any assumptions on the knowledge of noise characteristics. We also show that M-GDA obtains a linear decay rate with respect to the error's dependence on the initial error, although the dependence on condition number is suboptimal. In order to improve this dependence, we apply the multistage machinery to the stochastic Optimistic Gradient Descent Ascent (OGDA) algorithm and propose the M-OGDA algorithm which also achieves the optimal linear decay rate with respect to the initial error. To the best of our knowledge, this method is the first to simultaneously achieve the best dependence on noise characteristic as well as the initial error and condition number.
Submitted 13 February, 2020;
originally announced February 2020.
-
Last Iterate is Slower than Averaged Iterate in Smooth Convex-Concave Saddle Point Problems
Authors:
Noah Golowich,
Sarath Pattathil,
Constantinos Daskalakis,
Asuman Ozdaglar
Abstract:
In this paper we study the smooth convex-concave saddle point problem. Specifically, we analyze the last iterate convergence properties of the Extragradient (EG) algorithm. It is well known that the ergodic (averaged) iterates of EG converge at a rate of $O(1/T)$ (Nemirovski, 2004). In this paper, we show that the last iterate of EG converges at a rate of $O(1/\sqrt{T})$. To the best of our knowledge, this is the first paper to provide a convergence rate guarantee for the last iterate of EG for the smooth convex-concave saddle point problem. Moreover, we show that this rate is tight by proving a lower bound of $Ω(1/\sqrt{T})$ for the last iterate. This lower bound therefore shows a quadratic separation of the convergence rates of ergodic and last iterates in smooth convex-concave saddle point problems.
Submitted 6 July, 2020; v1 submitted 31 January, 2020;
originally announced February 2020.
-
A Decentralized Proximal Point-type Method for Saddle Point Problems
Authors:
Weijie Liu,
Aryan Mokhtari,
Asuman Ozdaglar,
Sarath Pattathil,
Zebang Shen,
Nenggan Zheng
Abstract:
In this paper, we focus on solving a class of constrained non-convex non-concave saddle point problems in a decentralized manner by a group of nodes in a network. Specifically, we assume that each node has access to a summand of a global objective function and nodes are allowed to exchange information only with their neighboring nodes. We propose a decentralized variant of the proximal point method for solving this problem. We show that when the objective function is $ρ$-weakly convex-weakly concave the iterates converge to approximate stationarity with a rate of $\mathcal{O}(1/\sqrt{T})$ where the approximation error depends linearly on $\sqrt{ρ}$. We further show that when the objective function satisfies the Minty VI condition (which generalizes the convex-concave case) we obtain convergence to stationarity with a rate of $\mathcal{O}(1/\sqrt{T})$. To the best of our knowledge, our proposed method is the first decentralized algorithm with theoretical guarantees for solving a non-convex non-concave decentralized saddle point problem. Our numerical results for training a generative adversarial network (GAN) in a decentralized manner match our theoretical guarantees.
Submitted 31 October, 2019;
originally announced October 2019.
-
Convergence Rate of $\mathcal{O}(1/k)$ for Optimistic Gradient and Extra-gradient Methods in Smooth Convex-Concave Saddle Point Problems
Authors:
Aryan Mokhtari,
Asuman Ozdaglar,
Sarath Pattathil
Abstract:
We study the iteration complexity of the optimistic gradient descent-ascent (OGDA) method and the extra-gradient (EG) method for finding a saddle point of a convex-concave unconstrained min-max problem. To do so, we first show that both OGDA and EG can be interpreted as approximate variants of the proximal point method. This is similar to the approach taken in [Nemirovski, 2004] which analyzes EG as an approximation of the `conceptual mirror prox'. In this paper, we highlight how gradients used in OGDA and EG try to approximate the gradient of the Proximal Point method. We then exploit this interpretation to show that both algorithms produce iterates that remain within a bounded set. We further show that the primal dual gap of the averaged iterates generated by both of these algorithms converges at a rate of $\mathcal{O}(1/k)$. Our theoretical analysis is of interest as it provides the first convergence rate estimate for OGDA in the general convex-concave setting. Moreover, it provides a simple convergence analysis for the EG algorithm in terms of function value without using a compactness assumption.
Submitted 29 September, 2020; v1 submitted 3 June, 2019;
originally announced June 2019.
-
A Unified Analysis of Extra-gradient and Optimistic Gradient Methods for Saddle Point Problems: Proximal Point Approach
Authors:
Aryan Mokhtari,
Asuman Ozdaglar,
Sarath Pattathil
Abstract:
In this paper we consider solving saddle point problems using two variants of Gradient Descent-Ascent algorithms, Extra-gradient (EG) and Optimistic Gradient Descent Ascent (OGDA) methods. We show that both of these algorithms admit a unified analysis as approximations of the classical proximal point method for solving saddle point problems. This viewpoint enables us to develop a new framework for analyzing EG and OGDA for bilinear and strongly convex-strongly concave settings. Moreover, we use the proximal point approximation interpretation to generalize the results for OGDA for a wide range of parameters.
Submitted 5 September, 2019; v1 submitted 24 January, 2019;
originally announced January 2019.
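The bilinear setting mentioned in this abstract can be illustrated with a minimal sketch. It assumes the toy objective f(x, y) = x * y (saddle point at the origin), a step size of 0.1, and the common OGDA update that uses twice the current gradient minus the previous one; the function names and constants are illustrative and are not taken from the paper.

```python
import math

def run_gda(x, y, eta, steps):
    """Plain simultaneous gradient descent-ascent on f(x, y) = x * y."""
    for _ in range(steps):
        # df/dx = y (descent in x), df/dy = x (ascent in y)
        x, y = x - eta * y, y + eta * x
    return math.hypot(x, y)  # distance to the saddle point (0, 0)

def run_ogda(x, y, eta, steps):
    """Optimistic GDA on f(x, y) = x * y (assumed form of the update)."""
    gx_prev, gy_prev = y, x  # treat the step before the first as a repeat
    for _ in range(steps):
        gx, gy = y, x  # gradients at the current iterate
        x, y = (x - eta * (2 * gx - gx_prev),
                y + eta * (2 * gy - gy_prev))
        gx_prev, gy_prev = gx, gy
    return math.hypot(x, y)

if __name__ == "__main__":
    print("GDA  distance:", run_gda(1.0, 1.0, 0.1, 2000))
    print("OGDA distance:", run_ogda(1.0, 1.0, 0.1, 2000))
```

On this toy problem the GDA iterates spiral away from the origin while the OGDA iterates contract toward it, which is the qualitative behavior that motivates studying OGDA and EG in bilinear problems.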
-
Persistence of the Jordan center in Random Growing Trees
Authors:
Sarath Pattathil,
Nikhil Karamchandani,
Dhruti Shah
Abstract:
The Jordan center of a graph is defined as a vertex whose maximum distance to other nodes in the graph is minimal, and it finds applications in facility location and source detection problems. We study properties of the Jordan center in the case of random growing trees. In particular, we consider a regular tree graph on which an infection starts from a root node and then spreads along the edges of the graph according to various random spread models. For the Independent Cascade (IC) model and the discrete Susceptible Infected (SI) model, both of which are discrete time models, we show that as the infected subgraph grows with time, the Jordan center persists on a single vertex after a finite number of timesteps. Finally, we also study the continuous time version of the SI model and bound the maximum distance between the Jordan center and the root node at any time.
Submitted 21 October, 2018; v1 submitted 22 December, 2017;
originally announced December 2017.
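The definition of the Jordan center given in this abstract can be sketched directly: compute each vertex's eccentricity (its maximum shortest-path distance to any other vertex) by breadth-first search and keep the vertices that minimize it. The adjacency-dict representation, the function names, and the example path graph are assumptions made for illustration; the sketch assumes a connected, unweighted graph.

```python
from collections import deque

def eccentricity(adj, source):
    """Maximum BFS distance from `source` to any reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return max(dist.values())

def jordan_centers(adj):
    """All vertices of minimum eccentricity, in sorted order."""
    ecc = {v: eccentricity(adj, v) for v in adj}
    best = min(ecc.values())
    return sorted(v for v, e in ecc.items() if e == best)

# Hypothetical example: a path on five vertices, 0 - 1 - 2 - 3 - 4.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}

if __name__ == "__main__":
    print(jordan_centers(path))
```

A tree has either one or two Jordan centers, so the function returns a list rather than a single vertex.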
-
Distributed Server Allocation for Content Delivery Networks
Authors:
Sarath Pattathil,
Vivek S. Borkar,
Gaurav S. Kasbekar
Abstract:
We propose a dynamic formulation of file-sharing networks in terms of an average cost Markov decision process with constraints. By analyzing a Whittle-like relaxation thereof, we propose an index policy in the spirit of Whittle and compare it by simulations with other natural heuristics.
Submitted 9 February, 2019; v1 submitted 28 October, 2017;
originally announced October 2017.
-
Massive Machine-Type Communication (mMTC) Access with Integrated Authentication
Authors:
Nuno K. Pratas,
Sarath Pattathil,
Cedomir Stefanovic,
Petar Popovski
Abstract:
We present a connection establishment protocol with integrated authentication, suited for Massive Machine-Type Communications (mMTC). The protocol is contention-based and its main feature is that a device contends with a unique signature that also enables the authentication of the device towards the network. The signatures are inspired by Bloom filters and are created based on the output of the MILENAGE authentication and encryption algorithm set, which is used in the authentication and security procedures in the LTE protocol family. We show that our method utilizes the system resources more efficiently, achieves lower latency of connection establishment for Poisson arrivals and allows an $87\%$ signalling overhead reduction. An important conclusion is that the mMTC traffic benefits profoundly from integration of security features into the connection establishment/access protocols, instead of addressing them post-hoc, which has been a common practice.
Submitted 15 March, 2017; v1 submitted 31 October, 2016;
originally announced October 2016.