Search | arXiv e-print repository

Statistical Inference and A/B Testing in Fisher Markets and Paced Auctions

Abstract: We initiate the study of statistical inference and A/B testing for two market equilibrium models: linear Fisher market (LFM) equilibrium and first-price pacing equilibrium (FPPE). LFM arises from fair resource allocation systems such as allocation of food to food banks and notification opportunities to different types of notifications. For LFM, we assume that the data observed is captured by the classical finite-dimensional Fisher market equilibrium, and its steady-state behavior is modeled by a continuous limit Fisher market. The second type of equilibrium we study, FPPE, arises from internet advertising where advertisers are constrained by budgets and advertising opportunities are sold via first-price auctions. For platforms that use pacing-based methods to smooth out the spending of advertisers, FPPE provides a hindsight-optimal configuration of the pacing method. We propose a statistical framework for the FPPE model, in which a continuous limit FPPE models the steady-state behavior of the auction platform, and a finite FPPE provides the data to estimate primitives of the limit FPPE. Both LFM and FPPE have an Eisenberg-Gale convex program characterization, the pillar upon which we derive our statistical theory. We start by deriving basic convergence results for the finite market to the limit market. We then derive asymptotic distributions, and construct confidence intervals. Furthermore, we establish the asymptotic local minimax optimality of estimation based on finite markets. We then show that the theory can be used for conducting statistically valid A/B testing on auction platforms. Synthetic and semi-synthetic experiments verify the validity and practicality of our theory.

Submitted 20 June, 2024; originally announced June 2024.

Comments: arXiv admin note: text overlap with arXiv:2301.02276, arXiv:2209.15422
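For orientation, the Eisenberg-Gale convex program mentioned in the abstract above (and in several later entries) can be written as follows for a finite linear Fisher market. The notation is an assumption made here for illustration, not taken from the listed papers: buyers $i$ have budgets $B_i$ and values $v_{ij}$, each item $j$ has unit supply, and $x_{ij}$ is the fraction of item $j$ given to buyer $i$.

```latex
% Eisenberg-Gale program for a linear Fisher market (unit supply assumed).
\begin{aligned}
\max_{x \ge 0} \quad & \sum_{i} B_i \log u_i \\
\text{s.t.} \quad & u_i = \sum_{j} v_{ij}\, x_{ij} \quad \text{for all buyers } i, \\
& \sum_{i} x_{ij} \le 1 \quad \text{for all items } j.
\end{aligned}

% Dual in price space, up to an additive constant:
\min_{p \ge 0} \quad \sum_{j} p_j \;-\; \sum_{i} B_i \log\Bigl(\min_{j} \frac{p_j}{v_{ij}}\Bigr)
```

The optimal dual variables of the supply constraints are the equilibrium prices, which is why the dual is naturally stated in price space.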

arXiv:2406.12526 [pdf, other]

On the Convergence of Tâtonnement for Linear Fisher Markets

Authors: Tianlong Nan, Yuan Gao, Christian Kroer

Abstract: Tâtonnement is a simple, intuitive market process where prices are iteratively adjusted based on the difference between demand and supply. Many variants under different market assumptions have been studied and shown to converge to a market equilibrium, in some cases at a fast rate. However, the classical case of linear Fisher markets has long eluded analysis, and it remains unclear whether tâtonnement converges in this case. We show that, for a sufficiently small step size, the prices given by the tâtonnement process are guaranteed to converge to equilibrium prices, up to a small approximation radius that depends on the step size. To achieve this, we consider the dual Eisenberg-Gale convex program in the price space, view tâtonnement as subgradient descent on this convex program, and utilize novel last-iterate convergence results for subgradient descent under error bound conditions. In doing so, we show that the convex program satisfies a particular error bound condition, the quadratic growth condition, and that the price sequence generated by tâtonnement is bounded above and away from zero. We also show that a similar convergence result holds for tâtonnement in quasi-linear Fisher markets. Numerical experiments are conducted to demonstrate that the theoretical linear convergence aligns with empirical observations.

Submitted 18 June, 2024; originally announced June 2024.

Comments: 31 pages, 16 figures
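The price-adjustment process described in the entry above can be illustrated with a minimal sketch. It is not the paper's exact procedure: the unit supply, the fixed additive step, the lowest-index tie-breaking rule, the price floor, and all function and parameter names (`excess_demand`, `tatonnement`, `init_prices`, `floor`) are assumptions made for illustration. Each buyer spends their whole budget on one best value-per-price item, and prices move in the direction of excess demand.

```python
def excess_demand(prices, values, budgets):
    """Demand minus unit supply when each buyer spends their full budget
    on a single best value-per-price item (ties go to the lowest index)."""
    m = len(prices)
    demand = [0.0] * m
    for v_i, b_i in zip(values, budgets):
        best = max(range(m), key=lambda j: v_i[j] / prices[j])
        demand[best] += b_i / prices[best]
    return [d - 1.0 for d in demand]


def tatonnement(values, budgets, init_prices, step=0.01, iters=5000, floor=1e-6):
    """Additive tatonnement: raise prices of over-demanded items and lower
    prices of under-demanded ones. The floor (an assumption of this sketch)
    keeps prices strictly positive."""
    prices = list(init_prices)
    for _ in range(iters):
        z = excess_demand(prices, values, budgets)
        prices = [max(floor, p + step * z_j) for p, z_j in zip(prices, z)]
    return prices


if __name__ == "__main__":
    # Two buyers with unit budgets, each preferring a different item.
    values = [[2.0, 1.0], [1.0, 2.0]]
    print(tatonnement(values, [1.0, 1.0], init_prices=[0.5, 2.0]))
```

In this small instance the equilibrium prices are both equal to 1, and there are no ties in buyers' demand at equilibrium, so the iterates settle there. In general, as the abstract notes, a fixed step size only guarantees convergence up to a neighborhood of the equilibrium prices.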

arXiv:2406.10631 [pdf, other]

Fast Last-Iterate Convergence of Learning in Games Requires Forgetful Algorithms

Authors: Yang Cai, Gabriele Farina, Julien Grand-Clément, Christian Kroer, Chung-Wei Lee, Haipeng Luo, Weiqiang Zheng

Abstract: Self-play via online learning is one of the premier ways to solve large-scale two-player zero-sum games, both in theory and practice. Particularly popular algorithms include optimistic multiplicative weights update (OMWU) and optimistic gradient-descent-ascent (OGDA). While both algorithms enjoy $O(1/T)$ ergodic convergence to Nash equilibrium in two-player zero-sum games, OMWU offers several advantages including logarithmic dependence on the size of the payoff matrix and $\widetilde{O}(1/T)$ convergence to coarse correlated equilibria even in general-sum games. However, in terms of last-iterate convergence in two-player zero-sum games, an increasingly popular topic in this area, OGDA guarantees that the duality gap shrinks at a rate of $O(1/\sqrt{T})$, while the best existing last-iterate convergence for OMWU depends on some game-dependent constant that could be arbitrarily large. This raises the question: is this potentially slow last-iterate convergence an inherent disadvantage of OMWU, or is the current analysis too loose? Somewhat surprisingly, we show that the former is true. More generally, we prove that a broad class of algorithms that do not forget the past quickly all suffer the same issue: for any arbitrarily small $δ>0$, there exists a $2\times 2$ matrix game such that the algorithm admits a constant duality gap even after $1/δ$ rounds. This class of algorithms includes OMWU and other standard optimistic follow-the-regularized-leader algorithms.

Submitted 15 June, 2024; originally announced June 2024.

Comments: 27 pages, 4 figures

arXiv:2405.03070 [pdf, other]

Layered Graph Security Games

Authors: Jakub Černý, Chun Kai Ling, Christian Kroer, Garud Iyengar

Abstract: Security games model strategic interactions in adversarial real-world applications. Such applications often involve extremely large but highly structured strategy sets (e.g., selecting a distribution over all patrol routes in a given graph). In this paper, we represent each player's strategy space using a layered graph whose paths represent an exponentially large strategy space. Our formulation entails not only classic pursuit-evasion games, but also other security games, such as those modeling anti-terrorism and logistical interdiction. We study two-player zero-sum games under two distinct utility models: linear and binary utilities. We show that under linear utilities, Nash equilibrium can be computed in polynomial time, while binary utilities may lead to situations where even computing a best-response is computationally intractable. To this end, we propose a practical algorithm based on incremental strategy generation and mixed integer linear programs. We show through extensive experiments that our algorithm efficiently computes $ε$-equilibrium for many games of interest. We find that target values and graph structure often have a larger influence on running times as compared to the size of the graph per se.

Submitted 9 May, 2024; v1 submitted 5 May, 2024; originally announced May 2024.

Comments: In Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence. IJCAI Press, 2024

arXiv:2403.04680 [pdf, other]

Extensive-Form Game Solving via Blackwell Approachability on Treeplexes

Authors: Darshan Chakrabarti, Julien Grand-Clément, Christian Kroer

Abstract: In this paper, we introduce the first algorithmic framework for Blackwell approachability on the sequence-form polytope, the class of convex polytopes capturing the strategies of players in extensive-form games (EFGs). This leads to a new class of regret-minimization algorithms that are stepsize-invariant, in the same sense as the Regret Matching and Regret Matching$^+$ algorithms for the simplex. Our modular framework can be combined with any existing regret minimizer over cones to compute a Nash equilibrium in two-player zero-sum EFGs with perfect recall, through the self-play framework. Leveraging predictive online mirror descent, we introduce Predictive Treeplex Blackwell$^+$ (PTB$^+$), and show an $O(1/\sqrt{T})$ convergence rate to Nash equilibrium in self-play. We then show how to stabilize PTB$^+$ with a stepsize, resulting in an algorithm with a state-of-the-art $O(1/T)$ convergence rate. We provide an extensive set of experiments to compare our framework with several algorithmic benchmarks, including CFR$^+$ and its predictive variant, and we highlight interesting connections between practical performance and the stepsize-dependence or stepsize-invariance properties of classical algorithms.

Submitted 7 March, 2024; originally announced March 2024.

arXiv:2402.10439 [pdf, other]

Competitive Equilibrium for Chores: from Dual Eisenberg-Gale to a Fast, Greedy, LP-based Algorithm

Authors: Bhaskar Ray Chaudhury, Christian Kroer, Ruta Mehta, Tianlong Nan

Abstract: We study the computation of competitive equilibrium for Fisher markets with $n$ agents and $m$ divisible chores. Prior work showed that competitive equilibria correspond to the nonzero KKT points of a non-convex analogue of the Eisenberg-Gale convex program. We introduce an analogue of the Eisenberg-Gale dual for chores: we show that all KKT points of this dual correspond to competitive equilibria, and while it is not a dual of the non-convex primal program in a formal sense, the objectives touch at all KKT points. Similar to the primal, the dual has problems from an optimization perspective: there are many feasible directions where the objective tends to positive infinity. We then derive a new constraint for the dual, which restricts optimization to a hyperplane that avoids all these directions. We show that restriction to this hyperplane retains all KKT points, and surprisingly, does not introduce any new ones. This allows, for the first time ever, application of iterative optimization methods over a convex region for computing competitive equilibria for chores. We next introduce a greedy Frank-Wolfe algorithm for optimization over our program and show a state-of-the-art convergence rate to competitive equilibrium. In the case of equal incomes, we show a $\mathcal{\tilde O}(n/ε^2)$ rate of convergence, which improves over the two prior state-of-the-art rates of $\mathcal{\tilde O}(n^3/ε^2)$ for an exterior-point method and $\mathcal{\tilde O}(nm/ε^2)$ for a combinatorial method. Moreover, our method is significantly simpler: each iteration of our method only requires solving a simple linear program. We show through numerical experiments on simulated data and a paper review bidding dataset that our method is extremely practical. This is the first highly practical method for solving competitive equilibrium for Fisher markets with chores.

Submitted 15 February, 2024; originally announced February 2024.

Comments: 25 pages, 17 figures

arXiv:2402.08129 [pdf, ps, other]

Automated Design of Affine Maximizer Mechanisms in Dynamic Settings

Authors: Michael Curry, Vinzenz Thoma, Darshan Chakrabarti, Stephen McAleer, Christian Kroer, Tuomas Sandholm, Niao He, Sven Seuken

Abstract: Dynamic mechanism design is a challenging extension to ordinary mechanism design in which the mechanism designer must make a sequence of decisions over time in the face of possibly untruthful reports of participating agents. Optimizing dynamic mechanisms for welfare is relatively well understood. However, there has been less work on optimizing for other goals (e.g. revenue), and without restrictive assumptions on valuations, it is remarkably challenging to characterize good mechanisms. Instead, we turn to automated mechanism design to find mechanisms with good performance in specific problem instances. In fact, the situation is similar even in static mechanism design. However, in the static case, optimization/machine learning-based automated mechanism design techniques have been successful in finding high-revenue mechanisms in cases beyond the reach of analytical results. We extend the class of affine maximizer mechanisms to MDPs where agents may untruthfully report their rewards. This extension results in a challenging bilevel optimization problem in which the upper problem involves choosing optimal mechanism parameters, and the lower problem involves solving the resulting MDP. Our approach can find truthful dynamic mechanisms that achieve strong performance on goals other than welfare, and can be applied to essentially any problem setting (without restrictions on valuations) for which RL can learn optimal policies.

Submitted 12 February, 2024; originally announced February 2024.

Comments: To be published in the Thirty-Eighth Proceedings of the AAAI Conference on Artificial Intelligence 2024

arXiv:2402.07322 [pdf, other]

Interference Among First-Price Pacing Equilibria: A Bias and Variance Analysis

Authors: Luofeng Liao, Christian Kroer, Sergei Leonenkov, Okke Schrijvers, Liang Shi, Nicolas Stier-Moses, Congshan Zhang

Abstract: Online A/B testing is widely used in the internet industry to inform decisions on new feature roll-outs. For online marketplaces (such as advertising markets), standard approaches to A/B testing may lead to biased results when buyers operate under a budget constraint, as budget consumption in one arm of the experiment impacts performance of the other arm. To counteract this interference, one can use a budget-split design where the budget constraint operates on a per-arm basis and each arm receives an equal fraction of the budget, leading to "budget-controlled A/B testing." Despite clear advantages of budget-controlled A/B testing, performance degrades when budgets are split too small, limiting the overall throughput of such systems. In this paper, we propose a parallel budget-controlled A/B testing design where we use market segmentation to identify submarkets in the larger market, and we run parallel experiments on each submarket. Our contributions are as follows: First, we introduce and demonstrate the effectiveness of the parallel budget-controlled A/B test design with submarkets in a large online marketplace environment. Second, we formally define market interference in first-price auction markets using the first-price pacing equilibrium (FPPE) framework. Third, we propose a debiased surrogate that eliminates the first-order bias of FPPE, drawing upon the principles of sensitivity analysis in mathematical programs. Fourth, we derive a plug-in estimator for the surrogate and establish its asymptotic normality. Fifth, we provide an estimation procedure for submarket parallel budget-controlled A/B tests. Finally, we present numerical examples on semi-synthetic data, confirming that the debiasing technique achieves the desired coverage properties.

Submitted 11 February, 2024; originally announced February 2024.

arXiv:2402.02303 [pdf, other]

Bootstrapping Fisher Market Equilibrium and First-Price Pacing Equilibrium

Authors: Luofeng Liao, Christian Kroer

Abstract: The linear Fisher market (LFM) is a basic equilibrium model from economics, which also has applications in fair and efficient resource allocation. First-price pacing equilibrium (FPPE) is a model capturing budget-management mechanisms in first-price auctions. In certain practical settings such as advertising auctions, there is an interest in performing statistical inference over these models. A popular methodology for general statistical inference is the bootstrap procedure. Yet, for LFM and FPPE there is no existing theory for the valid application of bootstrap procedures. In this paper, we introduce and devise several statistically valid bootstrap inference procedures for LFM and FPPE. The most challenging part is to bootstrap general FPPE, which reduces to bootstrapping constrained M-estimators, a largely unexplored problem. We devise a bootstrap procedure for FPPE under mild degeneracy conditions by using the powerful tool of epi-convergence theory. Experiments with synthetic and semi-real data verify our theory.

Submitted 11 February, 2024; v1 submitted 3 February, 2024; originally announced February 2024.

Comments: fix author names

arXiv:2312.03696 [pdf, other]

Efficient Learning in Polyhedral Games via Best Response Oracles

Authors: Darshan Chakrabarti, Gabriele Farina, Christian Kroer

Abstract: We study online learning and equilibrium computation in games with polyhedral decision sets, a property shared by both normal-form games and extensive-form games (EFGs), when the learning agent is restricted to using a best-response oracle. We show how to achieve constant regret in zero-sum games and $O(T^{1/4})$ regret in general-sum games while using only $O(\log t)$ best-response queries at a given iteration $t$, thus improving over the best prior result, which required $O(T)$ queries per iteration. Moreover, our framework yields the first last-iterate convergence guarantees for self-play with best-response oracles in zero-sum games. This convergence occurs at a linear rate, though with a condition-number dependence. We go on to show an $O(1/\sqrt{T})$ best-iterate convergence rate without such a dependence. Our results build on linear-rate convergence results for variants of the Frank-Wolfe (FW) algorithm for strongly convex and smooth minimization problems over polyhedral domains. These FW results depend on a condition number of the polytope, known as facial distance. In order to enable application to settings such as EFGs, we show two broad new results: 1) the facial distance for polytopes in standard form is at least $γ/\sqrt{k}$ where $γ$ is the minimum value of a nonzero coordinate of a vertex of the polytope and $k\leq n$ is the number of tight inequality constraints in the optimal face, and 2) the facial distance for polytopes of the form $\mathbf{A}\boldsymbol{x}=\boldsymbol{b},\mathbf{C}\boldsymbol{x}\leq\boldsymbol{d}, \boldsymbol{x}\geq \mathbf{0}$ where $\boldsymbol{x}\in\mathbb{R}^n$, $\mathbf{C}\geq\boldsymbol{0}$ is a nonzero integral matrix, and $\boldsymbol{d}\geq \boldsymbol{0}$, is at least $1/(\|\mathbf{C}\|_\infty\sqrt{n})$. This yields the first such results for several problems such as sequence-form polytopes, flow polytopes, and matching polytopes.

Submitted 6 December, 2023; originally announced December 2023.

arXiv:2311.06426 [pdf, other]

Incentivizing Investment and Reliability: A Study on Electricity Capacity Markets

Authors: Cheng Guo, Christian Kroer, Yury Dvorkin, Daniel Bienstock

Abstract: The capacity market, a marketplace to exchange available generation capacity for electricity production, provides a major revenue stream for generators and is adopted in several U.S. regions. A subject of ongoing debate, the capacity market is viewed by its proponents as a crucial mechanism to ensure system reliability, while critics highlight its drawbacks such as market distortion. Under a novel analytical framework, we rigorously evaluate the impact of the capacity market on generators' revenue and system reliability. More specifically, based on market designs at the New York Independent System Operator (NYISO), we propose market equilibrium-based models to capture salient aspects of the capacity market and its interaction with the energy market. We also develop a leader-follower model to study market power. We show that the capacity market incentivizes the investment of generators with lower net cost of new entry. It also facilitates reliability by preventing significant physical withholding when the demand is relatively high. Nevertheless, the capacity market may not provide enough revenue for peaking plants. Moreover, it is susceptible to market power, which necessitates tailored market power mitigation measures depending on market dynamics. We provide further insights via large-scale experiments on data from NYISO markets.

Submitted 10 November, 2023; originally announced November 2023.

arXiv:2311.00676 [pdf, other]

Last-Iterate Convergence Properties of Regret-Matching Algorithms in Games

Authors: Yang Cai, Gabriele Farina, Julien Grand-Clément, Christian Kroer, Chung-Wei Lee, Haipeng Luo, Weiqiang Zheng

Abstract: Algorithms based on regret matching, specifically regret matching$^+$ (RM$^+$), and its variants are the most popular approaches for solving large-scale two-player zero-sum games in practice. Unlike algorithms such as optimistic gradient descent ascent, which have strong last-iterate and ergodic convergence properties for zero-sum games, virtually nothing is known about the last-iterate properties of regret-matching algorithms. Given the importance of last-iterate convergence for numerical optimization reasons and relevance as modeling real-world learning in games, in this paper, we study the last-iterate convergence properties of various popular variants of RM$^+$. First, we show numerically that several practical variants such as simultaneous RM$^+$, alternating RM$^+$, and simultaneous predictive RM$^+$, all lack last-iterate convergence guarantees even on a simple $3\times 3$ game. We then prove that recent variants of these algorithms based on a smoothing technique do enjoy last-iterate convergence: we prove that extragradient RM$^{+}$ and smooth Predictive RM$^+$ enjoy asymptotic last-iterate convergence (without a rate) and $1/\sqrt{t}$ best-iterate convergence. Finally, we introduce restarted variants of these algorithms, and show that they enjoy linear-rate last-iterate convergence.

Submitted 1 November, 2023; originally announced November 2023.

arXiv:2308.09277 [pdf, ps, other]

Greedy-Based Online Fair Allocation with Adversarial Input: Enabling Best-of-Many-Worlds Guarantees

Authors: Zongjun Yang, Luofeng Liao, Christian Kroer

Abstract: We study an online allocation problem with sequentially arriving items and adversarially chosen agent values, with the goal of balancing fairness and efficiency. Our goal is to study the performance of algorithms that achieve strong guarantees under other input models such as stochastic inputs, in order to achieve robust guarantees against a variety of inputs. To that end, we study the PACE (Pacing According to Current Estimated utility) algorithm, an existing algorithm designed for stochastic input. We show that in the equal-budgets case, PACE is equivalent to the integral greedy algorithm. We go on to show that with natural restrictions on the adversarial input model, both integral greedy allocation and PACE have asymptotically bounded multiplicative envy as well as competitive ratio for Nash welfare, with the multiplicative factors either constant or with optimal order dependence on the number of agents. This completes a "best-of-many-worlds" guarantee for PACE, since past work showed that PACE achieves guarantees for stationary and stochastic-but-non-stationary input models.

Submitted 17 August, 2023; originally announced August 2023.

arXiv:2307.16754 [pdf, other]

Block-Coordinate Methods and Restarting for Solving Extensive-Form Games

Authors: Darshan Chakrabarti, Jelena Diakonikolas, Christian Kroer

Abstract: Coordinate descent methods are popular in machine learning and optimization for their simple sparse updates and excellent practical performance. In the context of large-scale sequential game solving, these same properties would be attractive, but until now no such methods were known, because the strategy spaces do not satisfy the typical separable block structure exploited by such methods. We present the first cyclic coordinate-descent-like method for the polytope of sequence-form strategies, which form the strategy spaces for the players in an extensive-form game (EFG). Our method exploits the recursive structure of the proximal update induced by what are known as dilated regularizers, in order to allow for a pseudo block-wise update. We show that our method enjoys an $O(1/T)$ convergence rate to a two-player zero-sum Nash equilibrium, while avoiding the worst-case polynomial scaling with the number of blocks common to cyclic methods. We empirically show that our algorithm usually performs better than other state-of-the-art first-order methods (i.e., mirror prox), and occasionally can even beat CFR$^+$, a state-of-the-art algorithm for numerical equilibrium computation in zero-sum EFGs. We then introduce a restarting heuristic for EFG solving. We show empirically that restarting can lead to speedups, sometimes huge, both for our cyclic method, as well as for existing methods such as mirror prox and predictive CFR$^+$.

Submitted 31 July, 2023; originally announced July 2023.

arXiv:2306.01796 [pdf, other]

Convergence of Extragradient SVRG for Variational Inequalities: Error Bounds and Increasing Iterate Averaging

Authors: Tianlong Nan, Yuan Gao, Christian Kroer

Abstract: We study the last-iterate convergence of variance reduction methods for extragradient (EG) algorithms for a class of variational inequalities satisfying error-bound conditions. Previously, last-iterate linear convergence was only known under strong monotonicity. We show that EG algorithms with SVRG-style variance reduction, denoted SVRG-EG, attain last-iterate linear convergence under a general error-bound condition much weaker than strong monotonicity. This condition captures a broad class of non-strongly monotone problems, such as bilinear saddle-point problems commonly encountered in two-player zero-sum Nash equilibrium computation. Next, we establish linear last-iterate convergence of SVRG-EG with an improved guarantee under the weak sharpness assumption. Furthermore, motivated by the empirical efficiency of increasing iterate averaging techniques in solving saddle-point problems, we also establish new convergence results for SVRG-EG with such techniques.

Submitted 30 December, 2023; v1 submitted 1 June, 2023; originally announced June 2023.

Comments: 44 pages

arXiv:2305.14709 [pdf, ps, other]

Regret Matching+: (In)Stability and Fast Convergence in Games

Authors: Gabriele Farina, Julien Grand-Clément, Christian Kroer, Chung-Wei Lee, Haipeng Luo

Abstract: Regret Matching+ (RM+) and its variants are important algorithms for solving large-scale games. However, a theoretical understanding of their success in practice is still a mystery. Moreover, recent advances on fast convergence in games are limited to no-regret algorithms such as online mirror descent, which satisfy stability. In this paper, we first give counterexamples showing that RM+ and its predictive version can be unstable, which might cause other players to suffer large regret. We then provide two fixes: restarting and chopping off the positive orthant that RM+ works in. We show that these fixes are sufficient to get $O(T^{1/4})$ individual regret and $O(1)$ social regret in normal-form games via RM+ with predictions. We also apply our stabilizing techniques to clairvoyant updates in the uncoupled learning setting for RM+ and prove desirable results akin to recent works for Clairvoyant online mirror descent. Our experiments show the advantages of our algorithms over vanilla RM+-based algorithms in matrix and extensive-form games.

Submitted 24 May, 2023; originally announced May 2023.
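
As background for the entry above, here is a minimal sketch of vanilla RM+ in self-play on a small zero-sum matrix game. It shows only the baseline update (accumulate instantaneous regrets, then truncate at zero) and none of the paper's stabilizing fixes (restarting, chopping off part of the orthant, predictions). The payoff matrix `GAME` and all function names are hypothetical choices for illustration.

```python
# Hypothetical payoff matrix for the row player (row maximizes x^T A y), entries in [-1, 1].
GAME = [
    [0.0, -1.0, 1.0],
    [1.0, 0.0, -1.0],
    [-1.0, 1.0, 0.5],
]


def rm_plus_strategy(q):
    """Play proportionally to the truncated regrets; uniform if they are all zero."""
    total = sum(q)
    n = len(q)
    return [qi / total for qi in q] if total > 0 else [1.0 / n] * n


def rm_plus_update(q, strategy, utility):
    """RM+ update: add the instantaneous regret vector, then clip at zero."""
    expected = sum(s * u for s, u in zip(strategy, utility))
    return [max(qi + u - expected, 0.0) for qi, u in zip(q, utility)]


def self_play(A, iters=5000):
    """Run simultaneous-update RM+ for both players; return uniform average strategies."""
    n, m = len(A), len(A[0])
    qx, qy = [0.0] * n, [0.0] * m
    avg_x, avg_y = [0.0] * n, [0.0] * m
    for _ in range(iters):
        x, y = rm_plus_strategy(qx), rm_plus_strategy(qy)
        # Utility vectors: the row player receives A y, the column player receives -A^T x.
        ux = [sum(A[i][j] * y[j] for j in range(m)) for i in range(n)]
        uy = [-sum(A[i][j] * x[i] for i in range(n)) for j in range(m)]
        qx = rm_plus_update(qx, x, ux)
        qy = rm_plus_update(qy, y, uy)
        avg_x = [a + xi / iters for a, xi in zip(avg_x, x)]
        avg_y = [a + yj / iters for a, yj in zip(avg_y, y)]
    return avg_x, avg_y


def duality_gap(A, x, y):
    """Best-response value of the row player minus that of the column player."""
    n, m = len(A), len(A[0])
    best_row = max(sum(A[i][j] * y[j] for j in range(m)) for i in range(n))
    best_col = min(sum(A[i][j] * x[i] for i in range(n)) for j in range(m))
    return best_row - best_col


if __name__ == "__main__":
    x_bar, y_bar = self_play(GAME)
    print("duality gap of average strategies:", duality_gap(GAME, x_bar, y_bar))
```

The duality gap of the average strategies equals the sum of the two players' regrets divided by the number of iterations, which is why the standard $O(\sqrt{T})$ regret bound for RM+ drives the gap toward zero.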

arXiv:2303.00506 [pdf, other]

Fast and Interpretable Dynamics for Fisher Markets via Block-Coordinate Updates

Authors: Tianlong Nan, Yuan Gao, Christian Kroer

Abstract: We consider the problem of large-scale Fisher market equilibrium computation through scalable first-order optimization methods. It is well-known that market equilibria can be captured using structured convex programs such as the Eisenberg-Gale and Shmyrev convex programs. Highly performant deterministic full-gradient first-order methods have been developed for these programs. In this paper, we develop new block-coordinate first-order methods for computing Fisher market equilibria, and show that these methods have interpretations as tâtonnement-style or proportional response-style dynamics where either buyers or items show up one at a time. We reformulate these convex programs and solve them using proximal block coordinate descent methods, a class of methods that update only a small number of coordinates of the decision variable in each iteration. Leveraging recent advances in the convergence analysis of these methods and structures of the equilibrium-capturing convex programs, we establish fast convergence rates of these methods.

Submitted 1 March, 2023; originally announced March 2023.

Comments: 25 pages, 18 figures, to be published in AAAI 2023
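
To make "proportional response-style dynamics" concrete, the sketch below runs the classical full (all buyers at once) proportional response update on a linear Fisher market with unit supply of each item. It is the deterministic baseline, not the block-coordinate variant developed in the paper; the function names and the two-buyer example are assumptions made for illustration, and the code assumes every buyer has positive value for at least one item.

```python
def prices_and_allocation(bids):
    """Prices are total bids per item; each buyer receives a share proportional to their bid."""
    n, m = len(bids), len(bids[0])
    prices = [sum(bids[i][j] for i in range(n)) for j in range(m)]
    alloc = [
        [bids[i][j] / prices[j] if prices[j] > 0 else 0.0 for j in range(m)]
        for i in range(n)
    ]
    return prices, alloc


def proportional_response(values, budgets, iters=50):
    """Classical proportional response dynamics for a linear Fisher market (unit supplies)."""
    n, m = len(values), len(values[0])
    # Start by splitting each budget evenly across all items.
    bids = [[budgets[i] / m for _ in range(m)] for i in range(n)]
    for _ in range(iters):
        _, alloc = prices_and_allocation(bids)
        utils = [sum(values[i][j] * alloc[i][j] for j in range(m)) for i in range(n)]
        # Each buyer re-splits their budget in proportion to the utility each item contributed.
        bids = [
            [budgets[i] * values[i][j] * alloc[i][j] / utils[i] for j in range(m)]
            for i in range(n)
        ]
    return prices_and_allocation(bids)


if __name__ == "__main__":
    # Two buyers with unit budgets; buyer 0 prefers item 0 and buyer 1 prefers item 1.
    prices, alloc = proportional_response([[2.0, 1.0], [1.0, 2.0]], [1.0, 1.0])
    print("prices:", prices)
    print("allocation:", alloc)
```

In this symmetric example the equilibrium gives each buyer their preferred item at price 1, and the bids on the less-preferred item shrink geometrically. Total spending always equals the total budget, since each buyer's bids sum to their budget after every update.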

arXiv:2302.04835 [pdf, other]

Fair Notification Optimization: An Auction Approach

Authors: Christian Kroer, Deeksha Sinha, Xuan Zhang, Shiwen Cheng, Ziyu Zhou

Abstract: Notifications are important for the user experience in mobile apps and can influence their engagement. However, too many notifications can be disruptive for users. A typical mobile app usually has several types of notification, managed by distinct teams with objectives that are possibly conflicting with each other, or even with the overall platform objective. Therefore, there is a need for careful curation of notifications sent to users of these different types. In this work, we study a novel centralized approach for notification optimization, where we view the opportunities to send user notifications as items and types of notifications as buyers in an auction market. Furthermore, the auction setup is unique, and the platform has the ability to subsidize the bids from the notification types. Using tools from fair division, we study the application of competitive equilibrium for addressing this problem. We show that an Eisenberg-Gale-style convex program allows us to find an allocation that is fair to all notification types in hindsight. Using the dual of the formulation, we present an online algorithm that allocates notifications via first-price auctions using a pacing-multiplier approach. Secondly, we introduce an approach based on second-price auctions and pacing, which has the benefit of working well with existing advertising systems built for second-price auctions. Through an A/B test in production, we show that the second price-based auction system improves over a decentralized notification optimization system, leading to its launch in production for some Instagram notifications. Further, through simulations on Instagram notification data and a subsequent production A/B test, we compare the outcomes of first-price and second-price auctions and show that the former has more stable pacing multipliers.

Submitted 9 February, 2023; originally announced February 2023.

arXiv:2302.01203 [pdf, ps, other]

Online Learning under Budget and ROI Constraints via Weak Adaptivity

Authors: Matteo Castiglioni, Andrea Celli, Christian Kroer

Abstract: We study online learning problems in which a decision maker has to make a sequence of costly decisions, with the goal of maximizing their expected reward while adhering to budget and return-on-investment (ROI) constraints. Existing primal-dual algorithms designed for constrained online learning problems under adversarial inputs rely on two fundamental assumptions. First, the decision maker must know beforehand the value of parameters related to the degree of strict feasibility of the problem (i.e. Slater parameters). Second, a strictly feasible solution to the offline optimization problem must exist at each round. Both requirements are unrealistic for practical applications such as bidding in online ad auctions. In this paper, we show how such assumptions can be circumvented by endowing standard primal-dual templates with weakly adaptive regret minimizers. This results in a "dual-balancing" framework which ensures that dual variables stay sufficiently small, even in the absence of knowledge about Slater's parameter. We prove the first best-of-both-worlds no-regret guarantees which hold in absence of the two aforementioned assumptions, under stochastic and adversarial inputs. Finally, we show how to instantiate the framework to optimally bid in various mechanisms of practical relevance, such as first- and second-price auctions.

Submitted 2 March, 2024; v1 submitted 2 February, 2023; originally announced February 2023.

arXiv:2301.02276 [pdf, other]

Statistical Inference and A/B Testing for First-Price Pacing Equilibria

Authors: Luofeng Liao, Christian Kroer

Abstract: We initiate the study of statistical inference and A/B testing for first-price pacing equilibria (FPPE). The FPPE model captures the dynamics resulting from large-scale first-price auction markets where buyers use pacing-based budget management. Such markets arise in the context of internet advertising, where budgets are prevalent. We propose a statistical framework for the FPPE model, in which a limit FPPE with a continuum of items models the long-run steady-state behavior of the auction platform, and an observable FPPE consisting of a finite number of items provides the data to estimate primitives of the limit FPPE, such as revenue, Nash social welfare (a fair metric of efficiency), and other parameters of interest. We develop central limit theorems and asymptotically valid confidence intervals. Furthermore, we establish the asymptotic local minimax optimality of our estimators. We then show that the theory can be used for conducting statistically valid A/B testing on auction platforms. Numerical simulations verify our central limit theorems, and empirical coverage rates for our confidence intervals agree with our theory.

Submitted 28 June, 2023; v1 submitted 5 January, 2023; originally announced January 2023.

Comments: - fix reference

arXiv:2210.02586 [pdf, other]

Implementing Fairness Constraints in Markets Using Taxes and Subsidies

Authors: Alexander Peysakhovich, Christian Kroer, Nicolas Usunier

Abstract: Fisher markets are those where buyers with budgets compete for scarce items, a natural model for many real world markets including online advertising. A market equilibrium is a set of prices and allocations of items such that supply meets demand. We show how market designers can use taxes or subsidies in Fisher markets to ensure that market equilibrium outcomes fall within certain constraints. We show how these taxes and subsidies can be computed even in an online setting where the market designer does not have access to private valuations. We adapt various types of fairness constraints proposed in existing literature to the market case and show who benefits and who loses from these constraints, as well as the extent to which properties of markets including Pareto optimality, envy-freeness, and incentive compatibility are preserved. We find that some previously discussed constraints have few guarantees in terms of who is made better or worse off by their imposition.

Submitted 13 March, 2023; v1 submitted 5 October, 2022; originally announced October 2022.

arXiv:2209.15422 [pdf, other]

Statistical Inference for Fisher Market Equilibrium

Authors: Luofeng Liao, Yuan Gao, Christian Kroer

Abstract: Statistical inference under market equilibrium effects has attracted increasing attention recently. In this paper we focus on the specific case of linear Fisher markets. They have been widely used in fair resource allocation of food/blood donations and budget management in large-scale Internet ad auctions. In resource allocation, it is crucial to quantify the variability of the resource received by the agents (such as blood banks and food banks) in addition to fairness and efficiency properties of the systems. For ad auction markets, it is important to establish statistical properties of the platform's revenues in addition to their expected values. To this end, we propose a statistical framework based on the concept of infinite-dimensional Fisher markets. In our framework, we observe a market formed by a finite number of items sampled from an underlying distribution (the "observed market") and aim to infer several important equilibrium quantities of the underlying long-run market. These equilibrium quantities include individual utilities, social welfare, and pacing multipliers. Through the lens of sample average approximation (SAA), we derive a collection of statistical results and show that the observed market provides useful statistical information of the long-run market. In other words, the equilibrium quantities of the observed market converge to the true ones of the long-run market with strong statistical guarantees. These include consistency, finite sample bounds, asymptotics, and confidence. As an extension, we discuss revenue inference in quasilinear Fisher markets.

Submitted 29 September, 2022; originally announced September 2022.
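
For reference, the finite linear Fisher market equilibrium mentioned in the entry above is commonly characterized by the Eisenberg-Gale convex program. The block below states the standard textbook form under assumed notation ($n$ buyers with budgets $B_i$, $m$ items with unit supply, valuations $v_{ij}$, allocations $x_{ij}$); it is not copied from the paper, whose infinite-dimensional formulation may use different symbols.

```latex
\begin{aligned}
\max_{x \ge 0,\; u} \quad & \sum_{i=1}^{n} B_i \log u_i \\
\text{s.t.} \quad & u_i = \sum_{j=1}^{m} v_{ij}\, x_{ij}, && i = 1, \dots, n, \\
& \sum_{i=1}^{n} x_{ij} \le 1, && j = 1, \dots, m.
\end{aligned}
```

The optimal dual variables of the supply constraints give the equilibrium prices, and in this standard formulation the pacing multipliers referred to in the abstract correspond to $\beta_i = B_i / u_i$.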

arXiv:2209.15416 [pdf, other]

Optimal Efficiency-Envy Trade-Off via Optimal Transport

Authors: Steven Yin, Christian Kroer

Abstract: We consider the problem of allocating a distribution of items to $n$ recipients where each recipient has to be allocated a fixed, prespecified fraction of all items, while ensuring that each recipient does not experience too much envy. We show that this problem can be formulated as a variant of the semi-discrete optimal transport (OT) problem, whose solution structure in this case has a concise representation and a simple geometric interpretation. Unlike existing literature that treats envy-freeness as a hard constraint, our formulation allows us to \emph{optimally} trade off efficiency and envy continuously. Additionally, we study the statistical properties of the space of our OT based allocation policies by showing a polynomial bound on the number of samples needed to approximate the optimal solution from samples. Our approach is suitable for large-scale fair allocation problems such as the blood donation matching problem, and we show numerically that it performs well on a prior realistic data simulator.

Submitted 24 September, 2022; originally announced September 2022.

arXiv:2209.07647 [pdf, other]

Computing the optimal distributionally-robust strategy to commit to

Authors: Sai Mali Ananthanarayanan, Christian Kroer

Abstract: The Stackelberg game model, where a leader commits to a strategy and the follower best responds, has found widespread application, particularly to security problems. In the security setting, the goal is for the leader to compute an optimal strategy to commit to, in order to protect some asset. In many of these applications, the parameters of the follower utility model are not known with certainty. Distributionally-robust optimization addresses this issue by allowing a distribution over possible model parameters, where this distribution comes from a set of possible distributions. The goal is to maximize the expected utility with respect to the worst-case distribution. We initiate the study of distributionally-robust models for computing the optimal strategy to commit to. We consider the case of normal-form games with uncertainty about the follower utility model. Our main theoretical result is to show that a distributionally-robust Stackelberg equilibrium always exists across a wide array of uncertainty models. For the case of a finite set of possible follower utility functions we present two algorithms to compute a distributionally-robust strong Stackelberg equilibrium (DRSSE) using mathematical programs. Next, in the general case where there is an infinite number of possible follower utility functions and the uncertainty is represented by a Wasserstein ball around a finitely-supported nominal distribution, we give an incremental mixed-integer-programming-based algorithm for computing the optimal distributionally-robust strategy. Experiments substantiate the tractability of our algorithm on a classical Stackelberg game, showing that our approach scales to medium-sized games.

Submitted 15 September, 2022; originally announced September 2022.

arXiv:2208.14891 [pdf, ps, other]

Clairvoyant Regret Minimization: Equivalence with Nemirovski's Conceptual Prox Method and Extension to General Convex Games

Authors: Gabriele Farina, Christian Kroer, Chung-Wei Lee, Haipeng Luo

Abstract: A recent paper by Piliouras et al. [2021, 2022] introduces an uncoupled learning algorithm for normal-form games -- called Clairvoyant MWU (CMWU). In this note we show that CMWU is equivalent to the conceptual prox method described by Nemirovski [2004]. This connection immediately shows that it is possible to extend the CMWU algorithm to any convex game, a question left open by Piliouras et al. We call the resulting algorithm -- again equivalent to the conceptual prox method -- Clairvoyant OMD. At the same time, we show that our analysis yields an improved regret bound compared to the original bound by Piliouras et al., in that the regret of CMWU scales only with the square root of the number of players, rather than the number of players themselves.

Submitted 31 August, 2022; originally announced August 2022.

arXiv:2206.13606 [pdf, other]

Online Resource Allocation under Horizon Uncertainty

Authors: Santiago Balseiro, Christian Kroer, Rachitesh Kumar

Abstract: We study stochastic online resource allocation: a decision maker needs to allocate limited resources to stochastically-generated sequentially-arriving requests in order to maximize reward. At each time step, requests are drawn independently from a distribution that is unknown to the decision maker. Online resource allocation and its special cases have been studied extensively in the past, but prior results crucially and universally rely on the strong assumption that the total number of requests (the horizon) is known to the decision maker in advance. In many applications, such as revenue management and online advertising, the number of requests can vary widely because of fluctuations in demand or user traffic intensity. In this work, we develop online algorithms that are robust to horizon uncertainty. In sharp contrast to the known-horizon setting, no algorithm can achieve even a constant asymptotic competitive ratio that is independent of the horizon uncertainty. We introduce a novel generalization of dual mirror descent which allows the decision maker to specify a schedule of time-varying target consumption rates, and prove corresponding performance guarantees. We go on to give a fast algorithm for computing a schedule of target consumption rates that leads to near-optimal performance in the unknown-horizon setting. In particular, our competitive ratio attains the optimal rate of growth (up to logarithmic factors) as the horizon uncertainty grows large. Finally, we also provide a way to incorporate machine-learned predictions about the horizon which interpolates between the known and unknown horizon settings.

Submitted 22 June, 2023; v1 submitted 27 June, 2022; originally announced June 2022.

arXiv:2206.08742 [pdf, other]

Near-Optimal No-Regret Learning Dynamics for General Convex Games

Authors: Gabriele Farina, Ioannis Anagnostides, Haipeng Luo, Chung-Wei Lee, Christian Kroer, Tuomas Sandholm

Abstract: A recent line of work has established uncoupled learning dynamics such that, when employed by all players in a game, each player's \emph{regret} after $T$ repetitions grows polylogarithmically in $T$, an exponential improvement over the traditional guarantees within the no-regret framework. However, so far these results have only been limited to certain classes of games with structured strategy spaces -- such as normal-form and extensive-form games. The question as to whether $O(\text{polylog} T)$ regret bounds can be obtained for general convex and compact strategy sets -- which occur in many fundamental models in economics and multiagent systems -- while retaining efficient strategy updates is an important question. In this paper, we answer this in the positive by establishing the first uncoupled learning algorithm with $O(\log T)$ per-player regret in general \emph{convex games}, that is, games with concave utility functions supported on arbitrary convex and compact strategy sets. Our learning dynamics are based on an instantiation of optimistic follow-the-regularized-leader over an appropriately \emph{lifted} space using a \emph{self-concordant regularizer} that is, peculiarly, not a barrier for the feasible region. Further, our learning dynamics are efficiently implementable given access to a proximal oracle for the convex strategy set, leading to $O(\log\log T)$ per-iteration complexity; we also give extensions when access to only a \emph{linear} optimization oracle is assumed. Finally, we adapt our dynamics to guarantee $O(\sqrt{T})$ regret in the adversarial regime. Even in those special cases where prior results apply, our algorithm improves over the state-of-the-art regret bounds either in terms of the dependence on the number of iterations or on the dimension of the strategy sets.

Submitted 16 October, 2022; v1 submitted 17 June, 2022; originally announced June 2022.

Comments: To appear at NeurIPS 2022. V2 incorporates reviewers' feedback

arXiv:2206.05825 [pdf, other]

A Unified Approach to Reinforcement Learning, Quantal Response Equilibria, and Two-Player Zero-Sum Games

Authors: Samuel Sokota, Ryan D'Orazio, J. Zico Kolter, Nicolas Loizou, Marc Lanctot, Ioannis Mitliagkas, Noam Brown, Christian Kroer

Abstract: This work studies an algorithm, which we call magnetic mirror descent, that is inspired by mirror descent and the non-Euclidean proximal gradient algorithm. Our contribution is demonstrating the virtues of magnetic mirror descent as both an equilibrium solver and as an approach to reinforcement learning in two-player zero-sum games. These virtues include: 1) Being the first quantal response equilibria solver to achieve linear convergence for extensive-form games with first order feedback; 2) Being the first standard reinforcement learning algorithm to achieve empirically competitive results with CFR in tabular settings; 3) Achieving favorable performance in 3x3 Dark Hex and Phantom Tic-Tac-Toe as a self-play deep reinforcement learning algorithm.

Submitted 11 April, 2023; v1 submitted 12 June, 2022; originally announced June 2022.

arXiv:2204.11417 [pdf, other]

Uncoupled Learning Dynamics with $O(\log T)$ Swap Regret in Multiplayer Games

Authors: Ioannis Anagnostides, Gabriele Farina, Christian Kroer, Chung-Wei Lee, Haipeng Luo, Tuomas Sandholm

Abstract: In this paper we establish efficient and \emph{uncoupled} learning dynamics so that, when employed by all players in a general-sum multiplayer game, the \emph{swap regret} of each player after $T$ repetitions of the game is bounded by $O(\log T)$, improving over the prior best bounds of $O(\log^4 (T))$. At the same time, we guarantee optimal $O(\sqrt{T})$ swap regret in the adversarial regime as well. To obtain these results, our primary contribution is to show that when all players follow our dynamics with a \emph{time-invariant} learning rate, the \emph{second-order path lengths} of the dynamics up to time $T$ are bounded by $O(\log T)$, a fundamental property which could have further implications beyond near-optimally bounding the (swap) regret. Our proposed learning dynamics combine in a novel way \emph{optimistic} regularized learning with the use of \emph{self-concordant barriers}. Further, our analysis is remarkably simple, bypassing the cumbersome framework of higher-order smoothness recently developed by Daskalakis, Fishelson, and Golowich (NeurIPS'21).

Submitted 5 October, 2022; v1 submitted 24 April, 2022; originally announced April 2022.

Comments: To appear at NeurIPS 2022. V2 incorporates reviewers' feedback and minor corrections

arXiv:2202.13710 [pdf, ps, other]

Best of Many Worlds Guarantees for Online Learning with Knapsacks

Authors: Andrea Celli, Matteo Castiglioni, Christian Kroer

Abstract: We study online learning problems in which a decision maker wants to maximize their expected reward without violating a finite set of $m$ resource constraints. By casting the learning process over a suitably defined space of strategy mixtures, we recover strong duality on a Lagrangian relaxation of the underlying optimization problem, even for general settings with non-convex reward and resource-consumption functions. Then, we provide the first best-of-many-worlds type framework for this setting, with no-regret guarantees under stochastic, adversarial, and non-stationary inputs. Our framework yields the same regret guarantees as prior work in the stochastic case. On the other hand, when budgets grow at least linearly in the time horizon, it allows us to provide a constant competitive ratio in the adversarial case, which improves over the best known upper bound of $O(\log m \log T)$. Moreover, our framework allows the decision maker to handle non-convex reward and cost functions. We provide two game-theoretic applications of our framework to give further evidence of its flexibility. In doing so, we show that it can be employed to implement budget-pacing mechanisms in repeated first-price auctions.

Submitted 10 March, 2023; v1 submitted 28 February, 2022; originally announced February 2022.

arXiv:2202.12277 [pdf, other]

Solving optimization problems with Blackwell approachability

Authors: Julien Grand-Clément, Christian Kroer

Abstract: We introduce the Conic Blackwell Algorithm$^+$ (CBA$^+$) regret minimizer, a new parameter- and scale-free regret minimizer for general convex sets. CBA$^+$ is based on Blackwell approachability and attains $O(\sqrt{T})$ regret. We show how to efficiently instantiate CBA$^+$ for many decision sets of interest, including the simplex, $\ell_{p}$ norm balls, and ellipsoidal confidence regions in the simplex. Based on CBA$^+$, we introduce SP-CBA$^+$, a new parameter-free algorithm for solving convex-concave saddle-point problems, which achieves an $O(1/\sqrt{T})$ ergodic rate of convergence. In our simulations, we demonstrate the wide applicability of SP-CBA$^+$ on several standard saddle-point problems, including matrix games, extensive-form games, distributionally robust logistic regression, and Markov decision processes. In each setting, SP-CBA$^+$ achieves state-of-the-art numerical performance, and outperforms classical methods, without the need for any choice of step sizes or other algorithmic parameters.

Submitted 24 February, 2022; originally announced February 2022.

Comments: arXiv admin note: text overlap with arXiv:2105.13203

arXiv:2202.11614 [pdf, other]

Nonstationary Dual Averaging and Online Fair Allocation

Authors: Luofeng Liao, Yuan Gao, Christian Kroer

Abstract: We consider the problem of fairly allocating items to a set of individuals, when the items are arriving online. A central solution concept in fair allocation is competitive equilibrium: every individual is endowed with a budget of faux currency, and the resulting competitive equilibrium is used to allocate. For the online fair allocation context, the PACE algorithm of Gao et al. [2021] leverages the dual averaging algorithm to approximate competitive equilibria. The authors show that, when items arrive i.i.d., the algorithm asymptotically achieves the fairness and efficiency guarantees of the offline competitive equilibrium allocation. However, real-world data is typically not stationary. One could instead model the data as adversarial, but this is often too pessimistic in practice. Motivated by this consideration, we study an online fair allocation setting with nonstationary item arrivals. To address this setting, we first develop new online learning results for the dual averaging algorithm under nonstationary input models. We show that the dual averaging iterates converge in mean square to both the underlying optimal solution of the "true" stochastic optimization problem as well as the "hindsight" optimal solution of the finite-sum problem given by the sample path. Our results apply to several nonstationary input models: adversarial corruption, ergodic input, and block-independent (including periodic) input. Here, the bound on the mean square error depends on a nonstationarity measure of the input. We recover the classical bound when the input data is i.i.d. We then show that our dual averaging results imply that the PACE algorithm for online fair allocation simultaneously achieves "best of both worlds" guarantees against any of these input models. Finally, we conduct numerical experiments which show strong empirical performance against nonstationary inputs.

Submitted 17 October, 2022; v1 submitted 23 February, 2022; originally announced February 2022.

Comments: NeurIPS 2022

arXiv:2202.10939 [pdf, other]

Single-Leg Revenue Management with Advice

Authors: Santiago Balseiro, Christian Kroer, Rachitesh Kumar

Abstract: Single-leg revenue management is a foundational problem of revenue management that has been particularly impactful in the airline and hotel industry: Given $n$ units of a resource, e.g. flight seats, and a stream of sequentially-arriving customers segmented by fares, what is the optimal online policy for allocating the resource? Previous work focused on designing algorithms when forecasts are available, which are not robust to inaccuracies in the forecast, or online algorithms with worst-case performance guarantees, which can be too conservative in practice. In this work, we look at the single-leg revenue management problem through the lens of the algorithms-with-advice framework, which attempts to harness the increasing prediction accuracy of machine learning methods by optimally incorporating advice about the future into online algorithms. In particular, we characterize the Pareto frontier that captures the tradeoff between consistency (performance when advice is accurate) and competitiveness (performance when advice is inaccurate) for every advice. Moreover, we provide an online algorithm that always achieves performance on this Pareto frontier. We also study the class of protection level policies, which is the most widely-deployed technique for single-leg revenue management: we provide an algorithm to incorporate advice into protection levels that optimally trades off consistency and competitiveness. Moreover, we empirically evaluate the performance of these algorithms on synthetic data. We find that our algorithm for protection level policies performs remarkably well on most instances, even if it is not guaranteed to be on the Pareto frontier in theory. Our results extend to other unit-cost online allocation problems such as the display advertising and multiple secretary problems, together with more general variable-cost problems such as the online knapsack problem.

Submitted 22 June, 2023; v1 submitted 17 February, 2022; originally announced February 2022.

arXiv:2202.05446 [pdf, other]

Faster No-Regret Learning Dynamics for Extensive-Form Correlated and Coarse Correlated Equilibria

Authors: Ioannis Anagnostides, Gabriele Farina, Christian Kroer, Andrea Celli, Tuomas Sandholm

Abstract: A recent emerging trend in the literature on learning in games has been concerned with providing faster learning dynamics for correlated and coarse correlated equilibria in normal-form games. Much less is known about the significantly more challenging setting of extensive-form games, which can capture both sequential and simultaneous moves, as well as imperfect information. In this paper we establish faster no-regret learning dynamics for \textit{extensive-form correlated equilibria (EFCE)} in multiplayer general-sum imperfect-information extensive-form games. When all players follow our accelerated dynamics, the correlated distribution of play is an $O(T^{-3/4})$-approximate EFCE, where the $O(\cdot)$ notation suppresses parameters polynomial in the description of the game. This significantly improves over the best prior rate of $O(T^{-1/2})$. To achieve this, we develop a framework for performing accelerated \emph{Phi-regret minimization} via predictions. One of our key technical contributions -- that enables us to employ our generic template -- is to characterize the stability of fixed points associated with \emph{trigger deviation functions} through a refined perturbation analysis of a structured Markov chain. Furthermore, for the simpler solution concept of extensive-form \emph{coarse} correlated equilibrium (EFCCE) we give a new succinct closed-form characterization of the associated fixed points, bypassing the expensive computation of stationary distributions required for EFCE. Our results place EFCCE closer to \emph{normal-form coarse correlated equilibria} in terms of the per-iteration complexity, although the former prescribes a much more compelling notion of correlation. Finally, experiments conducted on standard benchmarks corroborate our theoretical findings.

Submitted 10 February, 2022; originally announced February 2022.

Comments: Preliminary parts of this paper will appear at the AAAI-22 Workshop on Reinforcement Learning in Games. This version also contains results from an earlier preprint published by a subset of the authors (arXiv:2109.08138)

arXiv:2202.05194 [pdf, other]

Robust and fair work allocation

Authors: Amine Allouah, Christian Kroer, Xuan Zhang, Vashist Avadhanula, Anil Dania, Caner Gocmen, Sergey Pupyrev, Parikshit Shah, Nicolas Stier

Abstract: In today's digital world, interaction with online platforms is ubiquitous, and thus content moderation is important for protecting users from content that does not comply with pre-established community guidelines. Having a robust content moderation system throughout every stage of planning is particularly important. We study the short-term planning problem of allocating human content reviewers to different harmful content categories. We use tools from fair division and study the application of competitive equilibrium and leximin allocation rules. Furthermore, we incorporate into the traditional Fisher market setup novel aspects that are of practical importance. The first aspect is the forecasted workload of different content categories. We show how a formulation that is inspired by the celebrated Eisenberg-Gale program allows us to find an allocation that not only satisfies the forecasted workload, but also fairly allocates the remaining reviewing hours among all content categories. The resulting allocation is also robust as the additional allocation provides a guardrail in cases where the actual workload deviates from the predicted workload. The second practical consideration is time-dependent allocation that is motivated by the fact that partners need scheduling guidance for the reviewers across days to achieve efficiency. To address the time component, we introduce new extensions of the various fair allocation approaches for the single-time period setting, and we show that many properties extend in essence, albeit with some modifications. Related to the time component, we additionally investigate how to satisfy markets' desire for smooth allocation (e.g., partners for content reviewers prefer an allocation that does not vary much from time to time, to minimize staffing switches). We demonstrate the performance of our proposed approaches through real-world data obtained from Meta.

Submitted 14 February, 2022; v1 submitted 10 February, 2022; originally announced February 2022.

arXiv:2202.00237 [pdf, other]

Kernelized Multiplicative Weights for 0/1-Polyhedral Games: Bridging the Gap Between Learning in Extensive-Form and Normal-Form Games

Authors: Gabriele Farina, Chung-Wei Lee, Haipeng Luo, Christian Kroer

Abstract: While extensive-form games (EFGs) can be converted into normal-form games (NFGs), doing so comes at the cost of an exponential blowup of the strategy space. So, progress on NFGs and EFGs has historically followed separate tracks, with the EFG community often having to catch up with advances (e.g., last-iterate convergence and predictive regret bounds) from the larger NFG community. In this paper we show that the Optimistic Multiplicative Weights Update (OMWU) algorithm -- the premier learning algorithm for NFGs -- can be simulated on the normal-form equivalent of an EFG in linear time per iteration in the game tree size using a kernel trick. The resulting algorithm, Kernelized OMWU (KOMWU), applies more broadly to all convex games whose strategy space is a polytope with 0/1 integral vertices, as long as the kernel can be evaluated efficiently. In the particular case of EFGs, KOMWU closes several standing gaps between NFG and EFG learning, by enabling direct, black-box transfer to EFGs of desirable properties of learning dynamics that were so far known to be achievable only in NFGs. Specifically, KOMWU gives the first algorithm that guarantees at the same time last-iterate convergence, lower dependence on the size of the game tree than all prior algorithms, and $\tilde{\mathcal{O}}(1)$ regret when followed by all players.

Submitted 1 February, 2022; originally announced February 2022.

arXiv:2108.04862 [pdf, other]

Matching Algorithms for Blood Donation

Authors: Duncan C McElfresh, Christian Kroer, Sergey Pupyrev, Eric Sodomka, Karthik Sankararaman, Zack Chauvin, Neil Dexter, John P Dickerson

Abstract: Global demand for donated blood far exceeds supply, and unmet need is greatest in low- and middle-income countries; experts suggest that large-scale coordination is necessary to alleviate demand. Using the Facebook Blood Donation tool, we conduct the first large-scale algorithmic matching of blood donors with donation opportunities. While measuring actual donation rates remains a challenge, we measure donor action (e.g., making a donation appointment) as a proxy for actual donation. We develop automated policies for matching donors with donation opportunities, based on an online matching model. We provide theoretical guarantees for these policies, both regarding the number of expected donations and the equitable treatment of blood recipients. In simulations, a simple matching strategy increases the number of donations by 5-10%; a pilot experiment with real donors shows a 5% relative increase in donor action rate (from 3.7% to 3.9%). When scaled to the global Blood Donation tool user base, this corresponds to an increase of around one hundred thousand users taking action toward donation. Further, observing donor action on a social network can shed light onto donor behavior and response to incentives. Our initial findings align with several observations made in the medical and social science literature regarding donor behavior.

Submitted 13 August, 2021; v1 submitted 10 August, 2021; originally announced August 2021.

Comments: An early version of this paper appeared at EC'20. (https://doi.org/10.1145/3391403.3399458)

ACM Class: J.3; J.4

arXiv:2107.10923 [pdf, other]

Throttling Equilibria in Auction Markets

Authors: Xi Chen, Christian Kroer, Rachitesh Kumar

Abstract: Throttling is a popular method of budget management for online ad auctions in which the platform modulates the participation probability of an advertiser in order to smoothly spend her budget across many auctions. In this work, we investigate the setting in which all of the advertisers simultaneously employ throttling to manage their budgets, and we do so for both first-price and second-price auctions. We analyze the structural and computational properties of the resulting equilibria. For first-price auctions, we show that a unique equilibrium always exists, is well-behaved and can be computed efficiently via tatonnement-style decentralized dynamics. In contrast, for second-price auctions, we prove that even though an equilibrium always exists, the problem of finding an equilibrium is PPAD-complete, there can be multiple equilibria, and it is NP-hard to find the revenue maximizing one. We also compare the equilibrium outcomes of throttling to those of multiplicative pacing, which is the other most popular and well-studied method of budget management. Finally, we characterize the Price of Anarchy of these equilibria for liquid welfare by showing that it is at most 2 for both first-price and second-price auctions, and demonstrating that our bound is tight.

Submitted 3 February, 2023; v1 submitted 22 July, 2021; originally announced July 2021.

arXiv:2106.14326 [pdf, other]

Last-iterate Convergence in Extensive-Form Games

Authors: Chung-Wei Lee, Christian Kroer, Haipeng Luo

Abstract: Regret-based algorithms are highly efficient at finding approximate Nash equilibria in sequential games such as poker games. However, most regret-based algorithms, including counterfactual regret minimization (CFR) and its variants, rely on iterate averaging to achieve convergence. Inspired by recent advances on last-iterate convergence of optimistic algorithms in zero-sum normal-form games, we study this phenomenon in sequential games, and provide a comprehensive study of last-iterate convergence for zero-sum extensive-form games with perfect recall (EFGs), using various optimistic regret-minimization algorithms over treeplexes. This includes algorithms using the vanilla entropy or squared Euclidean norm regularizers, as well as their dilated versions which admit more efficient implementation. In contrast to CFR, we show that all of these algorithms enjoy last-iterate convergence, with some of them even converging exponentially fast. We also provide experiments to further support our theoretical results.

Submitted 27 October, 2021; v1 submitted 27 June, 2021; originally announced June 2021.

arXiv:2106.09503 [pdf, other]

The Parity Ray Regularizer for Pacing in Auction Markets

Authors: Andrea Celli, Riccardo Colini-Baldeschi, Christian Kroer, Eric Sodomka

Abstract: Budget-management systems are one of the key components of modern auction markets. Internet advertising platforms typically offer advertisers the possibility to pace the rate at which their budget is depleted, through budget-pacing mechanisms. We focus on multiplicative pacing mechanisms in an online setting in which a bidder is repeatedly confronted with a series of advertising opportunities. After collecting bids, each item is then allocated through a single-item, second-price auction. If there were no budgetary constraints, bidding truthfully would be an optimal choice for the advertiser. However, since their budget is limited, the advertiser may want to shade their bid downwards in order to preserve their budget for future opportunities, and to spread expenditures evenly over time. The literature on online pacing problems mostly focuses on the setting in which the bidder optimizes an additive separable objective, such as the total click-through rate or the revenue of the allocation. In many settings, however, bidders may also care about other objectives which oftentimes are non-separable, and therefore not amenable to traditional online learning techniques. Building on recent work, we study the frequent case in which advertisers seek to reach a certain distribution of impressions over a target population of users. We introduce a novel regularizer to achieve this desideratum, and show how to integrate it into an online mirror descent scheme attaining the optimal order of sub-linear regret compared to the optimal allocation in hindsight when inputs are drawn independently, from an unknown distribution. Moreover, we show that our approach can easily be incorporated in standard existing pacing systems that are not usually built for this objective. The effectiveness of our algorithm in internet advertising applications is confirmed by numerical experiments on real-world data.

Submitted 17 June, 2021; originally announced June 2021.

arXiv:2105.13203 [pdf, other]

Conic Blackwell Algorithm: Parameter-Free Convex-Concave Saddle-Point Solving

Authors: Julien Grand-Clément, Christian Kroer

Abstract: We develop new parameter-free and scale-free algorithms for solving convex-concave saddle-point problems. Our results are based on a new simple regret minimizer, the Conic Blackwell Algorithm$^+$ (CBA$^+$), which attains $O(1/\sqrt{T})$ average regret. Intuitively, our approach generalizes to other decision sets of interest ideas from the Counterfactual Regret minimization (CFR$^+$) algorithm, which has very strong practical performance for solving sequential games on simplexes. We show how to implement CBA$^+$ for the simplex, $\ell_{p}$ norm balls, and ellipsoidal confidence regions in the simplex, and we present numerical experiments for solving matrix games and distributionally robust optimization problems. Our empirical results show that CBA$^+$ is a simple algorithm that outperforms state-of-the-art methods on synthetic data and real data instances, without the need for any choice of step sizes or other algorithmic parameters.

Submitted 14 October, 2021; v1 submitted 27 May, 2021; originally announced May 2021.

arXiv:2105.12954 [pdf, other]

Better Regularization for Sequential Decision Spaces: Fast Convergence Rates for Nash, Correlated, and Team Equilibria

Authors: Gabriele Farina, Christian Kroer, Tuomas Sandholm

Abstract: We study the application of iterative first-order methods to the problem of computing equilibria of large-scale two-player extensive-form games. First-order methods must typically be instantiated with a regularizer that serves as a distance-generating function for the decision sets of the players. For the case of two-player zero-sum games, the state-of-the-art theoretical convergence rate for Nash equilibrium is achieved by using the dilated entropy function. In this paper, we introduce a new entropy-based distance-generating function for two-player zero-sum games, and show that this function achieves significantly better strong convexity properties than the dilated entropy, while maintaining the same easily-implemented closed-form proximal mapping. Extensive numerical simulations show that these superior theoretical properties translate into better numerical performance as well. We then generalize our new entropy distance function, as well as general dilated distance functions, to the scaled extension operator. The scaled extension operator is a way to recursively construct convex sets, which generalizes the decision polytope of extensive-form games, as well as the convex polytopes corresponding to correlated and team equilibria. By instantiating first-order methods with our regularizers, we develop the first accelerated first-order methods for computing correlated equilibria and ex-ante coordinated team equilibria. Our methods have a guaranteed $1/T$ rate of convergence, along with linear-time proximal updates.

Submitted 12 October, 2021; v1 submitted 27 May, 2021; originally announced May 2021.

Comments: Extended version of the EC21 conference version

arXiv:2103.13969 [pdf, other]

The Complexity of Pacing for Second-Price Auctions

Authors: Xi Chen, Christian Kroer, Rachitesh Kumar

Abstract: Budget constraints are ubiquitous in online advertisement auctions. To manage these constraints and smooth out the expenditure across auctions, the bidders (or the platform on behalf of them) often employ pacing: each bidder is assigned a pacing multiplier between zero and one, and her bid on each item is multiplicatively scaled down by the pacing multiplier. This naturally gives rise to a game in which each bidder strategically selects a multiplier. The appropriate notion of equilibrium in this game is known as a pacing equilibrium. In this work, we show that the problem of finding an approximate pacing equilibrium is PPAD-complete for second-price auctions. This resolves an open question of Conitzer et al. [2021]. As a consequence of our hardness result, we show that the tatonnement-style budget-management dynamics introduced by Borgs et al. [2007] are unlikely to converge efficiently for repeated second-price auctions. This disproves a conjecture by Borgs et al. [2007], under the assumption that the complexity class PPAD is not equal to P. Our hardness result also implies the existence of a refinement of supply-aware market equilibria which is hard to compute with simple linear utilities.

Submitted 3 February, 2023; v1 submitted 25 March, 2021; originally announced March 2021.
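
The pacing mechanism described in the abstract above (each bid is the bidder's value scaled by a multiplier in [0, 1], and each item is sold by a second-price auction) can be illustrated with a minimal sketch. It only evaluates the outcome for multipliers that are given; it does not search for a pacing equilibrium, which the abstract shows is PPAD-complete. The function name, the tie-breaking rule (lowest index wins), and the example numbers are assumptions made for illustration and do not come from the paper.

```python
def run_paced_second_price(values, multipliers):
    """Evaluate second-price auctions under fixed pacing multipliers.

    values[i][j] is bidder i's value for item j (hypothetical data layout);
    multipliers[i] is bidder i's pacing multiplier in [0, 1].
    Returns the winner of each item and each bidder's total spend.
    """
    n_bidders = len(values)
    n_items = len(values[0])
    spend = [0.0] * n_bidders
    winners = []
    for j in range(n_items):
        # Paced bid: value scaled down multiplicatively by the multiplier.
        bids = [multipliers[i] * values[i][j] for i in range(n_bidders)]
        # sorted() is stable, so ties go to the lowest bidder index (an assumption).
        order = sorted(range(n_bidders), key=lambda i: bids[i], reverse=True)
        winner = order[0]
        # Second-price rule: the winner pays the second-highest paced bid.
        price = bids[order[1]] if n_bidders > 1 else 0.0
        spend[winner] += price
        winners.append(winner)
    return winners, spend


values = [[10.0, 4.0], [6.0, 8.0]]
print(run_paced_second_price(values, [1.0, 1.0]))  # unpaced bidding
print(run_paced_second_price(values, [0.5, 1.0]))  # bidder 0 paced down
```

In this toy instance, lowering bidder 0's multiplier from 1.0 to 0.5 moves the first item to bidder 1 and changes both bidders' spend, which is the interdependence that makes multiplier selection a game.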

arXiv:2103.12936 [pdf, other]

Online Market Equilibrium with Application to Fair Division

Authors: Yuan Gao, Christian Kroer, Alex Peysakhovich

Abstract: Computing market equilibria is a problem of both theoretical and applied interest. Much research to date focuses on the case of static Fisher markets with full information on buyers' utility functions and item supplies. Motivated by real-world markets, we consider an online setting: individuals have linear, additive utility functions; items arrive sequentially and must be allocated and priced irrevocably. We define the notion of an online market equilibrium in such a market as time-indexed allocations and prices which guarantee buyer optimality and market clearance in hindsight. We propose a simple, scalable and interpretable allocation and pricing dynamics termed as PACE. When items are drawn i.i.d. from an unknown distribution (with a possibly continuous support), we show that PACE leads to an online market equilibrium asymptotically. In particular, PACE ensures that buyers' time-averaged utilities converge to the equilibrium utilities w.r.t. a static market with item supplies being the unknown distribution and that buyers' time-averaged expenditures converge to their per-period budget. Hence, many desirable properties of market equilibrium-based fair division such as no envy, Pareto optimality, and the proportional-share guarantee are also attained asymptotically in the online setting. Next, we extend the dynamics to handle quasilinear buyer utilities, which gives the first online algorithm for computing first-price pacing equilibria. Finally, numerical experiments on real and synthetic datasets show that the dynamics converges quickly under various metrics.

Submitted 2 October, 2021; v1 submitted 23 March, 2021; originally announced March 2021.

Comments: An earlier version was accepted at NeurIPS 2021

arXiv:2102.10476 [pdf, other]

Contextual Standard Auctions with Budgets: Revenue Equivalence and Efficiency Guarantees

Authors: Santiago Balseiro, Christian Kroer, Rachitesh Kumar

Abstract: The internet advertising market is a multi-billion dollar industry, in which advertisers buy thousands of ad placements every day by repeatedly participating in auctions. An important and ubiquitous feature of these auctions is the presence of campaign budgets, which specify the maximum amount the advertisers are willing to pay over a specified time period. In this paper, we present a new model to study the equilibrium bidding strategies in standard auctions, a large class of auctions that includes first- and second-price auctions, for advertisers who satisfy budget constraints on average. Our model dispenses with the common, yet unrealistic assumption that advertisers' values are independent and instead assumes a contextual model in which advertisers determine their values using a common feature vector. We show the existence of a natural value-pacing-based Bayes-Nash equilibrium under very mild assumptions. Furthermore, we prove a revenue equivalence showing that all standard auctions yield the same revenue even in the presence of budget constraints. Leveraging this equivalence, we prove Price of Anarchy bounds for liquid welfare and structural properties of pacing-based equilibria that hold for all standard auctions. In recent years, the internet advertising market has adopted first-price auctions as the preferred paradigm for selling advertising slots. Our work thus takes an important step toward understanding the implications of the shift to first-price auctions in internet advertising markets by studying how the choice of the selling mechanism impacts revenues, welfare, and advertisers' bidding strategies.

Submitted 9 October, 2022; v1 submitted 20 February, 2021; originally announced February 2021.

arXiv:2010.03025 [pdf, other]

Infinite-Dimensional Fisher Markets and Tractable Fair Division

Authors: Yuan Gao, Christian Kroer

Abstract: Linear Fisher markets are a fundamental economic model with applications in fair division as well as large-scale Internet markets. In the finite-dimensional case of $n$ buyers and $m$ items, a market equilibrium can be computed using the Eisenberg-Gale convex program. Motivated by large-scale Internet advertising and fair division applications, this paper considers a generalization of a linear Fisher market where there is a finite set of buyers and a continuum of items. We introduce generalizations of the Eisenberg-Gale convex program and its dual to this infinite-dimensional setting, which leads to Banach-space optimization problems. We establish existence of optimal solutions, strong duality, as well as necessity and sufficiency of KKT-type conditions. All these properties are established via non-standard arguments, which circumvent the limitations of duality theory in optimization over infinite-dimensional Banach spaces. Furthermore, we show that there exists a pure equilibrium allocation, i.e., a division of the item space. When the item space is a closed interval and buyers have piecewise linear valuations, we show that the Eisenberg-Gale-type convex program over the infinite-dimensional allocations can be reformulated as a finite-dimensional convex conic program, which can be solved efficiently using off-the-shelf optimization software based on primal-dual interior-point methods. Based on our convex conic reformulation, we develop the first polynomial-time cake-cutting algorithm that achieves Pareto optimality, envy-freeness, and proportionality. For general buyer valuations or a very large number of buyers, we propose computing market equilibrium using stochastic dual averaging, which finds approximate equilibrium prices with high probability. Finally, we discuss how the above results easily extend to the case of quasilinear utilities.

Submitted 5 April, 2021; v1 submitted 6 October, 2020; originally announced October 2020.

Comments: Submitted to Operations Research. Revised and reorganized. Added extensions to quasilinear utilities

arXiv:2009.06790 [pdf, other]

First-Order Methods for Wasserstein Distributionally Robust MDP

Authors: Julien Grand-Clément, Christian Kroer

Abstract: Markov decision processes (MDPs) are known to be sensitive to parameter specification. Distributionally robust MDPs alleviate this issue by allowing for \emph{ambiguity sets} which give a set of possible distributions over parameter sets. The goal is to find an optimal policy with respect to the worst-case parameter distribution. We propose a framework for solving distributionally robust MDPs via first-order methods, and instantiate it for several types of Wasserstein ambiguity sets. By developing efficient proximal updates, our algorithms achieve a convergence rate of $O\left(NA^{2.5}S^{3.5}\log(S)\log(ε^{-1})ε^{-1.5} \right)$ for the number of kernels $N$ in the support of the nominal distribution, states $S$, and actions $A$; this rate varies slightly based on the Wasserstein setup. Our dependence on $N,A$ and $S$ is significantly better than existing methods, which have a complexity of $O\left(N^{3.5}A^{3.5}S^{4.5}\log^{2}(ε^{-1}) \right)$. Numerical experiments show that our algorithm is significantly more scalable than state-of-the-art approaches across several domains.

Submitted 3 May, 2021; v1 submitted 14 September, 2020; originally announced September 2020.

arXiv:2007.14358 [pdf, other]

Faster Game Solving via Predictive Blackwell Approachability: Connecting Regret Matching and Mirror Descent

Authors: Gabriele Farina, Christian Kroer, Tuomas Sandholm

Abstract: Blackwell approachability is a framework for reasoning about repeated games with vector-valued payoffs. We introduce predictive Blackwell approachability, where an estimate of the next payoff vector is given, and the decision maker tries to achieve better performance based on the accuracy of that estimator. In order to derive algorithms that achieve predictive Blackwell approachability, we start by showing a powerful connection between four well-known algorithms. Follow-the-regularized-leader (FTRL) and online mirror descent (OMD) are the most prevalent regret minimizers in online convex optimization. In spite of this prevalence, the regret matching (RM) and regret matching+ (RM+) algorithms have been preferred in the practice of solving large-scale games (as the local regret minimizers within the counterfactual regret minimization framework). We show that RM and RM+ are the algorithms that result from running FTRL and OMD, respectively, to select the halfspace to force at all times in the underlying Blackwell approachability game. By applying the predictive variants of FTRL or OMD to this connection, we obtain predictive Blackwell approachability algorithms, as well as predictive variants of RM and RM+. In experiments across 18 common zero-sum extensive-form benchmark games, we show that predictive RM+ coupled with counterfactual regret minimization converges vastly faster than the fastest prior algorithms (CFR+, DCFR, LCFR) across all games but two of the poker games, sometimes by two or more orders of magnitude.

Submitted 7 March, 2021; v1 submitted 28 July, 2020; originally announced July 2020.

Comments: Full version. The body of the paper appeared in the proceedings of the AAAI 2021 conference
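
For readers unfamiliar with the regret matching+ (RM+) update named in the abstract above, the following is a minimal sketch of plain, non-predictive RM+ run in self-play on a small zero-sum matrix game. It is not the paper's predictive variant and does not involve counterfactual regret minimization; the function names (`rm_plus_strategy`, `rm_plus_update`, `solve_matrix_game`, `duality_gap`) and the 2x2 payoff matrix are illustrative assumptions, not taken from the paper.

```python
# Plain RM+ in self-play on a zero-sum matrix game (illustrative sketch only).
# The row player maximizes x^T A y; the column player minimizes it.

def rm_plus_strategy(q):
    """Play proportionally to the nonnegative regret vector q (uniform if all zero)."""
    total = sum(q)
    if total <= 0.0:
        return [1.0 / len(q)] * len(q)
    return [v / total for v in q]

def rm_plus_update(q, strategy, utility):
    """Add instantaneous regrets to q, then clip at zero (the '+' in RM+)."""
    expected = sum(p * u for p, u in zip(strategy, utility))
    return [max(qi + ui - expected, 0.0) for qi, ui in zip(q, utility)]

def solve_matrix_game(payoff, iterations):
    """Return uniformly averaged strategies (x_bar, y_bar) after self-play."""
    n_rows, n_cols = len(payoff), len(payoff[0])
    q_row, q_col = [0.0] * n_rows, [0.0] * n_cols
    sum_row, sum_col = [0.0] * n_rows, [0.0] * n_cols
    for _ in range(iterations):
        x = rm_plus_strategy(q_row)
        y = rm_plus_strategy(q_col)
        # Utility of each pure action against the opponent's current strategy.
        u_row = [sum(payoff[i][j] * y[j] for j in range(n_cols)) for i in range(n_rows)]
        u_col = [-sum(payoff[i][j] * x[i] for i in range(n_rows)) for j in range(n_cols)]
        q_row = rm_plus_update(q_row, x, u_row)
        q_col = rm_plus_update(q_col, y, u_col)
        sum_row = [s + p for s, p in zip(sum_row, x)]
        sum_col = [s + p for s, p in zip(sum_col, y)]
    return ([s / iterations for s in sum_row], [s / iterations for s in sum_col])

def duality_gap(payoff, x, y):
    """Best-response value for the row player minus that for the column player."""
    n_rows, n_cols = len(payoff), len(payoff[0])
    best_row = max(sum(payoff[i][j] * y[j] for j in range(n_cols)) for i in range(n_rows))
    best_col = min(sum(payoff[i][j] * x[i] for i in range(n_rows)) for j in range(n_cols))
    return best_row - best_col

# Hypothetical 2x2 game chosen only for illustration.
example_payoff = [[2.0, -1.0], [-1.0, 1.0]]
x_bar, y_bar = solve_matrix_game(example_payoff, 20000)
print(x_bar, y_bar, duality_gap(example_payoff, x_bar, y_bar))
```

The duality gap of the averaged strategies shrinks toward zero as the number of iterations grows, which is the sense in which regret minimizers such as RM+ solve zero-sum games.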

arXiv:2007.11961 [pdf, other]

Dominant Resource Fairness with Meta-Types

Authors: Steven Yin, Shatian Wang, Lingyi Zhang, Christian Kroer

Abstract: Inspired by the recent COVID-19 pandemic, we study a generalization of the multi-resource allocation problem with heterogeneous demands and Leontief utilities. Unlike existing settings, we allow each agent to specify requirements to only accept allocations from a subset of the total supply for each resource. These requirements can take the form of location constraints (e.g., a hospital can only accept volunteers who live nearby due to commute limitations). They can also model a type of substitution effect where some agents need 1 unit of resource A \emph{or} B, both belonging to the same meta-type, while some agents specifically want A and others specifically want B. We propose a new mechanism called Dominant Resource Fairness with Meta Types, which determines the allocations by solving a small number of linear programs. The proposed method satisfies Pareto optimality, envy-freeness, strategy-proofness, and a notion of sharing incentive for our setting. To the best of our knowledge, we are the first to study this problem formulation, which improves upon existing work by capturing more constraints that often arise in real-life situations. Finally, we show numerically that our method scales better to large problems than alternative approaches.

Submitted 12 August, 2021; v1 submitted 21 July, 2020; originally announced July 2020.

arXiv:2006.09538 [pdf, other]

Evaluating and Rewarding Teamwork Using Cooperative Game Abstractions

Authors: Tom Yan, Christian Kroer, Alexander Peysakhovich

Abstract: Can we predict how well a team of individuals will perform together? How should individuals be rewarded for their contributions to the team performance? Cooperative game theory gives us a powerful set of tools for answering these questions: the Characteristic Function (CF) and solution concepts like the Shapley Value (SV). There are two major difficulties in applying these techniques to real world problems: first, the CF is rarely given to us and needs to be learned from data. Second, the SV is combinatorial in nature. We introduce a parametric model called cooperative game abstractions (CGAs) for estimating CFs from data. CGAs are easy to learn, readily interpretable, and crucially allow linear-time computation of the SV. We provide identification results and sample complexity bounds for CGA models as well as error bounds in the estimation of the SV using CGAs. We apply our methods to study teams of artificial RL agents as well as real world teams from professional sports.

Submitted 16 June, 2020; originally announced June 2020.
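
To illustrate why the abstract above calls the Shapley Value combinatorial, the following is a minimal brute-force sketch that averages each player's marginal contribution over all orderings of the players. It is the textbook definition, not the paper's CGA-based linear-time method; the names `shapley_values` and `glove_game` and the three-player example are assumptions introduced here for illustration.

```python
# Brute-force Shapley values from a characteristic function (illustrative sketch).
# The cost grows as n! in the number of players, which is the combinatorial
# difficulty the abstract refers to.
from itertools import permutations

def shapley_values(n_players, characteristic_fn):
    """Average marginal contribution of each player over all n! orderings."""
    totals = [0.0] * n_players
    n_orderings = 0
    for ordering in permutations(range(n_players)):
        coalition = frozenset()
        previous_value = characteristic_fn(coalition)
        for player in ordering:
            coalition = coalition | {player}
            value = characteristic_fn(coalition)
            totals[player] += value - previous_value
            previous_value = value
        n_orderings += 1
    return [t / n_orderings for t in totals]

def glove_game(coalition):
    """Hypothetical example: player 0 holds a left glove, players 1 and 2 hold
    right gloves; a coalition is worth 1 if it can form a matching pair."""
    has_left = 0 in coalition
    has_right = 1 in coalition or 2 in coalition
    return 1.0 if has_left and has_right else 0.0

print(shapley_values(3, glove_game))
```

In this example the left-glove holder receives 2/3 and each right-glove holder receives 1/6, and the values sum to the worth of the full team.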

Showing 1–50 of 74 results for author: Kroer, C