Stackelberg POMDP: A Reinforcement Learning Approach for Economic Design

Authors: Gianluca Brero, Alon Eden, Darshan Chakrabarti, Matthias Gerstgrasser, Amy Greenwald, Vincent Li, David C. Parkes

Abstract: We introduce a reinforcement learning framework for economic design where the interaction between the environment designer and the participants is modeled as a Stackelberg game. In this game, the designer (leader) sets up the rules of the economic system, while the participants (followers) respond strategically. We integrate algorithms for determining followers' response strategies into the leader's learning environment, providing a formulation of the leader's learning problem as a POMDP that we call the Stackelberg POMDP. We prove that the optimal leader's strategy in the Stackelberg game is the optimal policy in our Stackelberg POMDP under a limited set of possible policies, establishing a connection between solving POMDPs and Stackelberg games. We solve our POMDP under a limited set of policy options via the centralized training with decentralized execution framework. For the specific case of followers that are modeled as no-regret learners, we solve an array of increasingly complex settings, including problems of indirect mechanism design where there is turn-taking and limited communication by agents. We demonstrate the effectiveness of our training framework through ablation studies. We also give convergence results for no-regret learners to a Bayesian version of a coarse-correlated equilibrium, extending known results to the case of correlated types.

Submitted 9 November, 2023; v1 submitted 7 October, 2022; originally announced October 2022.

arXiv:2202.07106

Learning to Mitigate AI Collusion on Economic Platforms

Authors: Gianluca Brero, Nicolas Lepore, Eric Mibuari, David C. Parkes

Abstract: Algorithmic pricing on online e-commerce platforms raises the concern of tacit collusion, where reinforcement learning algorithms learn to set collusive prices in a decentralized manner and through nothing more than profit feedback. This raises the question as to whether collusive pricing can be prevented through the design of suitable "buy boxes," i.e., through the design of the rules that govern the elements of e-commerce sites that promote particular products and prices to consumers. In this paper, we demonstrate that reinforcement learning (RL) can also be used by platforms to learn buy box rules that are effective in preventing collusion by RL sellers. For this, we adopt the methodology of Stackelberg POMDPs, and demonstrate success in learning robust rules that continue to provide high consumer welfare together with sellers employing different behavior models or having out-of-distribution costs for goods.

Submitted 11 June, 2022; v1 submitted 14 February, 2022; originally announced February 2022.

arXiv:2010.01180

Reinforcement Learning of Sequential Price Mechanisms

Authors: Gianluca Brero, Alon Eden, Matthias Gerstgrasser, David C. Parkes, Duncan Rheingans-Yoo

Abstract: We introduce the use of reinforcement learning for indirect mechanisms, working with the existing class of sequential price mechanisms, which generalizes both serial dictatorship and posted price mechanisms and essentially characterizes all strongly obviously strategyproof mechanisms. Learning an optimal mechanism within this class forms a partially-observable Markov decision process. We provide rigorous conditions for when this class of mechanisms is more powerful than simpler static mechanisms, for sufficiency or insufficiency of observation statistics for learning, and for the necessity of complex (deep) policies. We show that our approach can learn optimal or near-optimal mechanisms in several experimental settings.

Submitted 5 May, 2021; v1 submitted 2 October, 2020; originally announced October 2020.

arXiv:2009.13605

iMLCA: Machine Learning-powered Iterative Combinatorial Auctions with Interval Bidding

Authors: Benjamin Lubin, Sven Seuken, Manuel Beyeler, Gianluca Brero

Abstract: Preference elicitation is a major challenge in large combinatorial auctions because the bundle space grows exponentially in the number of items. Recent work has used machine learning (ML) algorithms to identify a small set of bundles to query from each bidder. However, a shortcoming of this prior work is that bidders must submit exact values for the queried bundles, which can be quite costly. To address this, we propose iMLCA, a new ML-powered iterative combinatorial auction with interval bidding (i.e., where bidders submit upper and lower bounds instead of exact values). To steer the auction towards an efficient allocation, we introduce a price-based activity rule, asking bidders to tighten bounds on relevant bundles only. In our experiments, iMLCA achieves the same allocative efficiency as the prior ML-based auction that uses exact bidding. Moreover, it outperforms the well-known combinatorial clock auction in a realistically-sized domain.

Submitted 29 August, 2021; v1 submitted 28 September, 2020; originally announced September 2020.

Comments: 30 pages, 3 figures

arXiv:1911.08042

Machine Learning-powered Iterative Combinatorial Auctions

Authors: Gianluca Brero, Benjamin Lubin, Sven Seuken

Abstract: We present a machine learning-powered iterative combinatorial auction (MLCA). The main goal of integrating machine learning (ML) into the auction is to improve preference elicitation, which is a major challenge in large combinatorial auctions (CAs). In contrast to prior work, our auction design uses value queries instead of prices to drive the auction. The ML algorithm is used to help the auction decide which value queries to ask in every iteration. While using ML inside a CA introduces new challenges, we demonstrate how we obtain a design that is individually rational, satisfies no-deficit, has good incentives, and is computationally practical. We benchmark our new auction against the well-known combinatorial clock auction (CCA). Our results indicate that, especially in large domains, MLCA can achieve significantly higher allocative efficiency than the CCA, even with only a small number of value queries.

Submitted 1 September, 2021; v1 submitted 18 November, 2019; originally announced November 2019.

arXiv:1809.05340

Fast Iterative Combinatorial Auctions via Bayesian Learning

Authors: Gianluca Brero, Sébastien Lahaie, Sven Seuken

Abstract: Iterative combinatorial auctions (CAs) are often used in multi-billion dollar domains like spectrum auctions, and speed of convergence is one of the crucial factors behind the choice of a specific design for practical applications. To achieve fast convergence, current CAs require careful tuning of the price update rule to balance convergence speed and allocative efficiency. Brero and Lahaie (2018) recently introduced a Bayesian iterative auction design for settings with single-minded bidders. The Bayesian approach allowed them to incorporate prior knowledge into the price update algorithm, reducing the number of rounds to convergence with minimal parameter tuning. In this paper, we generalize their work to settings with no restrictions on bidder valuations. We introduce a new Bayesian CA design for this general setting which uses Monte Carlo Expectation Maximization to update prices at each round of the auction. We evaluate our approach via simulations on CATS instances. Our results show that our Bayesian CA outperforms even a highly optimized benchmark in terms of clearing percentage and convergence speed.

Submitted 10 July, 2019; v1 submitted 14 September, 2018; originally announced September 2018.

Comments: 9 pages, 2 figures, AAAI-19

arXiv:1712.05291

A Bayesian Clearing Mechanism for Combinatorial Auctions

Authors: Gianluca Brero, Sébastien Lahaie

Abstract: We cast the problem of combinatorial auction design in a Bayesian framework in order to incorporate prior information into the auction process and minimize the number of rounds to convergence. We first develop a generative model of agent valuations and market prices such that clearing prices become maximum a posteriori estimates given observed agent valuations. This generative model then forms the basis of an auction process which alternates between refining estimates of agent valuations and computing candidate clearing prices. We provide an implementation of the auction using assumed density filtering to estimate valuations and expectation maximization to compute prices. An empirical evaluation over a range of valuation domains demonstrates that our Bayesian auction mechanism is highly competitive against the combinatorial clock auction in terms of rounds to convergence, even under the most favorable choices of price increment for this baseline.

Submitted 16 November, 2018; v1 submitted 14 December, 2017; originally announced December 2017.

Comments: 9 pages, 4 figures, AAAI-18

Showing 1–7 of 7 results for author: Brero, G