Learning from Aggregate responses: Instance Level versus Bag Level Loss Functions

Authors: Adel Javanmard, Lin Chen, Vahab Mirrokni, Ashwinkumar Badanidiyuru, Gang Fu

Abstract: Due to the rise of privacy concerns, in many practical applications the training data is aggregated before being shared with the learner, in order to protect privacy of users' sensitive responses. In an aggregate learning framework, the dataset is grouped into bags of samples, where each bag is available only with an aggregate response, providing a summary of individuals' responses in that bag. In this paper, we study two natural loss functions for learning from aggregate responses: bag-level loss and the instance-level loss. In the former, the model is learnt by minimizing a loss between aggregate responses and aggregate model predictions, while in the latter the model aims to fit individual predictions to the aggregate responses. In this work, we show that the instance-level loss can be perceived as a regularized form of the bag-level loss. This observation lets us compare the two approaches with respect to bias and variance of the resulting estimators, and introduce a novel interpolating estimator which combines the two approaches. For linear regression tasks, we provide a precise characterization of the risk of the interpolating estimator in an asymptotic regime where the size of the training set grows in proportion to the features dimension. Our analysis allows us to theoretically understand the effect of different factors, such as bag size on the model prediction risk. In addition, we propose a mechanism for differentially private learning from aggregate responses and derive the optimal bag size in terms of prediction risk-privacy trade-off. We also carry out thorough experiments to corroborate our theory and show the efficacy of the interpolating estimator.

Submitted 19 January, 2024; originally announced January 2024.

Comments: To appear in the Twelfth International Conference on Learning Representations (ICLR 2024)
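
The two losses named in the abstract above can be illustrated with a minimal sketch. It assumes a linear model, squared error, and bag means as the aggregate response; the function names, the toy data, and the convex-combination form of `interpolated_loss` (with a hypothetical weight `rho`) are illustrative assumptions, not the paper's definitions.

```python
from collections import defaultdict


def predict(theta, x):
    """Linear prediction <theta, x> (assumed model for this sketch)."""
    return sum(t * xi for t, xi in zip(theta, x))


def bag_level_loss(theta, X, bag_ids, bag_means):
    """Mean over bags of (mean prediction in bag - aggregate response)^2."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for x, b in zip(X, bag_ids):
        sums[b] += predict(theta, x)
        counts[b] += 1
    return sum((sums[b] / counts[b] - bag_means[b]) ** 2 for b in sums) / len(sums)


def instance_level_loss(theta, X, bag_ids, bag_means):
    """Mean over samples of (individual prediction - its bag's aggregate response)^2."""
    return sum(
        (predict(theta, x) - bag_means[b]) ** 2 for x, b in zip(X, bag_ids)
    ) / len(X)


def interpolated_loss(theta, X, bag_ids, bag_means, rho):
    """Hypothetical convex combination of the two losses; rho in [0, 1] is assumed."""
    return (1.0 - rho) * bag_level_loss(theta, X, bag_ids, bag_means) + rho * (
        instance_level_loss(theta, X, bag_ids, bag_means)
    )


# Toy data: four samples in two bags of size two, noise-free responses.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
bag_ids = [0, 0, 1, 1]
theta_true = [2.0, -1.0]
bag_means = [0.5, 2.0]  # per-bag means of the responses 2, -1 and 1, 3

print(bag_level_loss(theta_true, X, bag_ids, bag_means))
print(instance_level_loss(theta_true, X, bag_ids, bag_means))
```

For squared error with equal-size bags, the instance-level loss equals the bag-level loss plus the average within-bag variance of the predictions, which is one way to read the abstract's remark that the instance-level loss acts like a regularized bag-level loss.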

arXiv:2312.05659 [pdf, other]

Optimal Unbiased Randomizers for Regression with Label Differential Privacy

Authors: Ashwinkumar Badanidiyuru, Badih Ghazi, Pritish Kamath, Ravi Kumar, Ethan Leeman, Pasin Manurangsi, Avinash V Varadarajan, Chiyuan Zhang

Abstract: We propose a new family of label randomizers for training regression models under the constraint of label differential privacy (DP). In particular, we leverage the trade-offs between bias and variance to construct better label randomizers depending on a privately estimated prior distribution over the labels. We demonstrate that these randomizers achieve state-of-the-art privacy-utility trade-offs on several datasets, highlighting the importance of reducing bias when training neural networks with label DP. We also provide theoretical results shedding light on the structural properties of the optimal unbiased randomizers.

Submitted 9 December, 2023; originally announced December 2023.

Comments: Proceedings version to appear at NeurIPS 2023

arXiv:2309.13896 [pdf, other]

Follow-ups Also Matter: Improving Contextual Bandits via Post-serving Contexts

Authors: Chaoqi Wang, Ziyu Ye, Zhe Feng, Ashwinkumar Badanidiyuru, Haifeng Xu

Abstract: The standard contextual bandit problem assumes that all the relevant contexts are observed before the algorithm chooses an arm. This modeling paradigm, while useful, often falls short when dealing with problems in which valuable additional context can be observed after arm selection. For example, content recommendation platforms like YouTube, Instagram, and TikTok also observe valuable follow-up information pertinent to the user's reward after recommendation (e.g., how long the user stayed, what is the user's watch speed, etc.). To improve online learning efficiency in these applications, we study a novel contextual bandit problem with post-serving contexts and design a new algorithm, poLinUCB, that achieves tight regret under standard assumptions. Core to our technical proof is a robustified and generalized version of the well-known Elliptical Potential Lemma (EPL), which can accommodate noise in data. Such robustification is necessary for tackling our problem, and we believe it could also be of general interest. Extensive empirical tests on both synthetic and real-world datasets demonstrate the significant benefit of utilizing post-serving contexts as well as the superior performance of our algorithm over the state-of-the-art approaches.

Submitted 25 September, 2023; originally announced September 2023.

Comments: NeurIPS 2023 (Spotlight)

arXiv:2206.01293 [pdf, other]

Incrementality Bidding via Reinforcement Learning under Mixed and Delayed Rewards

Authors: Ashwinkumar Badanidiyuru, Zhe Feng, Tianxi Li, Haifeng Xu

Abstract: Incrementality, which is used to measure the causal effect of showing an ad to a potential customer (e.g. a user in an internet platform) versus not, is a central object for advertisers in online advertising platforms. This paper investigates the problem of how an advertiser can learn to optimize the bidding sequence in an online manner \emph{without} knowing the incrementality parameters in advance. We formulate the offline version of this problem as a specially structured episodic Markov Decision Process (MDP) and then, for its online learning counterpart, propose a novel reinforcement learning (RL) algorithm with regret at most $\widetilde{O}(H^2\sqrt{T})$, which depends on the number of rounds $H$ and number of episodes $T$, but does not depend on the number of actions (i.e., possible bids). A fundamental difference between our learning problem and standard RL problems is that the realized reward feedback from conversion incrementality is \emph{mixed} and \emph{delayed}. To handle this difficulty we propose and analyze a novel pairwise moment-matching algorithm to learn the conversion incrementality, which we believe is of independent interest.

Submitted 13 January, 2023; v1 submitted 2 June, 2022; originally announced June 2022.

arXiv:2109.04888 [pdf, ps, other]

Auctioning with Strategically Reticent Bidders

Authors: Jibang Wu, Ashwinkumar Badanidiyuru, Haifeng Xu

Abstract: We propose and study a novel mechanism design setup where each bidder holds two kinds of private information: (1) type variable, which can be misreported; (2) information variable, which the bidder may want to conceal or partially reveal, but importantly, not to misreport. We refer to bidders with such behaviors as strategically reticent bidders. Among others, one direct motivation of our model is the ad auction in which many ad platforms today elicit from each bidder not only their private value per conversion but also their private information about Internet users (e.g., user activities on the advertiser's websites) in order to improve the platform's estimation of conversion rates. We show that in this new setup, it is still possible to design mechanisms that are both Incentive and Information Compatible (IIC). We develop two different black-box transformations, which convert any mechanism $\mathcal{M}$ for classic bidders to a mechanism $\bar{\mathcal{M}}$ for strategically reticent bidders, based on either outcome of expectation or expectation of outcome, respectively. We identify properties of the original mechanism $\mathcal{M}$ under which the transformation leads to IIC mechanisms $\bar{\mathcal{M}}$. Interestingly, as corollaries of these results, we show that running VCG with bidders' expected values maximizes welfare, whereas the mechanism using expected outcome of Myerson's auction maximizes revenue. Finally, we study how regulation on the auctioneer's usage of information can lead to more robust mechanisms.

Submitted 29 January, 2023; v1 submitted 10 September, 2021; originally announced September 2021.

arXiv:2109.03173 [pdf, ps, other]

Learning to Bid in Contextual First Price Auctions

Authors: Ashwinkumar Badanidiyuru, Zhe Feng, Guru Guruganesh

Abstract: In this paper, we investigate the problem of how to bid in repeated contextual first price auctions. We consider a single bidder (learner) who repeatedly bids in the first price auctions: at each time $t$, the learner observes a context $x_t\in \mathbb{R}^d$ and decides the bid based on historical information and $x_t$. We assume a structured linear model of the maximum bid of all the others $m_t = α_0\cdot x_t + z_t$, where $α_0\in \mathbb{R}^d$ is unknown to the learner and $z_t$ is randomly sampled from a noise distribution $\mathcal{F}$ with log-concave density function $f$. We consider both \emph{binary feedback} (the learner can only observe whether she wins or not) and \emph{full information feedback} (the learner can observe $m_t$) at the end of each time $t$. For binary feedback, when the noise distribution $\mathcal{F}$ is known, we propose a bidding algorithm that uses the maximum likelihood estimation (MLE) method to achieve at most $\widetilde{O}(\sqrt{\log(d) T})$ regret. Moreover, we generalize this algorithm to the setting with binary feedback where the noise distribution is unknown but belongs to a parametrized family of distributions. For the full information feedback with \emph{unknown} noise distribution, we provide an algorithm that achieves regret at most $\widetilde{O}(\sqrt{dT})$. Our approach combines an estimator for log-concave density functions with the MLE method to learn the noise distribution $\mathcal{F}$ and linear weight $α_0$ simultaneously. We also provide a lower bound result such that any bidding policy in a broad class must achieve regret at least $Ω(\sqrt{T})$, even when the learner receives the full information feedback and $\mathcal{F}$ is known.

Submitted 10 November, 2021; v1 submitted 7 September, 2021; originally announced September 2021.

arXiv:2102.11050 [pdf, other]

Online Learning via Offline Greedy Algorithms: Applications in Market Design and Optimization

Authors: Rad Niazadeh, Negin Golrezaei, Joshua Wang, Fransisca Susan, Ashwinkumar Badanidiyuru

Abstract: Motivated by online decision-making in time-varying combinatorial environments, we study the problem of transforming offline algorithms to their online counterparts. We focus on offline combinatorial problems that are amenable to a constant factor approximation using a greedy algorithm that is robust to local errors. For such problems, we provide a general framework that efficiently transforms offline robust greedy algorithms to online ones using Blackwell approachability. We show that the resulting online algorithms have $O(\sqrt{T})$ (approximate) regret under the full information setting. We further introduce a bandit extension of Blackwell approachability that we call Bandit Blackwell approachability. We leverage this notion to transform greedy robust offline algorithms into online algorithms with $O(T^{2/3})$ (approximate) regret in the bandit setting. Demonstrating the flexibility of our framework, we apply our offline-to-online transformation to several problems at the intersection of revenue management, market design, and online optimization, including product ranking optimization in online platforms, reserve price optimization in auctions, and submodular maximization. We also extend our reduction to greedy-like first order methods used in continuous optimization, such as those used for maximizing continuous strong DR monotone submodular functions subject to convex constraints. We show that our transformation, when applied to these applications, leads to new regret bounds or improves the current known bounds. We complement our theoretical studies by conducting numerical simulations for two of our applications, in both of which we observe that the numerical performance of our transformations outperforms the theoretical guarantees in practical instances.

Submitted 3 February, 2023; v1 submitted 18 February, 2021; originally announced February 2021.

Comments: 87 pages, 2 figures. Management Science (2022)

arXiv:2101.02284 [pdf, other]

Handling many conversions per click in modeling delayed feedback

Authors: Ashwinkumar Badanidiyuru, Andrew Evdokimov, Vinodh Krishnan, Pan Li, Wynn Vonnegut, Jayden Wang

Abstract: Predicting the expected value or number of post-click conversions (purchases or other events) is a key task in performance-based digital advertising. In training a conversion optimizer model, one of the most crucial aspects is handling delayed feedback with respect to conversions, which can happen multiple times with varying delay. This task is difficult, as the delay distribution is different for each advertiser, is long-tailed, often does not follow any particular class of parametric distributions, and can change over time. We tackle these challenges using an unbiased estimation model based on three core ideas. The first idea is to split the label as a sum of labels with different delay buckets, each of which trains only on mature labels; the second is to use thermometer encoding to increase accuracy and reduce inference cost; and the third is to use auxiliary information to increase the stability of the model and to handle drift in the distribution.

Submitted 6 January, 2021; originally announced January 2021.

arXiv:2002.03523 [pdf, other]

Submodular Maximization Through Barrier Functions

Authors: Ashwinkumar Badanidiyuru, Amin Karbasi, Ehsan Kazemi, Jan Vondrak

Abstract: In this paper, we introduce a novel technique for constrained submodular maximization, inspired by barrier functions in continuous optimization. This connection not only improves the running time for constrained submodular maximization but also provides the state-of-the-art guarantee. More precisely, for maximizing a monotone submodular function subject to the combination of a $k$-matchoid and $\ell$-knapsack constraint (for $\ell\leq k$), we propose a potential function that can be approximately minimized. Once we minimize the potential function up to an $ε$ error, it is guaranteed that we have found a feasible set with a $2(k+1+ε)$-approximation factor, which can indeed be further improved to $(k+1+ε)$ by an enumeration technique. We extensively evaluate the performance of our proposed algorithm over several real-world applications, including a movie recommendation system, summarization tasks for YouTube videos, Twitter feeds and Yelp business locations, and a set cover problem.

Submitted 9 February, 2020; originally announced February 2020.

arXiv:1911.02056 [pdf, ps, other]

Response Prediction for Low-Regret Agents

Authors: Saeed Alaei, Ashwinkumar Badanidiyuru, Mohammad Mahdian, Sadra Yazdanbod

Abstract: Companies like Google and Microsoft run billions of auctions every day to sell advertising opportunities. Any change to the rules of these auctions can have a tremendous effect on the revenue of the company and the welfare of the advertisers and the users. Therefore, any change requires careful evaluation of its potential impacts. Currently, such impacts are often evaluated by running simulations or small controlled experiments. This, however, misses the important factor that the advertisers respond to changes. Our goal is to build a theoretical framework for predicting the actions of an agent (the advertiser) that is optimizing her actions in an uncertain environment. We model this problem using a variant of the multi-armed bandit setting where playing an arm is costly. The cost of each arm changes over time and is publicly observable. The value of playing an arm is drawn stochastically from a static distribution and is observed by the agent and not by us. We, however, observe the actions of the agent. Our main result is that assuming the agent is playing a strategy with a regret of at most $f(T)$ within the first $T$ rounds, we can learn to play the multi-armed bandits game (without observing the rewards) in such a way that the regret of our selected actions is at most $O(k^4(f(T)+1)\log(T))$, where $k$ is the number of arms.

Submitted 5 November, 2019; originally announced November 2019.

Journal ref: The 15th Conference on Web and Internet Economics, 2019

arXiv:1708.00611 [pdf, ps, other]

Targeting and Signaling in Ad Auctions

Authors: Ashwinkumar Badanidiyuru, Kshipra Bhawalkar, Haifeng Xu

Abstract: Modern ad auctions allow advertisers to target more specific segments of the user population. Unfortunately, this is not always in the best interest of the ad platform. In this paper, we examine the following basic question in the context of second-price ad auctions: how should an ad platform optimally reveal information about the ad opportunity to the advertisers in order to maximize revenue? We consider a model in which bidders' valuations depend on a random state of the ad opportunity. Different from previous work, we focus on a more practical, and challenging, situation where the space of possible realizations of ad opportunities is extremely large. We thus focus on developing algorithms whose running time is independent of the number of ad opportunity realizations. We examine the auctioneer's algorithmic question of designing the optimal signaling scheme. When the auctioneer is restricted to send a public signal to all bidders, we focus on a well-motivated Bayesian valuation setting in which the auctioneer and bidders both have private information, and present two main results: 1. we exhibit a characterization result regarding approximately optimal schemes and prove that any constant-approximate public signaling scheme must use exponentially many signals; 2. we present a "simple" public signaling scheme that serves as a constant approximation under mild assumptions. We then initiate an exploration on the power of being able to send different signals privately to different bidders. Here we examine a basic setting where the auctioneer knows bidders' valuations, and exhibit a polynomial-time private scheme that extracts almost full surplus even in the worst Bayes Nash equilibrium. This illustrates the surprising power of private signaling schemes in extracting revenue.

Submitted 14 July, 2019; v1 submitted 2 August, 2017; originally announced August 2017.

Comments: Appears at SODA 2018

arXiv:1507.02351 [pdf, ps, other]

Locally Adaptive Optimization: Adaptive Seeding for Monotone Submodular Functions

Authors: Ashwinkumar Badanidiyuru, Christos Papadimitriou, Aviad Rubinstein, Lior Seeman, Yaron Singer

Abstract: The Adaptive Seeding problem is an algorithmic challenge motivated by influence maximization in social networks: One seeks to select among certain accessible nodes in a network, and then select, adaptively, among neighbors of those nodes as they become accessible in order to maximize a global objective function. More generally, adaptive seeding is a stochastic optimization framework where the choices in the first stage affect the realizations in the second stage, over which we aim to optimize. Our main result is a $(1-1/e)^2$-approximation for the adaptive seeding problem for any monotone submodular function. While adaptive policies are often approximated via non-adaptive policies, our algorithm is based on a novel method we call \emph{locally-adaptive} policies. These policies combine a non-adaptive global structure, with local adaptive optimizations. This method enables the $(1-1/e)^2$-approximation for general monotone submodular functions and circumvents some of the impossibilities associated with non-adaptive policies. We also introduce a fundamental problem in submodular optimization that may be of independent interest: given a ground set of elements where every element appears with some small probability, find a set of expected size at most $k$ that has the highest expected value over the realization of the elements. We show a surprising result: there are classes of monotone submodular functions (including coverage) that can be approximated almost optimally as the probability vanishes. For general monotone submodular functions we show via a reduction from \textsc{Planted-Clique} that approximations for this problem are not likely to be obtainable. This optimization problem is an important tool for adaptive seeding via non-adaptive policies, and its hardness motivates the introduction of \emph{locally-adaptive} policies we use in the main result.

Submitted 8 July, 2015; originally announced July 2015.

arXiv:1409.7938 [pdf, ps, other]

Lazier Than Lazy Greedy

Authors: Baharan Mirzasoleiman, Ashwinkumar Badanidiyuru, Amin Karbasi, Jan Vondrak, Andreas Krause

Abstract: Is it possible to maximize a monotone submodular function faster than the widely used lazy greedy algorithm (also known as accelerated greedy), both in theory and practice? In this paper, we develop the first linear-time algorithm for maximizing a general monotone submodular function subject to a cardinality constraint. We show that our randomized algorithm, STOCHASTIC-GREEDY, can achieve a $(1-1/e-\varepsilon)$ approximation guarantee, in expectation, to the optimum solution in time linear in the size of the data and independent of the cardinality constraint. We empirically demonstrate the effectiveness of our algorithm on submodular functions arising in data summarization, including training large-scale kernel methods, exemplar-based clustering, and sensor placement. We observe that STOCHASTIC-GREEDY practically achieves the same utility value as lazy greedy but runs much faster. More surprisingly, we observe that in many practical scenarios STOCHASTIC-GREEDY does not evaluate the whole fraction of data points even once and still achieves indistinguishable results compared to lazy greedy.

Submitted 28 November, 2014; v1 submitted 28 September, 2014; originally announced September 2014.

Comments: In Proc. Conference on Artificial Intelligence (AAAI), 2015
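
The abstract above does not spell out how STOCHASTIC-GREEDY samples, so the following is only a rough sketch of one randomized greedy scheme in that spirit. It assumes that each of the $k$ rounds draws a random candidate subset of size about $(n/k)\log(1/\varepsilon)$ and adds the candidate with the largest marginal gain. The sample-size rule, the coverage objective, and all names (`coverage`, `stochastic_greedy`, `eps`, `seed`) are assumptions made for illustration.

```python
import math
import random


def coverage(sets, chosen):
    """Monotone submodular objective used for illustration: size of the union."""
    covered = set()
    for i in chosen:
        covered |= sets[i]
    return len(covered)


def stochastic_greedy(sets, k, eps=0.1, seed=0):
    """Pick up to k set indices; each round scans only a random sample of candidates."""
    n = len(sets)
    if k <= 0 or n == 0:
        return []
    rng = random.Random(seed)
    # Assumed sample size per round: ceil((n / k) * ln(1 / eps)), capped at n.
    sample_size = min(n, max(1, math.ceil((n / k) * math.log(1.0 / eps))))
    remaining = list(range(n))
    covered = set()
    chosen = []
    for _ in range(min(k, n)):
        candidates = rng.sample(remaining, min(sample_size, len(remaining)))
        # Marginal gain of a candidate = number of newly covered elements.
        best = max(candidates, key=lambda i: len(sets[i] - covered))
        chosen.append(best)
        covered |= sets[best]
        remaining.remove(best)
    return chosen


example_sets = [{1, 2, 3}, {3, 4}, {5}, {1, 2, 3, 4, 5, 6}, {6}]
picked = stochastic_greedy(example_sets, k=2, eps=0.01)
print(picked, coverage(example_sets, picked))
```

With a very small `eps` the sample covers every remaining candidate and the sketch reduces to plain greedy; larger `eps` trades solution quality for fewer objective evaluations per round.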

arXiv:1402.6779 [pdf, ps, other]

Resourceful Contextual Bandits

Authors: Ashwinkumar Badanidiyuru, John Langford, Aleksandrs Slivkins

Abstract: We study contextual bandits with ancillary constraints on resources, which are common in real-world applications such as choosing ads or dynamic pricing of items. We design the first algorithm for solving these problems that handles constrained resources other than time, and improves over a trivial reduction to the non-contextual case. We consider very general settings for both contextual bandits (arbitrary policy sets, e.g. Dudik et al. (UAI'11)) and bandits with resource constraints (bandits with knapsacks, Badanidiyuru et al. (FOCS'13)), and prove a regret guarantee with near-optimal statistical properties.

Submitted 31 July, 2015; v1 submitted 26 February, 2014; originally announced February 2014.

Comments: This is the full version of a paper in COLT 2014. Version history: (v2) Added some details to one of the proofs, (v3) a big revision following comments from COLT reviewers (but no new results), (v4) edits in related work, minor edits elsewhere. (v6) A correction for Theorem 3, corollary for contextual dynamic pricing with discretization; updated follow-up work & open questions

arXiv:1305.2545 [pdf, ps, other]

doi 10.1109/FOCS.2013.30

Bandits with Knapsacks

Authors: Ashwinkumar Badanidiyuru, Robert Kleinberg, Aleksandrs Slivkins

Abstract: Multi-armed bandit problems are the predominant theoretical model of exploration-exploitation tradeoffs in learning, and they have countless applications ranging from medical trials, to communication networks, to Web search and advertising. In many of these application domains the learner may be constrained by one or more supply (or budget) limits, in addition to the customary limitation on the time horizon. The literature lacks a general model encompassing these sorts of problems. We introduce such a model, called "bandits with knapsacks", that combines aspects of stochastic integer programming with online learning. A distinctive feature of our problem, in comparison to the existing regret-minimization literature, is that the optimal policy for a given latent distribution may significantly outperform the policy that plays the optimal fixed arm. Consequently, achieving sublinear regret in the bandits-with-knapsacks problem is significantly more challenging than in conventional bandit problems. We present two algorithms whose reward is close to the information-theoretic optimum: one is based on a novel "balanced exploration" paradigm, while the other is a primal-dual algorithm that uses multiplicative updates. Further, we prove that the regret achieved by both algorithms is optimal up to polylogarithmic factors. We illustrate the generality of the problem by presenting applications in a number of different domains including electronic commerce, routing, and scheduling. As one example of a concrete application, we consider the problem of dynamic posted pricing with limited supply and obtain the first algorithm whose regret, with respect to the optimal dynamic policy, is sublinear in the supply.

Submitted 5 September, 2017; v1 submitted 11 May, 2013; originally announced May 2013.

Comments: An extended abstract of this work has appeared in the 54th IEEE Symposium on Foundations of Computer Science (FOCS 2013). 55 pages. Compared to the initial "full version" from May'13, this version has a significantly revised presentation and reflects the current status of the follow-up work. Also, this version contains a stronger regret bound in one of the main results

arXiv:1112.0689 [pdf, ps, other]

Approximating Low-Dimensional Coverage Problems

Authors: Ashwinkumar Badanidiyuru, Robert Kleinberg, Hooyeon Lee

Abstract: We study the complexity of the maximum coverage problem, restricted to set systems of bounded VC-dimension. Our main result is a fixed-parameter tractable approximation scheme: an algorithm that outputs a $(1-\eps)$-approximation to the maximum-cardinality union of $k$ sets, in running time $O(f(\eps,k,d)\cdot poly(n))$ where $n$ is the problem size, $d$ is the VC-dimension of the set system, and $f(\eps,k,d)$ is exponential in $(kd/\eps)^c$ for some constant $c$. We complement this positive result by showing that the function $f(\eps,k,d)$ in the running-time bound cannot be replaced by a function depending only on $(\eps,d)$ or on $(k,d)$, under standard complexity assumptions. We also present an improved upper bound on the approximation ratio of the greedy algorithm in special cases of the problem, including when the sets have bounded cardinality and when they are two-dimensional halfspaces. Complementing these positive results, we show that when the sets are four-dimensional halfspaces neither the greedy algorithm nor local search is capable of improving the worst-case approximation ratio of $1-1/e$ that the greedy algorithm achieves on arbitrary instances of maximum coverage.

Submitted 3 December, 2011; originally announced December 2011.

arXiv:1107.2869 [pdf, ps, other]

Optimization with Demand Oracles

Authors: Ashwinkumar Badanidiyuru, Shahar Dobzinski, Sigal Oren

Abstract: We study \emph{combinatorial procurement auctions}, where a buyer with a valuation function $v$ and budget $B$ wishes to buy a set of items. Each item $i$ has a cost $c_i$ and the buyer is interested in a set $S$ that maximizes $v(S)$ subject to $Σ_{i\in S}c_i\leq B$. Special cases of combinatorial procurement auctions are classical problems from submodular optimization. In particular, when the costs are all equal (\emph{cardinality constraint}), a classic result by Nemhauser et al. shows that the greedy algorithm provides an $\frac e {e-1}$ approximation. Motivated by many papers that utilize demand queries to elicit the preferences of agents in economic settings, we develop algorithms that guarantee improved approximation ratios in the presence of demand oracles. We are able to break the $\frac e {e-1}$ barrier: we present algorithms that use only polynomially many demand queries and have approximation ratios of $\frac 9 8+ε$ for the general problem and $\frac 9 8$ for maximization subject to a cardinality constraint. We also consider the more general class of subadditive valuations. We present algorithms that obtain an approximation ratio of $2+ε$ for the general problem and 2 for maximization subject to a cardinality constraint. We guarantee these approximation ratios even when the valuations are non-monotone. We show that these ratios are essentially optimal, in the sense that for any constant $ε>0$, obtaining an approximation ratio of $2-ε$ requires exponentially many demand queries.

Submitted 14 July, 2011; originally announced July 2011.

Showing 1–17 of 17 results for author: Badanidiyuru, A