Showing 1–13 of 13 results for author: Simchi-Levi, D

Searching in archive math.
  1. arXiv:2402.11425  [pdf, other]

    stat.ME cs.LG math.OC math.PR

    Online Local False Discovery Rate Control: A Resource Allocation Approach

    Authors: Ruicheng Ao, Hongyu Chen, David Simchi-Levi, Feng Zhu

    Abstract: We consider the problem of sequentially conducting multiple experiments where each experiment corresponds to a hypothesis testing task. At each time point, the experimenter must make an irrevocable decision of whether to reject the null hypothesis (or equivalently claim a discovery) before the next experimental result arrives. The goal is to maximize the number of discoveries while maintaining a l…

    Submitted 1 April, 2024; v1 submitted 17 February, 2024; originally announced February 2024.

  2. arXiv:2304.04341  [pdf, ps, other]

    stat.ML cs.LG math.ST stat.ME

    Regret Distribution in Stochastic Bandits: Optimal Trade-off between Expectation and Tail Risk

    Authors: David Simchi-Levi, Zeyu Zheng, Feng Zhu

    Abstract: We study the trade-off between expectation and tail risk for regret distribution in the stochastic multi-armed bandit problem. We fully characterize the interplay among three desired properties for policy design: worst-case optimality, instance-dependent consistency, and light-tailed risk. We show how the order of expected regret exactly affects the decaying rate of the regret tail probability for…

    Submitted 9 April, 2023; originally announced April 2023.

  3. arXiv:2206.02969  [pdf, other]

    stat.ML cs.LG math.ST

    A Simple and Optimal Policy Design with Safety against Heavy-tailed Risk for Stochastic Bandits

    Authors: David Simchi-Levi, Zeyu Zheng, Feng Zhu

    Abstract: We study the stochastic multi-armed bandit problem and design new policies that enjoy both worst-case optimality for expected regret and light-tailed risk for regret distribution. Starting from the two-armed bandit setting with time horizon $T$, we propose a simple policy and prove that the policy (i) enjoys the worst-case optimality for the expected regret at order $O(\sqrt{T\ln T})$ and (ii) has…

    Submitted 14 November, 2022; v1 submitted 6 June, 2022; originally announced June 2022.

    Comments: Preliminary version appeared in NeurIPS 2022

  4. arXiv:2204.07856  [pdf, ps, other]

    math.ST math.FA stat.ML

    Optimal Learning Rates for Regularized Least-Squares with a Fourier Capacity Condition

    Authors: Prem Talwai, David Simchi-Levi

    Abstract: We derive minimax adaptive rates for a new, broad class of Tikhonov-regularized learning problems in Hilbert scales under general source conditions. Our analysis does not require the regression function to be contained in the hypothesis class, and most notably does not employ the conventional \textit{a priori} assumptions on kernel eigendecay. Using the theory of interpolation, we demonstrate that…

    Submitted 15 September, 2023; v1 submitted 16 April, 2022; originally announced April 2022.

    Comments: The first version of this paper contained an error -- refer to current version

  5. arXiv:2106.14813  [pdf, other]

    stat.ML cs.DM cs.LG math.OC

    Offline Planning and Online Learning under Recovering Rewards

    Authors: David Simchi-Levi, Zeyu Zheng, Feng Zhu

    Abstract: Motivated by emerging applications such as live-streaming e-commerce, promotions and recommendations, we introduce and solve a general class of non-stationary multi-armed bandit problems that have the following two features: (i) the decision maker can pull and collect rewards from up to $K\,(\ge 1)$ out of $N$ different arms in each time period; (ii) the expected reward of an arm immediately drops…

    Submitted 21 December, 2021; v1 submitted 28 June, 2021; originally announced June 2021.

    Comments: v1 accepted by ICML 2021

  6. arXiv:2010.03104  [pdf, other]

    cs.LG math.ST stat.ML

    Instance-Dependent Complexity of Contextual Bandits and Reinforcement Learning: A Disagreement-Based Perspective

    Authors: Dylan J. Foster, Alexander Rakhlin, David Simchi-Levi, Yunzong Xu

    Abstract: In the classical multi-armed bandit problem, instance-dependent algorithms attain improved performance on "easy" problems with a gap between the best and second-best arm. Are similar guarantees possible for contextual bandits? While positive results are known for certain special cases, there is no general theory characterizing when and how instance-dependent regret bounds for contextual bandits ca…

    Submitted 6 October, 2020; originally announced October 2020.

  7. arXiv:2005.00947  [pdf, other]

    cs.DS cs.LG math.OC stat.AP

    Online Learning and Optimization for Revenue Management Problems with Add-on Discounts

    Authors: David Simchi-Levi, Rui Sun, Huanan Zhang

    Abstract: We study in this paper a revenue management problem with add-on discounts. The problem is motivated by the practice in the video game industry, where a retailer offers discounts on selected supportive products (e.g. video games) to customers who have also purchased the core products (e.g. video game consoles). We formulate this problem as an optimization problem to determine the prices of differen…

    Submitted 2 May, 2020; originally announced May 2020.

  8. arXiv:2003.12699  [pdf, other]

    cs.LG math.ST stat.ML

    Bypassing the Monster: A Faster and Simpler Optimal Algorithm for Contextual Bandits under Realizability

    Authors: David Simchi-Levi, Yunzong Xu

    Abstract: We consider the general (stochastic) contextual bandit problem under the realizability assumption, i.e., the expected reward, as a function of contexts and actions, belongs to a general function class $\mathcal{F}$. We design a fast and simple algorithm that achieves the statistically optimal regret with only ${O}(\log T)$ calls to an offline regression oracle across all $T$ rounds. The number of…

    Submitted 10 July, 2021; v1 submitted 28 March, 2020; originally announced March 2020.

    Comments: Forthcoming in Mathematics of Operations Research

  9. arXiv:1911.01067  [pdf, other]

    cs.LG cs.GT math.OC stat.ML

    Blind Network Revenue Management and Bandits with Knapsacks under Limited Switches

    Authors: David Simchi-Levi, Yunzong Xu, Jinglong Zhao

    Abstract: Our work is motivated by a common business constraint in online markets. While firms respect the advantages of dynamic pricing and price experimentation, they must limit the number of price changes (i.e., switches) to be within some budget due to various practical reasons. We study both the classical price-based network revenue management problem in the distributionally-unknown setup, and the band…

    Submitted 1 December, 2020; v1 submitted 4 November, 2019; originally announced November 2019.

  10. arXiv:1908.09808  [pdf, other]

    cs.DS math.OC

    Multi-stage and Multi-customer Assortment Optimization with Inventory Constraints

    Authors: Elaheh Fata, Will Ma, David Simchi-Levi

    Abstract: We consider an assortment optimization problem where a customer chooses a single item from a sequence of sets shown to her, while limited inventories constrain the items offered to customers over time. In the special case where all of the assortments have size one, our problem captures the online stochastic matching with timeouts problem. For this problem, we derive a polynomial-time approximation…

    Submitted 26 July, 2020; v1 submitted 26 August, 2019; originally announced August 2019.

  11. arXiv:1905.10825  [pdf, ps, other]

    cs.LG cs.DS math.OC stat.ML

    Phase Transitions in Bandits with Switching Constraints

    Authors: David Simchi-Levi, Yunzong Xu

    Abstract: We consider the classical stochastic multi-armed bandit problem with a constraint that limits the total cost incurred by switching between actions to be no larger than a given switching budget. For this problem, we prove matching upper and lower bounds on the optimal (i.e., minimax) regret, and provide efficient rate-optimal algorithms. Surprisingly, the optimal regret of this problem exhibits a n…

    Submitted 18 March, 2021; v1 submitted 26 May, 2019; originally announced May 2019.

    Comments: An enhanced version. Many new results are obtained. The presentation is improved

  12. arXiv:1903.07844  [pdf, other]

    math.OC cs.LG

    Shrinking the Upper Confidence Bound: A Dynamic Product Selection Problem for Urban Warehouses

    Authors: Rong Jin, David Simchi-Levi, Li Wang, Xinshang Wang, Sen Yang

    Abstract: The recent rising popularity of ultra-fast delivery services on retail platforms fuels the increasing use of urban warehouses, whose proximity to customers makes fast deliveries viable. The space limit in urban warehouses poses a problem for the online retailers: the number of products (SKUs) they carry is no longer "the more, the better", yet it can still be significantly large, reaching hundreds…

    Submitted 2 May, 2019; v1 submitted 19 March, 2019; originally announced March 2019.

  13. arXiv:1901.02871  [pdf, other]

    math.OC cs.DS cs.LG stat.ML

    The Lingering of Gradients: Theory and Applications

    Authors: Zeyuan Allen-Zhu, David Simchi-Levi, Xinshang Wang

    Abstract: Classically, the time complexity of a first-order method is estimated by its number of gradient computations. In this paper, we study a more refined complexity by taking into account the `lingering' of gradients: once a gradient is computed at $x_k$, the additional time to compute gradients at $x_{k+1},x_{k+2},\dots$ may be reduced. We show how this improves the running time of several first-ord…

    Submitted 28 May, 2019; v1 submitted 9 January, 2019; originally announced January 2019.