Showing 1–4 of 4 results for author: Izzo, Z

Searching in archive math.
  1. arXiv:2210.07513  [pdf, ps, other]

    math.OC cs.LG math.NA stat.ML

    Continuous-in-time Limit for Bayesian Bandits

    Authors: Yuhua Zhu, Zachary Izzo, Lexing Ying

    Abstract: This paper revisits the bandit problem in the Bayesian setting. The Bayesian approach formulates the bandit problem as an optimization problem, and the goal is to find the optimal policy which minimizes the Bayesian regret. One of the main challenges facing the Bayesian approach is that computation of the optimal policy is often intractable, especially when the length of the problem horizon or the…

    Submitted 29 September, 2023; v1 submitted 14 October, 2022; originally announced October 2022.

  2. arXiv:2209.08745  [pdf, other]

    cs.LG cs.AI math.ST stat.ML

    Importance Tempering: Group Robustness for Overparameterized Models

    Authors: Yiping Lu, Wenlong Ji, Zachary Izzo, Lexing Ying

    Abstract: Although overparameterized models have shown their success on many machine learning tasks, the accuracy could drop on the testing distribution that is different from the training one. This accuracy drop still limits applying machine learning in the wild. At the same time, importance weighting, a traditional technique to handle distribution shifts, has been demonstrated to have less or even no effe…

    Submitted 27 September, 2022; v1 submitted 18 September, 2022; originally announced September 2022.

  3. arXiv:2110.08991  [pdf, other]

    cs.DS cs.LG math.PR

    Dimensionality Reduction for Wasserstein Barycenter

    Authors: Zachary Izzo, Sandeep Silwal, Samson Zhou

    Abstract: The Wasserstein barycenter is a geometric construct which captures the notion of centrality among probability distributions, and which has found many applications in machine learning. However, most algorithms for finding even an approximate barycenter suffer an exponential dependence on the dimension $d$ of the underlying space of the distributions. In order to cope with this "curse of dimensional…

    Submitted 18 October, 2021; v1 submitted 17 October, 2021; originally announced October 2021.

    Comments: Published as a conference paper in NeurIPS 2021

  4. arXiv:2006.06173  [pdf, other]

    math.OC cs.LG stat.ML

    Borrowing From the Future: Addressing Double Sampling in Model-free Control

    Authors: Yuhua Zhu, Zach Izzo, Lexing Ying

    Abstract: In model-free reinforcement learning, the temporal difference method and its variants become unstable when combined with nonlinear function approximations. Bellman residual minimization with stochastic gradient descent (SGD) is more stable, but it suffers from the double sampling problem: given the current state, two independent samples for the next state are required, but often only one sample is…

    Submitted 10 June, 2020; originally announced June 2020.