Showing 1–10 of 10 results for author: Altschuler, J

Searching in archive stat.
  1. arXiv:2403.00278  [pdf, other]

    cs.LG cs.CR math.OC math.ST stat.ML

    Shifted Interpolation for Differential Privacy

    Authors: Jinho Bok, Weijie Su, Jason M. Altschuler

    Abstract: Noisy gradient descent and its variants are the predominant algorithms for differentially private machine learning. It is a fundamental question to quantify their privacy leakage, yet tight characterizations remain open even in the foundational setting of convex losses. This paper improves over previous analyses by establishing (and refining) the "privacy amplification by iteration" phenomenon in…

    Submitted 12 June, 2024; v1 submitted 29 February, 2024; originally announced March 2024.

    Comments: 45 pages, ICML 2024. v2: added lower bounds (Appendix C.5)

  2. arXiv:2401.15092  [pdf, ps, other]

    math.PR cs.DM cs.LG stat.ML

    A note on the capacity of the binary perceptron

    Authors: Dylan J. Altschuler, Konstantin Tikhomirov

    Abstract: Determining the capacity $α_c$ of the Binary Perceptron is a long-standing problem. Krauth and Mezard (1989) conjectured an explicit value of $α_c$, approximately equal to .833, and a rigorous lower bound matching this prediction was recently established by Ding and Sun (2019). Regarding the upper bound, Kim and Roche (1998) and Talagrand (1999) independently showed that $α_c$ < .996, while Krauth…

    Submitted 22 January, 2024; originally announced January 2024.

  3. arXiv:2302.10249  [pdf, ps, other]

    math.ST cs.DS cs.LG math.AP stat.ML

    Faster high-accuracy log-concave sampling via algorithmic warm starts

    Authors: Jason M. Altschuler, Sinho Chewi

    Abstract: Understanding the complexity of sampling from a strongly log-concave and log-smooth distribution $π$ on $\mathbb{R}^d$ to high accuracy is a fundamental problem, both from a practical and theoretical standpoint. In practice, high-accuracy samplers such as the classical Metropolis-adjusted Langevin algorithm (MALA) remain the de facto gold standard; and in theory, via the proximal sampler reduction…

    Submitted 20 February, 2023; originally announced February 2023.

    Comments: 59 pages, 1 table

  4. arXiv:2212.12629  [pdf, ps, other]

    stat.ML cs.LG math.PR math.ST

    Concentration of the Langevin Algorithm's Stationary Distribution

    Authors: Jason M. Altschuler, Kunal Talwar

    Abstract: A canonical algorithm for log-concave sampling is the Langevin Algorithm, aka the Langevin Diffusion run with some discretization stepsize $η > 0$. This discretization leads the Langevin Algorithm to have a stationary distribution $π_η$ which differs from the stationary distribution $π$ of the Langevin Diffusion, and it is an important challenge to understand whether the well-known properties of…

    Submitted 23 December, 2022; originally announced December 2022.

  5. arXiv:2210.08448  [pdf, other]

    math.ST math.OC math.PR stat.ML

    Resolving the Mixing Time of the Langevin Algorithm to its Stationary Distribution for Log-Concave Sampling

    Authors: Jason M. Altschuler, Kunal Talwar

    Abstract: Sampling from a high-dimensional distribution is a fundamental task in statistics, engineering, and the sciences. A canonical approach is the Langevin Algorithm, i.e., the Markov chain for the discretized Langevin Diffusion. This is the sampling analog of Gradient Descent. Despite being studied for several decades in multiple communities, tight mixing bounds for this algorithm remain unresolved ev…

    Submitted 31 October, 2022; v1 submitted 16 October, 2022; originally announced October 2022.

  6. arXiv:2205.13710  [pdf, other]

    cs.LG cs.CR math.OC stat.ML

    Privacy of Noisy Stochastic Gradient Descent: More Iterations without More Privacy Loss

    Authors: Jason M. Altschuler, Kunal Talwar

    Abstract: A central issue in machine learning is how to train models on sensitive user data. Industry has widely adopted a simple algorithm: Stochastic Gradient Descent with noise (a.k.a. Stochastic Gradient Langevin Dynamics). However, foundational theoretical questions about this algorithm's privacy loss remain open -- even in the seemingly simple setting of smooth convex losses over a bounded domain. Our…

    Submitted 28 February, 2023; v1 submitted 26 May, 2022; originally announced May 2022.

    Comments: v2: improved exposition, slightly simplified proofs, all results unchanged

  7. arXiv:1903.09239  [pdf, other]

    stat.ML cs.LG

    Multi-Domain Adversarial Learning

    Authors: Alice Schoenauer-Sebag, Louise Heinrich, Marc Schoenauer, Michele Sebag, Lani F. Wu, Steve J. Altschuler

    Abstract: Multi-domain learning (MDL) aims at obtaining a model with minimal average risk across multiple domains. Our empirical motivation is automated microscopy data, where cultured cells are imaged after being exposed to known and unknown chemical perturbations, and each dataset displays significant experimental bias. This paper presents a multi-domain adversarial learning approach, MuLANN, to leverage…

    Submitted 21 March, 2019; originally announced March 2019.

    Comments: Accepted at ICLR'19

    Journal ref: ICLR 2019-Seventh annual International Conference on Learning Representations

  8. arXiv:1812.05189  [pdf, other]

    stat.ML cs.DS cs.LG math.OC

    Massively scalable Sinkhorn distances via the Nyström method

    Authors: Jason Altschuler, Francis Bach, Alessandro Rudi, Jonathan Niles-Weed

    Abstract: The Sinkhorn "distance", a variant of the Wasserstein distance with entropic regularization, is an increasingly popular tool in machine learning and statistical inference. However, the time and memory requirements of standard algorithms for computing this distance grow quadratically with the size of the data, making them prohibitively expensive on massive data sets. In this work, we show that this…

    Submitted 26 October, 2019; v1 submitted 12 December, 2018; originally announced December 2018.

    Comments: to appear in NeurIPS 2019

    Journal ref: Advances in Neural Information Processing Systems 32 (NeurIPS 2019)

  9. arXiv:1802.09514  [pdf, ps, other]

    math.ST cs.LG stat.ML

    Best Arm Identification for Contaminated Bandits

    Authors: Jason Altschuler, Victor-Emmanuel Brunel, Alan Malek

    Abstract: This paper studies active learning in the context of robust statistics. Specifically, we propose a variant of the Best Arm Identification problem for \emph{contaminated bandits}, where each arm pull has probability $\varepsilon$ of generating a sample from an arbitrary contamination distribution instead of the true underlying distribution. The goal is to identify the best (or approximately best) t…

    Submitted 15 May, 2019; v1 submitted 26 February, 2018; originally announced February 2018.

    Comments: to appear in Journal of Machine Learning Research (JMLR)

    Journal ref: Journal of Machine Learning Research (JMLR), 20(91), 1-39, 2019

  10. arXiv:1705.09634  [pdf, other]

    cs.DS stat.ML

    Near-linear time approximation algorithms for optimal transport via Sinkhorn iteration

    Authors: Jason Altschuler, Jonathan Weed, Philippe Rigollet

    Abstract: Computing optimal transport distances such as the earth mover's distance is a fundamental problem in machine learning, statistics, and computer vision. Despite the recent introduction of several algorithms with good empirical performance, it is unknown whether general optimal transport distances can be approximated in near-linear time. This paper demonstrates that this ambitious goal is in fact ac…

    Submitted 7 February, 2018; v1 submitted 26 May, 2017; originally announced May 2017.

    Journal ref: Advances in Neural Information Processing Systems 30 (NIPS 2017), 1961-1971