Showing 1–50 of 61 results for author: Duchi, J

Searching in archive stat.
  1. arXiv:2403.16336  [pdf, other]

    stat.ML cs.LG math.ST stat.ME

    Predictive Inference in Multi-environment Scenarios

    Authors: John C. Duchi, Suyash Gupta, Kuanhao Jiang, Pragya Sur

    Abstract: We address the challenge of constructing valid confidence intervals and sets in problems of prediction across multiple environments. We investigate two types of coverage suitable for these problems, extending the jackknife and split-conformal methods to show how to obtain distribution-free coverage in such non-traditional, hierarchical data-generating scenarios. Our contributions also include exte… ▽ More

    Submitted 24 March, 2024; originally announced March 2024.

  2. arXiv:2402.07131  [pdf, other]

    stat.ML cs.CR cs.LG stat.ME

    Resampling methods for private statistical inference

    Authors: Karan Chadha, John Duchi, Rohith Kuditipudi

    Abstract: We consider the task of constructing confidence intervals with differential privacy. We propose two private variants of the non-parametric bootstrap, which privately compute the median of the results of multiple "little" bootstraps run on partitions of the data and give asymptotic bounds on the coverage error of the resulting confidence intervals. For a fixed differential privacy parameter $ε$, ou… ▽ More

    Submitted 3 June, 2024; v1 submitted 11 February, 2024; originally announced February 2024.

    Comments: 45 pages

  3. arXiv:2311.01453  [pdf, other]

    stat.ML cs.LG stat.ME

    PPI++: Efficient Prediction-Powered Inference

    Authors: Anastasios N. Angelopoulos, John C. Duchi, Tijana Zrnic

    Abstract: We present PPI++: a computationally lightweight methodology for estimation and inference based on a small labeled dataset and a typically much larger dataset of machine-learning predictions. The methods automatically adapt to the quality of available predictions, yielding easy-to-compute confidence sets -- for parameters of any dimensionality -- that always improve on classical intervals using onl… ▽ More

    Submitted 25 March, 2024; v1 submitted 2 November, 2023; originally announced November 2023.

    Comments: Code available at https://github.com/aangelopoulos/ppi_py

  4. arXiv:2307.11947  [pdf, other]

    stat.ML cs.DC cs.LG

    Collaboratively Learning Linear Models with Structured Missing Data

    Authors: Chen Cheng, Gary Cheng, John Duchi

    Abstract: We study the problem of collaboratively learning least squares estimates for $m$ agents. Each agent observes a different subset of the features–e.g., containing data collected from sensors of varying resolution. Our goal is to determine how to coordinate the agents in order to produce the best estimator for each agent. We propose a distributed, semi-supervised algorithm Collab, con… ▽ More

    Submitted 21 July, 2023; originally announced July 2023.

  5. arXiv:2301.07078  [pdf, ps, other]

    stat.ML cs.CR cs.DS cs.LG

    A Fast Algorithm for Adaptive Private Mean Estimation

    Authors: John Duchi, Saminul Haque, Rohith Kuditipudi

    Abstract: We design an $(\varepsilon, δ)$-differentially private algorithm to estimate the mean of a $d$-variate distribution, with unknown covariance $Σ$, that is adaptive to $Σ$. To within polylogarithmic factors, the estimator achieves optimal rates of convergence with respect to the induced Mahalanobis norm $||\cdot||_Σ$, takes time $\tilde{O}(n d^2)$ to compute, has near linear sample complexity for su… ▽ More

    Submitted 17 January, 2023; originally announced January 2023.

    Comments: 38 pages, no figures

  6. arXiv:2210.17070  [pdf, ps, other]

    cs.LG cs.CR math.OC stat.ML

    Private optimization in the interpolation regime: faster rates and hardness results

    Authors: Hilal Asi, Karan Chadha, Gary Cheng, John Duchi

    Abstract: In non-private stochastic convex optimization, stochastic gradient methods converge much faster on interpolation problems -- problems where there exists a solution that simultaneously minimizes all of the sample losses -- than on non-interpolating ones; we show that generally similar improvements are impossible in the private setting. However, when the functions exhibit quadratic growth around the… ▽ More

    Submitted 31 October, 2022; originally announced October 2022.

    Comments: published at ICML 2022; 25 pages

  7. arXiv:2210.13497  [pdf, other]

    cs.LG cs.IT math.ST stat.ML

    Subspace Recovery from Heterogeneous Data with Non-isotropic Noise

    Authors: John Duchi, Vitaly Feldman, Lunjia Hu, Kunal Talwar

    Abstract: Recovering linear subspaces from data is a fundamental and important task in statistics and machine learning. Motivated by heterogeneity in Federated Learning settings, we study a basic formulation of this problem: the principal component analysis (PCA), with a focus on dealing with irregular noise. Our data come from $n$ users with user $i$ contributing data samples from a $d$-dimensional distrib… ▽ More

    Submitted 24 October, 2022; originally announced October 2022.

    Comments: In NeurIPS 2022

  8. arXiv:2206.07236  [pdf, other]

    stat.ML cs.LG

    Query-Adaptive Predictive Inference with Partial Labels

    Authors: Maxime Cauchois, John Duchi

    Abstract: The cost and scarcity of fully supervised labels in statistical machine learning encourage using partially labeled data for model validation as a cheaper and more accessible alternative. Effectively collecting and leveraging weakly supervised data for large-space structured prediction tasks thus becomes an important part of an end-to-end learning system. We propose a new computationally-friendly m… ▽ More

    Submitted 14 June, 2022; originally announced June 2022.

  9. arXiv:2202.09889  [pdf, ps, other]

    stat.ML cs.LG math.ST

    Memorize to Generalize: on the Necessity of Interpolation in High Dimensional Linear Regression

    Authors: Chen Cheng, John Duchi, Rohith Kuditipudi

    Abstract: We examine the necessity of interpolation in overparameterized models, that is, when achieving optimal predictive risk in machine learning problems requires (nearly) interpolating the training data. In particular, we consider simple overparameterized linear regression $y = X θ + w$ with random design $X \in \mathbb{R}^{n \times d}$ under the proportional asymptotics $d/n \to γ \in (1, \infty)$. We p… ▽ More

    Submitted 16 June, 2022; v1 submitted 20 February, 2022; originally announced February 2022.

    Comments: 32 pages; accepted to the 35th Annual Conference on Learning Theory (COLT) 2022

  10. arXiv:2202.04166  [pdf, other]

    stat.ME stat.ML

    The Lifecycle of a Statistical Model: Model Failure Detection, Identification, and Refitting

    Authors: Alnur Ali, Maxime Cauchois, John C. Duchi

    Abstract: The statistical machine learning community has demonstrated considerable resourcefulness over the years in developing highly expressive tools for estimation, prediction, and inference. The bedrock assumptions underlying these developments are that the data comes from a fixed population and displays little heterogeneity. But reality is significantly more complex: statistical models now routinely fa… ▽ More

    Submitted 8 February, 2022; originally announced February 2022.

  11. arXiv:2201.08315  [pdf, other]

    stat.ML cs.LG

    Predictive Inference with Weak Supervision

    Authors: Maxime Cauchois, Suyash Gupta, Alnur Ali, John Duchi

    Abstract: The expense of acquiring labels in large-scale statistical machine learning makes partially and weakly-labeled data attractive, though it is not always apparent how to leverage such data for model fitting or validation. We present a methodology to bridge the gap between partial supervision and validation, developing a conformal prediction framework to provide valid predictive confidence sets -- se… ▽ More

    Submitted 9 February, 2022; v1 submitted 20 January, 2022; originally announced January 2022.

  12. arXiv:2108.07313  [pdf, other]

    cs.LG cs.DC math.OC stat.ML

    Federated Asymptotics: a model to compare federated learning algorithms

    Authors: Gary Cheng, Karan Chadha, John Duchi

    Abstract: We propose an asymptotic framework to analyze the performance of (personalized) federated learning algorithms. In this new framework, we formulate federated learning as a multi-criterion objective, where the goal is to minimize each client's loss using information from all of the clients. We analyze a linear regression model where, for a given client, we may theoretically compare the performance o… ▽ More

    Submitted 18 February, 2022; v1 submitted 16 August, 2021; originally announced August 2021.

    Comments: 42 pages (11 main pages, 2 reference pages, 29 appendix pages), 13 figures

  13. arXiv:2108.02391  [pdf, other]

    cs.LG cs.CR math.OC stat.ML

    Adapting to Function Difficulty and Growth Conditions in Private Optimization

    Authors: Hilal Asi, Daniel Levy, John Duchi

    Abstract: We develop algorithms for private stochastic convex optimization that adapt to the hardness of the specific function we wish to optimize. While previous work provides worst-case bounds for arbitrary convex functions, it is often the case that the function at hand belongs to a smaller class that enjoys faster rates. Concretely, we show that for functions exhibiting $κ$-growth around the optimum, i.e… ▽ More

    Submitted 5 August, 2021; originally announced August 2021.

    Comments: 28 pages

  14. arXiv:2106.13756  [pdf, other]

    cs.LG cs.CR math.OC stat.ML

    Private Adaptive Gradient Methods for Convex Optimization

    Authors: Hilal Asi, John Duchi, Alireza Fallah, Omid Javidbakht, Kunal Talwar

    Abstract: We study adaptive methods for differentially private convex optimization, proposing and analyzing differentially private variants of a Stochastic Gradient Descent (SGD) algorithm with adaptive stepsizes, as well as the AdaGrad algorithm. We provide upper bounds on the regret of both algorithms and show that the bounds are (worst-case) optimal. As a consequence of our development, we show that our… ▽ More

    Submitted 25 June, 2021; originally announced June 2021.

    Comments: To appear in 38th International Conference on Machine Learning (ICML 2021)

  15. arXiv:2101.05234  [pdf, other]

    stat.ML cs.LG

    On Misspecification in Prediction Problems and Robustness via Improper Learning

    Authors: Annie Marsden, John Duchi, Gregory Valiant

    Abstract: We study probabilistic prediction games when the underlying model is misspecified, investigating the consequences of predicting using an incorrect parametric model. We show that for a broad class of loss functions and parametric families of distributions, the regret of playing a "proper" predictor -- one from the putative model class -- relative to the best predictor in the same model class has lo… ▽ More

    Submitted 29 January, 2021; v1 submitted 13 January, 2021; originally announced January 2021.

    Comments: 28 pages, 6 figures

  16. arXiv:2101.02696  [pdf, other]

    math.OC cs.LG stat.ML

    Accelerated, Optimal, and Parallel: Some Results on Model-Based Stochastic Optimization

    Authors: Karan Chadha, Gary Cheng, John C. Duchi

    Abstract: We extend the Approximate-Proximal Point (aProx) family of model-based methods for solving stochastic convex optimization problems, including stochastic subgradient, proximal point, and bundle methods, to the minibatch and accelerated setting. To do so, we propose specific model-based algorithms and an acceleration scheme for which we provide non-asymptotic convergence guarantees, which are order-… ▽ More

    Submitted 7 January, 2021; originally announced January 2021.

    Comments: 24 pages, 17 figures

  17. arXiv:2010.05893  [pdf, other]

    math.OC cs.LG stat.ML

    Large-Scale Methods for Distributionally Robust Optimization

    Authors: Daniel Levy, Yair Carmon, John C. Duchi, Aaron Sidford

    Abstract: We propose and analyze algorithms for distributionally robust optimization of convex losses with conditional value at risk (CVaR) and $χ^2$ divergence uncertainty sets. We prove that our algorithms require a number of gradient evaluations independent of training set size and number of parameters, making them suitable for large-scale applications. For $χ^2$ uncertainty sets these are the first such… ▽ More

    Submitted 10 December, 2020; v1 submitted 12 October, 2020; originally announced October 2020.

    Comments: 63 pages, NeurIPS 2020

  18. arXiv:2008.10581  [pdf, other]

    cs.LG stat.ML

    Neural Bridge Sampling for Evaluating Safety-Critical Autonomous Systems

    Authors: Aman Sinha, Matthew O'Kelly, Russ Tedrake, John Duchi

    Abstract: Learning-based methodologies increasingly find applications in safety-critical domains like autonomous driving and medical robotics. Due to the rare nature of dangerous events, real-world testing is prohibitively expensive and unscalable. In this work, we employ a probabilistic approach to safety evaluation in simulation, where we are concerned with computing the probability of dangerous events. W… ▽ More

    Submitted 8 August, 2021; v1 submitted 24 August, 2020; originally announced August 2020.

    Comments: NeurIPS 2020

  19. arXiv:2008.04267  [pdf, other]

    stat.ML cs.LG stat.ME

    Robust Validation: Confident Predictions Even When Distributions Shift

    Authors: Maxime Cauchois, Suyash Gupta, Alnur Ali, John C. Duchi

    Abstract: While the traditional viewpoint in machine learning and statistics assumes training and testing samples come from the same population, practice belies this fiction. One strategy -- coming from robust statistics and optimization -- is thus to build a model robust to distributional perturbations. In this paper, we take a different approach to describe procedures for robust predictive inference, wher… ▽ More

    Submitted 1 March, 2023; v1 submitted 10 August, 2020; originally announced August 2020.

    Comments: 58 pages, 10 figures

  20. arXiv:2007.13982  [pdf, other]

    cs.LG stat.ML

    Distributionally Robust Losses for Latent Covariate Mixtures

    Authors: John Duchi, Tatsunori Hashimoto, Hongseok Namkoong

    Abstract: While modern large-scale datasets often consist of heterogeneous subpopulations -- for example, multiple demographic groups or multiple text corpora -- the standard practice of minimizing average loss fails to guarantee uniformly low losses across all subpopulations. We propose a convex procedure that controls the worst-case performance over all subpopulations of a given size. Our procedure comes… ▽ More

    Submitted 10 August, 2022; v1 submitted 28 July, 2020; originally announced July 2020.

    Comments: First released in 2019 on a personal website; published in Operations Research in 2022

  21. arXiv:2006.13476  [pdf, other]

    cs.LG math.OC stat.ML

    Second-Order Information in Non-Convex Stochastic Optimization: Power and Limitations

    Authors: Yossi Arjevani, Yair Carmon, John C. Duchi, Dylan J. Foster, Ayush Sekhari, Karthik Sridharan

    Abstract: We design an algorithm which finds an $ε$-approximate stationary point (with $\|\nabla F(x)\|\le ε$) using $O(ε^{-3})$ stochastic gradient and Hessian-vector products, matching guarantees that were previously available only under a stronger assumption of access to multiple queries with the same random seed. We prove a lower bound which establishes that this rate is optimal and---surprisingly---tha… ▽ More

    Submitted 24 June, 2020; originally announced June 2020.

    Comments: Accepted to Conference on Learning Theory (COLT) 2020

  22. arXiv:2005.10630  [pdf, other]

    cs.CR cs.LG stat.ML

    Near Instance-Optimality in Differential Privacy

    Authors: Hilal Asi, John C. Duchi

    Abstract: We develop two notions of instance optimality in differential privacy, inspired by classical statistical theory: one by defining a local minimax risk and the other by considering unbiased mechanisms and analogizing the Cramer-Rao bound, and we show that the local modulus of continuity of the estimand of interest completely determines these quantities. We also develop a complementary collection mec… ▽ More

    Submitted 16 May, 2020; originally announced May 2020.

  23. arXiv:2004.10181  [pdf, other]

    stat.ML cs.LG stat.ME

    Knowing what you know: valid and validated confidence sets in multiclass and multilabel prediction

    Authors: Maxime Cauchois, Suyash Gupta, John Duchi

    Abstract: We develop conformal prediction methods for constructing valid predictive confidence sets in multiclass and multilabel problems without assumptions on the data generating distribution. A challenge here is that typical conformal prediction methods---which give marginal validity (coverage) guarantees---provide uneven coverage, in that they address easy examples at the expense of essentially ignoring… ▽ More

    Submitted 10 July, 2020; v1 submitted 21 April, 2020; originally announced April 2020.

    Comments: Updated section on multilabel settings addressing the cases when labels may repel each other

  24. arXiv:2003.03900  [pdf, other]

    cs.LG cs.MA cs.RO stat.ML

    FormulaZero: Distributionally Robust Online Adaptation via Offline Population Synthesis

    Authors: Aman Sinha, Matthew O'Kelly, Hongrui Zheng, Rahul Mangharam, John Duchi, Russ Tedrake

    Abstract: Balancing performance and safety is crucial to deploying autonomous vehicles in multi-agent environments. In particular, autonomous racing is a domain that penalizes safe but conservative policies, highlighting the need for robust, adaptive strategies. Current approaches either make simplifying assumptions about other agents or lack robust mechanisms for online adaptation. This work makes algorith… ▽ More

    Submitted 22 August, 2020; v1 submitted 8 March, 2020; originally announced March 2020.

    Comments: ICML 2020: https://icml.cc/virtual/2020/poster/6277

  25. arXiv:2002.10716  [pdf, other]

    cs.LG stat.ML

    Understanding and Mitigating the Tradeoff Between Robustness and Accuracy

    Authors: Aditi Raghunathan, Sang Michael Xie, Fanny Yang, John Duchi, Percy Liang

    Abstract: Adversarial training augments the training set with perturbations to improve the robust error (over worst-case perturbations), but it often leads to an increase in the standard error (on unperturbed test inputs). Previous explanations for this tradeoff rely on the assumption that no predictor in the hypothesis class has low standard and robust error. In this work, we precisely characterize the eff… ▽ More

    Submitted 6 July, 2020; v1 submitted 25 February, 2020; originally announced February 2020.

    Comments: Appearing at International Conference on Machine Learning (ICML) 2020

  26. arXiv:1912.04042  [pdf, other]

    cs.LG cs.CR stat.ML

    Element Level Differential Privacy: The Right Granularity of Privacy

    Authors: Hilal Asi, John Duchi, Omid Javidbakht

    Abstract: Differential Privacy (DP) provides strong guarantees on the risk of compromising a user's data in statistical learning applications, though these strong protections make learning challenging and may be too stringent for some use cases. To address this, we propose element level differential privacy, which extends differential privacy to provide protection against leaking information about any parti… ▽ More

    Submitted 5 December, 2019; originally announced December 2019.

    Comments: 34 pages, 5 figures

  27. arXiv:1912.02365  [pdf, other]

    math.OC cs.IT cs.LG stat.ML

    Lower Bounds for Non-Convex Stochastic Optimization

    Authors: Yossi Arjevani, Yair Carmon, John C. Duchi, Dylan J. Foster, Nathan Srebro, Blake Woodworth

    Abstract: We lower bound the complexity of finding $ε$-stationary points (with gradient norm at most $ε$) using stochastic first-order methods. In a well-studied model where algorithms access smooth, potentially non-convex functions through queries to an unbiased stochastic gradient oracle with bounded variance, we prove that (in the worst case) any algorithm requires at least $ε^{-4}$ queries to find an… ▽ More

    Submitted 27 February, 2022; v1 submitted 4 December, 2019; originally announced December 2019.

    Comments: Correction to hard instance dimensions in Theorem 3

  28. arXiv:1909.10455  [pdf, other]

    math.OC cs.IT cs.LG stat.ML

    Necessary and Sufficient Geometries for Gradient Methods

    Authors: Daniel Levy, John C. Duchi

    Abstract: We study the impact of the constraint set and gradient geometry on the convergence of online and stochastic methods for convex optimization, providing a characterization of the geometries for which stochastic gradient and adaptive gradient methods are (minimax) optimal. In particular, we show that when the constraint set is quadratically convex, diagonally pre-conditioned stochastic gradient metho… ▽ More

    Submitted 28 October, 2019; v1 submitted 23 September, 2019; originally announced September 2019.

    Comments: 23 pages. To appear at NeurIPS 2019

  29. arXiv:1906.06032  [pdf, other]

    cs.LG stat.ML

    Adversarial Training Can Hurt Generalization

    Authors: Aditi Raghunathan, Sang Michael Xie, Fanny Yang, John C. Duchi, Percy Liang

    Abstract: While adversarial training can improve robust accuracy (against an adversary), it sometimes hurts standard accuracy (when there is no adversary). Previous work has studied this tradeoff between standard and robust accuracy, but only in the setting where no predictor performs well on both objectives in the infinite data limit. In this paper, we show that even when the optimal predictor with infinit… ▽ More

    Submitted 26 August, 2019; v1 submitted 14 June, 2019; originally announced June 2019.

  30. arXiv:1905.13736  [pdf, other]

    stat.ML cs.CV cs.LG

    Unlabeled Data Improves Adversarial Robustness

    Authors: Yair Carmon, Aditi Raghunathan, Ludwig Schmidt, Percy Liang, John C. Duchi

    Abstract: We demonstrate, theoretically and empirically, that adversarial robustness can significantly benefit from semisupervised learning. Theoretically, we revisit the simple Gaussian model of Schmidt et al. that shows a sample complexity gap between standard and robust classification. We prove that unlabeled data bridges this gap: a simple semisupervised learning procedure (self-training) achieves high… ▽ More

    Submitted 13 January, 2022; v1 submitted 31 May, 2019; originally announced May 2019.

    Comments: Corrected some math typos in the proof of Lemma 1

  31. The importance of better models in stochastic optimization

    Authors: Hilal Asi, John C. Duchi

    Abstract: Standard stochastic optimization methods are brittle, sensitive to stepsize choices and other algorithmic parameters, and they exhibit instability outside of well-behaved families of objectives. To address these challenges, we investigate models for stochastic minimization and learning problems that exhibit better robustness to problem families and algorithmic parameters. With appropriately accura… ▽ More

    Submitted 20 March, 2019; originally announced March 2019.

  32. arXiv:1903.02675  [pdf, other]

    cs.LG cs.DS math.OC stat.ML

    A Rank-1 Sketch for Matrix Multiplicative Weights

    Authors: Yair Carmon, John C. Duchi, Aaron Sidford, Kevin Tian

    Abstract: We show that a simple randomized sketch of the matrix multiplicative weight (MMW) update enjoys (in expectation) the same regret bounds as MMW, up to a small constant factor. Unlike MMW, where every step requires full matrix exponentiation, our steps require only a single product of the form $e^A b$, which the Lanczos method approximates efficiently. Our key technique is to view the sketch as a… ▽ More

    Submitted 12 August, 2019; v1 submitted 6 March, 2019; originally announced March 2019.

  33. Mean Estimation from One-Bit Measurements

    Authors: Alon Kipnis, John C. Duchi

    Abstract: We consider the problem of estimating the mean of a symmetric log-concave distribution under the constraint that only a single bit per sample from this distribution is available to the estimator. We study the mean squared error as a function of the sample size (and hence the number of bits). We consider three settings: first, a centralized setting, where an encoder may release $n$ bits given a sam… ▽ More

    Submitted 9 May, 2022; v1 submitted 10 January, 2019; originally announced January 2019.

    Comments: Accepted for publication in the IEEE Transactions on Information Theory

    Journal ref: IEEE Transactions on Information Theory ( Volume: 68, Issue: 9, September 2022)

  34. arXiv:1812.00984  [pdf, other]

    stat.ML cs.LG

    Protection Against Reconstruction and Its Applications in Private Federated Learning

    Authors: Abhishek Bhowmick, John Duchi, Julien Freudiger, Gaurav Kapoor, Ryan Rogers

    Abstract: In large-scale statistical learning, data collection and model fitting are moving increasingly toward peripheral devices---phones, watches, fitness trackers---away from centralized data collection. Concomitant with this rise in decentralized data are increasing challenges of maintaining privacy while allowing enough information to fit accurate, useful statistical models. This motivates local notio… ▽ More

    Submitted 3 June, 2019; v1 submitted 3 December, 2018; originally announced December 2018.

  35. arXiv:1811.00145  [pdf, ps, other]

    cs.LG cs.RO stat.ML

    Scalable End-to-End Autonomous Vehicle Testing via Rare-event Simulation

    Authors: Matthew O'Kelly, Aman Sinha, Hongseok Namkoong, John Duchi, Russ Tedrake

    Abstract: While recent developments in autonomous vehicle (AV) technology highlight substantial progress, we lack tools for rigorous and scalable testing. Real-world testing, the $\textit{de facto}$ evaluation environment, places the public in danger, and, due to the rare nature of accidents, will require billions of miles in order to statistically validate performance claims. We implement a simulation fram… ▽ More

    Submitted 12 January, 2019; v1 submitted 31 October, 2018; originally announced November 2018.

    Comments: NeurIPS 2018

  36. arXiv:1810.08750  [pdf, other]

    stat.ML cs.LG

    Learning Models with Uniform Performance via Distributionally Robust Optimization

    Authors: John Duchi, Hongseok Namkoong

    Abstract: A common goal in statistics and machine learning is to learn models that can perform well against distributional shifts, such as latent heterogeneous subpopulations, unknown covariate shifts, or unmodeled temporal effects. We develop and analyze a distributionally robust stochastic optimization (DRO) framework that learns a model providing good performance against perturbations to the data-generat… ▽ More

    Submitted 17 July, 2020; v1 submitted 19 October, 2018; originally announced October 2018.

  37. arXiv:1810.05633  [pdf, other]

    math.OC stat.ML

    Stochastic (Approximate) Proximal Point Methods: Convergence, Optimality, and Adaptivity

    Authors: Hilal Asi, John C. Duchi

    Abstract: We develop model-based methods for solving stochastic convex optimization problems, introducing the approximate-proximal point, or aProx, family, which includes stochastic subgradient, proximal point, and bundle methods. When the modeling approaches we propose are appropriately accurate, the methods enjoy stronger convergence and robustness guarantees than classical approaches, even though the mod… ▽ More

    Submitted 3 July, 2019; v1 submitted 12 October, 2018; originally announced October 2018.

    Comments: To appear in SIAM Journal on Optimization

    Journal ref: SIAM Journal on Optimization 29(3), pp. 2257--2290, 2019

  38. arXiv:1808.09521  [pdf, other]

    stat.ME

    Bounds on the conditional and average treatment effect with unobserved confounding factors

    Authors: Steve Yadlowsky, Hongseok Namkoong, Sanjay Basu, John Duchi, Lu Tian

    Abstract: For observational studies, we study the sensitivity of causal inference when treatment assignments may depend on unobserved confounders. We develop a loss minimization approach for estimating bounds on the conditional average treatment effect (CATE) when unobserved confounders have a bounded effect on the odds ratio of treatment selection. Our approach is scalable and allows flexible use of model… ▽ More

    Submitted 9 March, 2022; v1 submitted 28 August, 2018; originally announced August 2018.

  39. arXiv:1804.03761  [pdf, other]

    stat.ML cs.LG

    Derivative free optimization via repeated classification

    Authors: Tatsunori B. Hashimoto, Steve Yadlowsky, John C. Duchi

    Abstract: We develop an algorithm for minimizing a function using $n$ batched function value measurements at each of $T$ rounds by using classifiers to identify a function's sublevel set. We show that sufficiently accurate classifiers can achieve linear convergence rates, and show that the convergence rate is tied to the difficulty of active learning sublevel sets. Further, we show that the bootstrap is a c… ▽ More

    Submitted 10 April, 2018; originally announced April 2018.

    Comments: At AISTATS 2018

  40. arXiv:1711.02226  [pdf, other]

    stat.ML

    Unsupervised Transformation Learning via Convex Relaxations

    Authors: Tatsunori B. Hashimoto, John C. Duchi, Percy Liang

    Abstract: Our goal is to extract meaningful transformations from raw images, such as varying the thickness of lines in handwriting or the lighting in a portrait. We propose an unsupervised approach to learn such transformations by attempting to reconstruct an image from a linear combination of transformations of its nearest neighbors. On handwritten digits and celebrity portraits, we show that even with lin… ▽ More

    Submitted 6 November, 2017; originally announced November 2017.

    Comments: To appear at NIPS 2017

  41. arXiv:1710.10571  [pdf, ps, other]

    stat.ML cs.LG

    Certifying Some Distributional Robustness with Principled Adversarial Training

    Authors: Aman Sinha, Hongseok Namkoong, Riccardo Volpi, John Duchi

    Abstract: Neural networks are vulnerable to adversarial examples and researchers have proposed many heuristic attack and defense mechanisms. We address this problem through the principled lens of distributionally robust optimization, which guarantees performance under adversarial input perturbations. By considering a Lagrangian penalty formulation of perturbing the underlying data distribution in a Wasserst… ▽ More

    Submitted 1 May, 2020; v1 submitted 29 October, 2017; originally announced October 2017.

    Comments: ICLR 2018: https://openreview.net/forum?id=Hk6kPgZA-

  42. arXiv:1612.05612  [pdf, other]

    math.ST math.OC stat.ML

    Asymptotic Optimality in Stochastic Optimization

    Authors: John Duchi, Feng Ruan

    Abstract: We study local complexity measures for stochastic convex optimization problems, providing a local minimax theory analogous to that of Hájek and Le Cam for classical statistical problems. We give complementary optimality results, developing fully online methods that adaptively achieve optimal convergence guarantees. Our results provide function-specific lower bounds and convergence results that mak… ▽ More

    Submitted 2 November, 2018; v1 submitted 16 December, 2016; originally announced December 2016.

    Journal ref: Annals of Statistics 2019

  43. arXiv:1610.03425  [pdf, ps, other]

    stat.ML

    Statistics of Robust Optimization: A Generalized Empirical Likelihood Approach

    Authors: John Duchi, Peter Glynn, Hongseok Namkoong

    Abstract: We study statistical inference and distributionally robust solution methods for stochastic optimization problems, focusing on confidence intervals for optimal values and solutions that achieve exact coverage asymptotically. We develop a generalized empirical likelihood framework---based on distributional uncertainty sets constructed from nonparametric $f$-divergence balls---for Hadamard differenti… ▽ More

    Submitted 30 June, 2018; v1 submitted 11 October, 2016; originally announced October 2016.

  44. arXiv:1610.02581  [pdf, ps, other]

    stat.ML math.ST

    Variance-based regularization with convex objectives

    Authors: John Duchi, Hongseok Namkoong

    Abstract: We develop an approach to risk minimization and stochastic optimization that provides a convex surrogate for variance, allowing near-optimal and computationally efficient trading between approximation and estimation error. Our approach builds off of techniques for distributionally robust optimization and Owen's empirical likelihood, and we provide a number of finite-sample and asymptotic results c…

    Submitted 14 December, 2017; v1 submitted 8 October, 2016; originally announced October 2016.

  45. arXiv:1608.03100  [pdf, other]

    stat.ML cs.LG

    Estimation from Indirect Supervision with Linear Moments

    Authors: Aditi Raghunathan, Roy Frostig, John Duchi, Percy Liang

    Abstract: In structured prediction problems where we have indirect supervision of the output, maximum marginal likelihood faces two computational obstacles: non-convexity of the objective and intractability of even a single gradient computation. In this paper, we bypass both obstacles for a class of what we call linear indirectly-supervised problems. Our approach is simple: we solve a linear system to estim…

    Submitted 10 August, 2016; originally announced August 2016.

    Comments: 12 pages, 7 figures, extended and updated version of our paper appearing in ICML 2016

  46. arXiv:1605.07596  [pdf, other]

    stat.ML

    Local Minimax Complexity of Stochastic Convex Optimization

    Authors: Yuancheng Zhu, Sabyasachi Chatterjee, John Duchi, John Lafferty

    Abstract: We extend the traditional worst-case, minimax analysis of stochastic convex optimization by introducing a localized form of minimax complexity for individual functions. Our main result gives function-specific lower and upper bounds on the number of stochastic subgradient evaluations needed to optimize either the function or its "hardest local alternative" to a given numerical precision. The bounds…

    Submitted 26 May, 2016; v1 submitted 24 May, 2016; originally announced May 2016.

  47. arXiv:1604.02390  [pdf, ps, other]

    math.ST cs.IT stat.ME

    Minimax Optimal Procedures for Locally Private Estimation

    Authors: John Duchi, Martin Wainwright, Michael Jordan

    Abstract: Working under a model of privacy in which data remains private even from the statistician, we study the tradeoff between privacy guarantees and the risk of the resulting statistical estimators. We develop private versions of classical information-theoretic bounds, in particular those due to Le Cam, Fano, and Assouad. These inequalities allow for a precise characterization of statistical rates unde…

    Submitted 14 November, 2017; v1 submitted 8 April, 2016; originally announced April 2016.

    Comments: 64 pages, 8 figures. arXiv admin note: substantial text overlap with arXiv:1302.3203

  48. arXiv:1508.00882  [pdf, ps, other]

    math.OC stat.ML

    Asynchronous stochastic convex optimization

    Authors: John C. Duchi, Sorathan Chaturapruek, Christopher Ré

    Abstract: We show that asymptotically, completely asynchronous stochastic gradient procedures achieve optimal (even to constant factors) convergence rates for the solution of convex optimization problems under nearly the same conditions required for asymptotic optimality of standard stochastic gradient procedures. Roughly, the noise inherent to the stochastic approximation scheme dominates any noise from as…

    Submitted 4 August, 2015; originally announced August 2015.

    Comments: 38 pages, 8 figures

  49. arXiv:1312.2139  [pdf, ps, other]

    math.OC cs.IT stat.ML

    Optimal rates for zero-order convex optimization: the power of two function evaluations

    Authors: John C. Duchi, Michael I. Jordan, Martin J. Wainwright, Andre Wibisono

    Abstract: We consider derivative-free algorithms for stochastic and non-stochastic convex optimization problems that use only function values rather than gradients. Focusing on non-asymptotic bounds on convergence rates, we show that if pairs of function values are available, algorithms for $d$-dimensional optimization that use gradient estimates based on random perturbations suffer a factor of at most…

    Submitted 20 August, 2014; v1 submitted 7 December, 2013; originally announced December 2013.

    Comments: 34 pages

  50. arXiv:1305.5029  [pdf, ps, other]

    math.ST cs.LG stat.ML

    Divide and Conquer Kernel Ridge Regression: A Distributed Algorithm with Minimax Optimal Rates

    Authors: Yuchen Zhang, John C. Duchi, Martin J. Wainwright

    Abstract: We establish optimal convergence rates for a decomposition-based scalable approach to kernel ridge regression. The method is simple to describe: it randomly partitions a dataset of size N into m subsets of equal size, computes an independent kernel ridge regression estimator for each subset, then averages the local solutions into a global predictor. This partitioning leads to a substantial reducti…

    Submitted 29 April, 2014; v1 submitted 22 May, 2013; originally announced May 2013.