Showing 1–9 of 9 results for author: Soloff, J A

  1. arXiv:2405.14064  [pdf, other]

    stat.ML cs.LG math.ST

    Building a stable classifier with the inflated argmax

    Authors: Jake A. Soloff, Rina Foygel Barber, Rebecca Willett

    Abstract: We propose a new framework for algorithmic stability in the context of multiclass classification. In practice, classification algorithms often operate by first assigning a continuous score (for instance, an estimated probability) to each possible label, then taking the maximizer -- i.e., selecting the class that has the highest score. A drawback of this type of approach is that it is inherently un…

    Submitted 22 May, 2024; originally announced May 2024.

  2. arXiv:2405.09511  [pdf, other]

    math.ST

    Stability via resampling: statistical problems beyond the real line

    Authors: Jake A. Soloff, Rina Foygel Barber, Rebecca Willett

    Abstract: Model averaging techniques based on resampling methods (such as bootstrapping or subsampling) have been utilized across many areas of statistics, often with the explicit goal of promoting stability in the resulting output. We provide a general, finite-sample theoretical result guaranteeing the stability of bagging when applied to algorithms that return outputs in a general space, so that the outpu…

    Submitted 24 May, 2024; v1 submitted 15 May, 2024; originally announced May 2024.

  3. arXiv:2307.03748  [pdf, other]

    stat.ME cs.GT cs.LG stat.ML

    Incentive-Theoretic Bayesian Inference for Collaborative Science

    Authors: Stephen Bates, Michael I. Jordan, Michael Sklar, Jake A. Soloff

    Abstract: Contemporary scientific research is a distributed, collaborative endeavor, carried out by teams of researchers, regulatory institutions, funding agencies, commercial partners, and scientific bodies, all interacting with each other and facing different incentives. To maintain scientific rigor, statistical methods should acknowledge this state of affairs. To this end, we study hypothesis testing whe…

    Submitted 8 February, 2024; v1 submitted 7 July, 2023; originally announced July 2023.

  4. arXiv:2301.12600  [pdf, other]

    stat.ML cs.LG math.ST

    Bagging Provides Assumption-free Stability

    Authors: Jake A. Soloff, Rina Foygel Barber, Rebecca Willett

    Abstract: Bagging is an important technique for stabilizing machine learning models. In this paper, we derive a finite-sample guarantee on the stability of bagging for any model. Our result places no assumptions on the distribution of the data, on the properties of the base algorithm, or on the dimensionality of the covariates. Our guarantee applies to many variants of bagging and is optimal up to a constan…

    Submitted 25 April, 2024; v1 submitted 29 January, 2023; originally announced January 2023.

  5. arXiv:2207.07299  [pdf, other]

    stat.ME math.ST

    The edge of discovery: Controlling the local false discovery rate at the margin

    Authors: Jake A. Soloff, Daniel Xiang, William Fithian

    Abstract: Despite the popularity of the false discovery rate (FDR) as an error control metric for large-scale multiple testing, its close Bayesian counterpart the local false discovery rate (lfdr), defined as the posterior probability that a particular null hypothesis is false, is a more directly relevant standard for justifying and interpreting individual rejections. However, the lfdr is difficult to work…

    Submitted 21 September, 2023; v1 submitted 15 July, 2022; originally announced July 2022.

  6. arXiv:2205.06812  [pdf, other]

    cs.GT cs.LG cs.MA math.ST stat.ME

    Principal-Agent Hypothesis Testing

    Authors: Stephen Bates, Michael I. Jordan, Michael Sklar, Jake A. Soloff

    Abstract: Consider the relationship between a regulator (the principal) and an experimenter (the agent) such as a pharmaceutical company. The pharmaceutical company wishes to sell a drug for profit, whereas the regulator wishes to allow only efficacious drugs to be marketed. The efficacy of the drug is not known to the regulator, so the pharmaceutical company must run a costly trial to prove efficacy to the…

    Submitted 15 April, 2024; v1 submitted 13 May, 2022; originally announced May 2022.

  7. arXiv:2109.03466  [pdf, other]

    math.ST

    Multivariate, Heteroscedastic Empirical Bayes via Nonparametric Maximum Likelihood

    Authors: Jake A. Soloff, Adityanand Guntuboyina, Bodhisattva Sen

    Abstract: Multivariate, heteroscedastic errors complicate statistical inference in many large-scale denoising problems. Empirical Bayes is attractive in such settings, but standard parametric approaches rest on assumptions about the form of the prior distribution which can be hard to justify and which introduce unnecessary tuning parameters. We extend the nonparametric maximum likelihood estimator (NPMLE) f…

    Submitted 29 December, 2023; v1 submitted 8 September, 2021; originally announced September 2021.

  8. arXiv:2007.15252  [pdf, ps, other]

    math.ST

    Covariance estimation with nonnegative partial correlations

    Authors: Jake A. Soloff, Adityanand Guntuboyina, Michael I. Jordan

    Abstract: We study the problem of high-dimensional covariance estimation under the constraint that the partial correlations are nonnegative. The sign constraints dramatically simplify estimation: the Gaussian maximum likelihood estimator is well defined with only two observations regardless of the number of variables. We analyze its performance in the setting where the dimension may be much larger than the…

    Submitted 30 July, 2020; originally announced July 2020.

  9. arXiv:1812.04249  [pdf, other]

    math.ST math.PR

    Distribution-free properties of isotonic regression

    Authors: Jake A. Soloff, Adityanand Guntuboyina, Jim Pitman

    Abstract: It is well known that the isotonic least squares estimator is characterized as the derivative of the greatest convex minorant of a random walk. Provided the walk has exchangeable increments, we prove that the slopes of the greatest convex minorant are distributed as order statistics of the running averages. This result implies an exact non-asymptotic formula for the squared error risk of least squ…

    Submitted 11 December, 2018; originally announced December 2018.