Showing 1–12 of 12 results for author: G'Sell, M

Searching in archive stat.
  1. arXiv:2106.07623  [pdf, other]

    stat.AP stat.ME

    Inference with generalizable classifier predictions

    Authors: Ciaran Evans, Zara Y. Weinberg, Manojkumar A. Puthenveedu, Max G'Sell

    Abstract: This paper addresses the problem of making statistical inference about a population that can only be identified through classifier predictions. The problem is motivated by scientific studies in which human labels of a population are replaced by a classifier. For downstream analysis of the population based on classifier predictions to be sound, the predictions must generalize equally across experim…

    Submitted 14 June, 2021; originally announced June 2021.

    Comments: 26 pages, 9 figures

  2. arXiv:2009.08592  [pdf, other]

    stat.ME stat.ML

    Sequential changepoint detection in classification data under label shift

    Authors: Ciaran Evans, Max G'Sell

    Abstract: Classifier predictions often rely on the assumption that new observations come from the same distribution as training data. When the underlying distribution changes, so does the optimal classification rule, and performance may degrade. We consider the problem of detecting such a change in distribution in sequentially-observed, unlabeled classification data. We focus on label shift changes to the d…

    Submitted 31 August, 2021; v1 submitted 17 September, 2020; originally announced September 2020.

    Comments: 25 pages, 3 figures, 4 tables

  3. arXiv:2003.13808  [pdf, other]

    stat.ME cs.CY

    Fairness Evaluation in Presence of Biased Noisy Labels

    Authors: Riccardo Fogliato, Max G'Sell, Alexandra Chouldechova

    Abstract: Risk assessment tools are widely used around the country to inform decision making within the criminal justice system. Recently, considerable attention has been devoted to the question of whether such tools may suffer from racial bias. In this type of assessment, a fundamental issue is that the training and evaluation of the model is based on a variable (arrest) that may represent a noisy version…

    Submitted 30 March, 2020; originally announced March 2020.

    Comments: Accepted at International Conference on Artificial Intelligence and Statistics (AISTATS), 2020

  4. arXiv:1812.03644  [pdf, other]

    stat.ME

    Post-Selection Inference for Changepoint Detection Algorithms with Application to Copy Number Variation Data

    Authors: Sangwon Hyun, Kevin Lin, Max G'Sell, Ryan J. Tibshirani

    Abstract: Changepoint detection methods are used in many areas of science and engineering, e.g., in the analysis of copy number variation data, to detect abnormalities in copy numbers along the genome. Despite the broad array of available tools, methodology for quantifying our uncertainty in the strength (or presence) of given changepoints, post-detection, are lacking. Post-selection inference offers a fram…

    Submitted 10 December, 2018; originally announced December 2018.

  5. arXiv:1801.03635  [pdf, other]

    stat.ME

    Sharp instruments for classifying compliers and generalizing causal effects

    Authors: Edward H. Kennedy, Sivaraman Balakrishnan, Max G'Sell

    Abstract: It is well-known that, without restricting treatment effect heterogeneity, instrumental variable (IV) methods only identify "local" effects among compliers, i.e., those subjects who take treatment only when encouraged by the IV. Local effects are controversial since they seem to only apply to an unidentified subgroup; this has led many to denounce these effects as having little policy relevance. H…

    Submitted 30 May, 2019; v1 submitted 11 January, 2018; originally announced January 2018.

  6. arXiv:1707.00046  [pdf, other]

    stat.AP cs.CY stat.ML

    Fairer and more accurate, but for whom?

    Authors: Alexandra Chouldechova, Max G'Sell

    Abstract: Complex statistical machine learning models are increasingly being used or considered for use in high-stakes decision-making pipelines in domains such as financial services, health care, criminal justice and human services. These models are often investigated as possible improvements over more classical tools such as regression models or human judgement. While the modeling approach may be new, the…

    Submitted 30 June, 2017; originally announced July 2017.

    Comments: Presented as a poster at the 2017 Workshop on Fairness, Accountability, and Transparency in Machine Learning (FAT/ML 2017)

  7. arXiv:1606.03552  [pdf, other]

    stat.ME

    Exact Post-Selection Inference for Changepoint Detection and Other Generalized Lasso Problems

    Authors: Sangwon Hyun, Max G'Sell, Ryan J. Tibshirani

    Abstract: We study tools for inference conditioned on model selection events that are defined by the generalized lasso regularization path. The generalized lasso estimate is given by the solution of a penalized least squares regression problem, where the penalty is the l1 norm of a matrix D times the coefficient vector. The generalized lasso path collects these estimates for a range of penalty parameter (λ)…

    Submitted 11 June, 2016; originally announced June 2016.

  8. arXiv:1604.04173  [pdf, other]

    stat.ME math.ST stat.ML

    Distribution-Free Predictive Inference For Regression

    Authors: Jing Lei, Max G'Sell, Alessandro Rinaldo, Ryan J. Tibshirani, Larry Wasserman

    Abstract: We develop a general framework for distribution-free predictive inference in regression, using conformal inference. The proposed methodology allows for the construction of a prediction band for the response variable using any estimator of the regression function. The resulting prediction band preserves the consistency properties of the original estimator under standard assumptions, while guarantee…

    Submitted 8 March, 2017; v1 submitted 14 April, 2016; originally announced April 2016.

    Comments: 50 pages, 7 figures, 3 tables

  9. arXiv:1309.5352  [pdf, other]

    math.ST stat.ME

    Sequential Selection Procedures and False Discovery Rate Control

    Authors: Max Grazier G'Sell, Stefan Wager, Alexandra Chouldechova, Robert Tibshirani

    Abstract: We consider a multiple hypothesis testing setting where the hypotheses are ordered and one is only permitted to reject an initial contiguous block, $H_1,\dots,H_k$, of hypotheses. A rejection rule in this setting amounts to a procedure for choosing the stopping point k. This setting is inspired by the sequential nature of many model selection problems, where choosing a stopping point or a model is e…

    Submitted 23 March, 2015; v1 submitted 20 September, 2013; originally announced September 2013.

    Comments: 31 pages, 14 figures. Accepted to the Journal of the Royal Statistical Society: Series B

  10. arXiv:1308.2329  [pdf, other]

    stat.ME

    Sensitivity Analysis for Inference with Partially Identifiable Covariance Matrices

    Authors: Max Grazier G'Sell, Shai S. Shen-Orr, Robert Tibshirani

    Abstract: In some multivariate problems with missing data, pairs of variables exist that are never observed together. For example, some modern biological tools can produce data of this form. As a result of this structure, the covariance matrix is only partially identifiable, and point estimation requires that identifying assumptions be made. These assumptions can introduce an unknown and potentially large b…

    Submitted 10 August, 2013; originally announced August 2013.

    Comments: 19 pages, 8 figures. Submitted to Computational Statistics

  11. arXiv:1307.4765  [pdf, other]

    math.ST stat.ME

    Adaptive testing for the graphical lasso

    Authors: Max Grazier G'Sell, Jonathan Taylor, Robert Tibshirani

    Abstract: We consider tests of significance in the setting of the graphical lasso for inverse covariance matrix estimation. We propose a simple test statistic based on a subsequence of the knots in the graphical lasso path. We show that this statistic has an exponential asymptotic null distribution, under the null hypothesis that the model contains the true connected components. Though the null distributi…

    Submitted 22 July, 2013; v1 submitted 17 July, 2013; originally announced July 2013.

    Comments: 33 pages, 8 figures. Submitted to Annals of Statistics

    MSC Class: 62F12; 62H15

  12. arXiv:1302.2303  [pdf, other]

    stat.ME

    False Variable Selection Rates in Regression

    Authors: Max Grazier G'Sell, Trevor Hastie, Robert Tibshirani

    Abstract: There has been recent interest in extending the ideas of False Discovery Rates (FDR) to variable selection in regression settings. Traditionally the FDR in these settings has been defined in terms of the coefficients of the full regression model. Recent papers have struggled with controlling this quantity when the predictors are correlated. This paper shows that this full model definition of FDR s…

    Submitted 10 February, 2013; originally announced February 2013.

    Comments: 14 figures, 21 pages. Submitted to Annals of Applied Statistics