Showing 51–75 of 75 results for author: Wager, S

  1. arXiv:1705.01677  [pdf, other]

    stat.ME

    Optimized Regression Discontinuity Designs

    Authors: Guido Imbens, Stefan Wager

    Abstract: The increasing popularity of regression discontinuity methods for causal inference in observational studies has led to a proliferation of different estimating strategies, most of which involve first fitting non-parametric regression models on both sides of a treatment assignment boundary and then reporting plug-in estimates for the effect of interest. In applications, however, it is often difficul…

    Submitted 7 June, 2018; v1 submitted 3 May, 2017; originally announced May 2017.

    Comments: Review of Economics and Statistics, forthcoming

  2. arXiv:1702.02896  [pdf, other]

    math.ST cs.LG econ.EM stat.ML

    Policy Learning with Observational Data

    Authors: Susan Athey, Stefan Wager

    Abstract: In many areas, practitioners seek to use observational data to learn a treatment assignment policy that satisfies application-specific constraints, such as budget, fairness, simplicity, or other functional form constraints. For example, policies may be restricted to take the form of decision trees based on a limited set of easily observable individual characteristics. We propose a new approach to…

    Submitted 4 September, 2020; v1 submitted 9 February, 2017; originally announced February 2017.

    Comments: Forthcoming in Econometrica. Original title: Efficient Policy Learning

  3. arXiv:1702.01250  [pdf, ps, other]

    stat.ME econ.EM

    Estimating Average Treatment Effects: Supplementary Analyses and Remaining Challenges

    Authors: Susan Athey, Guido Imbens, Thai Pham, Stefan Wager

    Abstract: There is a large literature on semiparametric estimation of average treatment effects under unconfounded treatment assignment in settings with a fixed number of covariates. More recently attention has focused on settings with a large number of covariates. In this paper we extend lessons from the earlier literature to this new setting. We propose that in addition to reporting point estimates and st…

    Submitted 4 February, 2017; originally announced February 2017.

    Comments: 9 pages

  4. arXiv:1610.01271  [pdf, other]

    stat.ME econ.EM stat.ML

    Generalized Random Forests

    Authors: Susan Athey, Julie Tibshirani, Stefan Wager

    Abstract: We propose generalized random forests, a method for non-parametric statistical estimation based on random forests (Breiman, 2001) that can be used to fit any quantity of interest identified as the solution to a set of local moment equations. Following the literature on local maximum likelihood estimation, our method considers a weighted set of nearby training examples; however, instead of using cl…

    Submitted 5 April, 2018; v1 submitted 5 October, 2016; originally announced October 2016.

    Comments: Forthcoming in the Annals of Statistics

  5. High-dimensional regression adjustments in randomized experiments

    Authors: Stefan Wager, Wenfei Du, Jonathan Taylor, Robert Tibshirani

    Abstract: We study the problem of treatment effect estimation in randomized experiments with high-dimensional covariate information, and show that essentially any risk-consistent regression adjustment can be used to obtain efficient estimates of the average treatment effect. Our results considerably extend the range of settings where high-dimensional regression adjustments are guaranteed to provide valid in…

    Submitted 27 October, 2016; v1 submitted 22 July, 2016; originally announced July 2016.

    Comments: To appear in the Proceedings of the National Academy of Sciences. The present draft does not reflect final copyediting by the PNAS staff

  6. arXiv:1604.07125  [pdf, other]

    stat.ME econ.EM math.ST

    Approximate Residual Balancing: De-Biased Inference of Average Treatment Effects in High Dimensions

    Authors: Susan Athey, Guido W. Imbens, Stefan Wager

    Abstract: There are many settings where researchers are interested in estimating average treatment effects and are willing to rely on the unconfoundedness assumption, which requires that the treatment assignment be as good as random conditional on pre-treatment variables. The unconfoundedness assumption is often more plausible if a large number of pre-treatment variables are included in the analysis, but th…

    Submitted 31 January, 2018; v1 submitted 25 April, 2016; originally announced April 2016.

    Comments: Forthcoming in the Journal of the Royal Statistical Society, Series B

  7. arXiv:1603.06340  [pdf, other]

    stat.ML

    Data Augmentation via Levy Processes

    Authors: Stefan Wager, William Fithian, Percy Liang

    Abstract: If a document is about travel, we may expect that short snippets of the document should also be about travel. We introduce a general framework for incorporating these types of invariances into a discriminative classifier. The framework imagines data as being drawn from a slice of a Levy process. If we slice the Levy process at an earlier point in time, we obtain additional pseudo-examples, which c…

    Submitted 21 March, 2016; originally announced March 2016.

  8. arXiv:1602.01206  [pdf, other]

    stat.AP stat.ME

    denoiseR: A Package for Low Rank Matrix Estimation

    Authors: Julie Josse, Sylvain Sardy, Stefan Wager

    Abstract: We introduce denoiseR, an R package that provides a unified implementation of several state-of-the-art proposals for regularized low rank matrix estimation, along with automatic selection of the regularization parameters. We also extend these methods to allow for missing values. The regularization schemes discussed in this paper are built around singular-value shrinkage and bootstrap-based stabili…

    Submitted 8 August, 2018; v1 submitted 3 February, 2016; originally announced February 2016.

  9. arXiv:1510.04342  [pdf, other]

    stat.ME math.ST stat.ML

    Estimation and Inference of Heterogeneous Treatment Effects using Random Forests

    Authors: Stefan Wager, Susan Athey

    Abstract: Many scientific and engineering challenges -- ranging from personalized medicine to customized marketing recommendations -- require an understanding of treatment effect heterogeneity. In this paper, we develop a non-parametric causal forest for estimating heterogeneous treatment effects that extends Breiman's widely used random forest algorithm. In the potential outcomes framework with unconfounde…

    Submitted 9 July, 2017; v1 submitted 14 October, 2015; originally announced October 2015.

    Comments: To appear in the Journal of the American Statistical Association. Part of the results developed in this paper were made available as an earlier technical report "Asymptotic Theory for Random Forests", available at (arXiv:1405.0352)

  10. arXiv:1508.01278  [pdf, other]

    stat.AP

    Teaching Statistics at Google Scale

    Authors: Nicholas Chamandy, Omkar Muralidharan, Stefan Wager

    Abstract: Modern data and applications pose very different challenges from those of the 1950s or even the 1980s. Students contemplating a career in statistics or data science need to have the tools to tackle problems involving massive, heavy-tailed data, often interacting with live, complex systems. However, despite the deepening connections between engineering and modern data science, we argue that trainin…

    Submitted 16 August, 2015; v1 submitted 6 August, 2015; originally announced August 2015.

    Comments: To appear in The American Statistician

  11. arXiv:1507.03003  [pdf, other]

    math.ST stat.ML

    High-Dimensional Asymptotics of Prediction: Ridge Regression and Classification

    Authors: Edgar Dobriban, Stefan Wager

    Abstract: We provide a unified analysis of the predictive risk of ridge regression and regularized discriminant analysis in a dense random effects model. We work in a high-dimensional asymptotic regime where $p, n \to \infty$ and $p/n \to \gamma \in (0, \, \infty)$, and allow for arbitrary covariance among the features. For both methods, we provide an explicit and efficiently computable expression for the limitin…

    Submitted 4 November, 2015; v1 submitted 10 July, 2015; originally announced July 2015.

    Comments: Added a section on prediction versus estimation for ridge regression. Rewrote introduction. Other results unchanged

  12. arXiv:1507.00832  [pdf, other]

    math.ST

    The Efficiency of Density Deconvolution

    Authors: Stefan Wager

    Abstract: The density deconvolution problem involves recovering a target density g from a sample that has been corrupted by noise. From the perspective of Le Cam's local asymptotic normality theory, we show that non-parametric density deconvolution with Gaussian noise behaves similarly to a low-dimensional parametric problem that can easily be solved by maximum likelihood. This framework allows us to give a…

    Submitted 3 July, 2015; originally announced July 2015.

  13. arXiv:1503.06388  [pdf, other]

    math.ST stat.ML

    Adaptive Concentration of Regression Trees, with Application to Random Forests

    Authors: Stefan Wager, Guenther Walther

    Abstract: We study the convergence of the predictive surface of regression trees and forests. To support our analysis we introduce a notion of adaptive concentration for regression trees. This approach breaks tree training into a model selection phase in which we pick the tree splits, followed by a model fitting phase where we find the best regression model consistent with these splits. We then show that th…

    Submitted 30 April, 2016; v1 submitted 22 March, 2015; originally announced March 2015.

  14. arXiv:1412.4182  [pdf, other]

    math.ST cs.LG stat.ML

    The Statistics of Streaming Sparse Regression

    Authors: Jacob Steinhardt, Stefan Wager, Percy Liang

    Abstract: We present a sparse analogue to stochastic gradient descent that is guaranteed to perform well under similar conditions to the lasso. In the linear regression setup with irrepresentable noise features, our algorithm recovers the support set of the optimal parameter vector with high probability, and achieves a statistically quasi-optimal rate of convergence of Op(k log(d)/T), where k is the sparsit…

    Submitted 12 December, 2014; originally announced December 2014.

  15. arXiv:1410.8275  [pdf, other]

    stat.ME cs.LG stat.ML

    Bootstrap-Based Regularization for Low-Rank Matrix Estimation

    Authors: Julie Josse, Stefan Wager

    Abstract: We develop a flexible framework for low-rank matrix estimation that allows us to transform noise models into regularization schemes via a simple bootstrap algorithm. Effectively, our procedure seeks an autoencoding basis for the observed matrix that is stable with respect to the specified noise model; we call the resulting procedure a stable autoencoder. In the simplest case, with an isotropic noi…

    Submitted 28 June, 2016; v1 submitted 30 October, 2014; originally announced October 2014.

    Comments: To appear in the Journal of Machine Learning Research

  16. arXiv:1407.7614  [pdf, other]

    stat.ME

    Confidence Areas for Fixed-Effects PCA

    Authors: Julie Josse, Stefan Wager, François Husson

    Abstract: PCA is often used to visualize data when the rows and the columns are both of interest. In such a setting there is a lack of inferential methods on the PCA output. We study the asymptotic variance of a fixed-effects model for PCA, and propose several approaches to assessing the variability of PCA estimates: a method based on a parametric bootstrap, a new cell-wise jackknife, as well as a computati…

    Submitted 28 July, 2014; originally announced July 2014.

  17. arXiv:1407.3289  [pdf, other]

    stat.ML cs.LG math.ST

    Altitude Training: Strong Bounds for Single-Layer Dropout

    Authors: Stefan Wager, William Fithian, Sida Wang, Percy Liang

    Abstract: Dropout training, originally designed for deep neural networks, has been successful on high-dimensional single-layer natural language tasks. This paper proposes a theoretical explanation for this phenomenon: we show that, under a generative Poisson topic model with long documents, dropout training improves the exponent in the generalization bound for empirical risk minimization. Dropout achieves t…

    Submitted 31 October, 2014; v1 submitted 11 July, 2014; originally announced July 2014.

    Comments: Advances in Neural Information Processing Systems (NIPS), 2014

  18. arXiv:1405.0352  [pdf, other]

    math.ST stat.ML

    Asymptotic Theory for Random Forests

    Authors: Stefan Wager

    Abstract: Random forests have proven to be reliable predictive algorithms in many application areas. Not much is known, however, about the statistical properties of random forests. Several authors have established conditions under which their predictions are consistent, but these results do not provide practical estimates of random forest errors. In this paper, we analyze a random forest model based on subs…

    Submitted 3 May, 2016; v1 submitted 2 May, 2014; originally announced May 2014.

    Comments: This manuscript is superseded by "Estimation and Inference of Heterogeneous Treatment Effects using Random Forests" by Wager and Athey (arXiv:1510.04342). The new paper extends the asymptotic theory developed here, and applies it to causal inference in the potential outcomes framework with unconfoundedness. The present version is maintained online for archival purposes only

  19. arXiv:1311.4555  [pdf, other]

    stat.ML stat.CO stat.ME

    Confidence Intervals for Random Forests: The Jackknife and the Infinitesimal Jackknife

    Authors: Stefan Wager, Trevor Hastie, Bradley Efron

    Abstract: We study the variability of predictions made by bagged learners and random forests, and show how to estimate standard errors for these methods. Our work builds on variance estimates for bagging proposed by Efron (1992, 2012) that are based on the jackknife and the infinitesimal jackknife (IJ). In practice, bagged predictors are computed using a finite number B of bootstrap replicates, and working…

    Submitted 28 March, 2014; v1 submitted 18 November, 2013; originally announced November 2013.

    Comments: To appear in Journal of Machine Learning Research (JMLR)

  20. arXiv:1310.2931  [pdf, other]

    stat.ME cs.LG stat.ML

    Feedback Detection for Live Predictors

    Authors: Stefan Wager, Nick Chamandy, Omkar Muralidharan, Amir Najmi

    Abstract: A predictor that is deployed in a live production system may perturb the features it uses to make predictions. Such a feedback loop can occur, for example, when a model that predicts a certain type of behavior ends up causing the behavior it predicts, thus creating a self-fulfilling prophecy. In this paper we analyze predictor feedback detection as a causal inference problem, and introduce a local…

    Submitted 31 October, 2014; v1 submitted 10 October, 2013; originally announced October 2013.

    Comments: Advances in Neural Information Processing Systems (NIPS), 2014

  21. arXiv:1310.1363  [pdf, ps, other]

    stat.ML cs.LG

    Weakly supervised clustering: Learning fine-grained signals from coarse labels

    Authors: Stefan Wager, Alexander Blocker, Niall Cardin

    Abstract: Consider a classification problem where we do not have access to labels for individual training examples, but only have average labels over subpopulations. We give practical examples of this setup and show how such a classification task can usefully be analyzed as a weakly supervised clustering problem. We propose three approaches to solving the weakly supervised clustering problem, including a la…

    Submitted 15 September, 2015; v1 submitted 4 October, 2013; originally announced October 2013.

    Comments: Published at http://dx.doi.org/10.1214/15-AOAS812 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Report number: IMS-AOAS-AOAS812

    Journal ref: Annals of Applied Statistics 2015, Vol. 9, No. 2, 801-820

  22. arXiv:1309.5352  [pdf, other]

    math.ST stat.ME

    Sequential Selection Procedures and False Discovery Rate Control

    Authors: Max Grazier G'Sell, Stefan Wager, Alexandra Chouldechova, Robert Tibshirani

    Abstract: We consider a multiple hypothesis testing setting where the hypotheses are ordered and one is only permitted to reject an initial contiguous block, $H_1,\dots,H_k$, of hypotheses. A rejection rule in this setting amounts to a procedure for choosing the stopping point k. This setting is inspired by the sequential nature of many model selection problems, where choosing a stopping point or a model is e…

    Submitted 23 March, 2015; v1 submitted 20 September, 2013; originally announced September 2013.

    Comments: 31 pages, 14 figures. Accepted to the Journal of the Royal Statistical Society: Series B

  23. arXiv:1307.7830  [pdf, other]

    stat.ME

    Semiparametric Exponential Families for Heavy-Tailed Data

    Authors: William Fithian, Stefan Wager

    Abstract: We propose a semiparametric method for fitting the tail of a heavy-tailed population given a relatively small sample from that population and a larger sample from a related background population. We model the tail of the small sample as an exponential tilt of the better-observed large-sample tail, using a robust sufficient statistic motivated by extreme value theory. In particular, our method indu…

    Submitted 19 October, 2014; v1 submitted 30 July, 2013; originally announced July 2013.

    Comments: To appear in Biometrika

    MSC Class: 62G32; 62G35 (Primary) 62G20 (Secondary)

  24. arXiv:1307.1493  [pdf, other]

    stat.ML cs.LG stat.ME

    Dropout Training as Adaptive Regularization

    Authors: Stefan Wager, Sida Wang, Percy Liang

    Abstract: Dropout and other feature noising schemes control overfitting by artificially corrupting the training data. For generalized linear models, dropout performs a form of adaptive regularization. Using this viewpoint, we show that the dropout regularizer is first-order equivalent to an L2 regularizer applied after scaling the features by an estimate of the inverse diagonal Fisher information matrix. We…

    Submitted 1 November, 2013; v1 submitted 4 July, 2013; originally announced July 2013.

    Comments: 11 pages. Advances in Neural Information Processing Systems (NIPS), 2013

  25. arXiv:1204.0316  [pdf, other]

    stat.ME math.ST

    Subsampling Extremes: From Block Maxima to Smooth Tail Estimation

    Authors: Stefan Wager

    Abstract: We study a new estimator for the tail index of a distribution in the Frechet domain of attraction that arises naturally by computing subsample maxima. This estimator is equivalent to taking a U-statistic over a Hill estimator with two order statistics. The estimator presents multiple advantages over the Hill estimator. In particular, it has asymptotically smooth sample paths as a function of the t…

    Submitted 19 October, 2014; v1 submitted 2 April, 2012; originally announced April 2012.

    Comments: Added references