Showing 1–10 of 10 results for author: Crawford, L

  1. arXiv:2308.14249  [pdf, other]

    stat.ME

    Statistical Inference on Grayscale Images via the Euler-Radon Transform

    Authors: Kun Meng, Mattie Ji, Jinyu Wang, Kexin Ding, Henry Kirveslahti, Ani Eloyan, Lorin Crawford

    Abstract: Tools from topological data analysis have been widely used to represent binary images in many scientific applications. Methods that aim to represent grayscale images (i.e., where pixel intensities instead take on continuous values) have been relatively underdeveloped. In this paper, we introduce the Euler-Radon transform, which generalizes the Euler characteristic transform to grayscale images by…

    Submitted 27 August, 2023; originally announced August 2023.

    Comments: 85 pages, 9 figures

  2. arXiv:2306.11839  [pdf, other]

    stat.ME cs.LG stat.AP stat.ML

    Should I Stop or Should I Go: Early Stopping with Heterogeneous Populations

    Authors: Hammaad Adam, Fan Yin, Huibin Hu, Neil Tenenholtz, Lorin Crawford, Lester Mackey, Allison Koenecke

    Abstract: Randomized experiments often need to be stopped prematurely due to the treatment having an unintended harmful effect. Existing methods that determine when to stop an experiment early are typically applied to the data in aggregate and do not account for treatment effect heterogeneity. In this paper, we study the early stopping of experiments for harm on heterogeneous populations. We first establish…

    Submitted 27 October, 2023; v1 submitted 20 June, 2023; originally announced June 2023.

    Comments: NeurIPS 2023 (spotlight)

  3. arXiv:2302.02024  [pdf, other]

    stat.ME

    A Simple Approach for Local and Global Variable Importance in Nonlinear Regression Models

    Authors: Emily T. Winn-Nuñez, Maryclare Griffin, Lorin Crawford

    Abstract: The ability to interpret machine learning models has become increasingly important as their usage in data science continues to rise. Most current interpretability methods are optimized to work on either (i) a global scale, where the goal is to rank features based on their contributions to overall variation in an observed population, or (ii) the local level, which aims to detail o…

    Submitted 10 August, 2023; v1 submitted 3 February, 2023; originally announced February 2023.

  4. Randomness of Shapes and Statistical Inference on Shapes via the Smooth Euler Characteristic Transform

    Authors: Kun Meng, Jinyu Wang, Lorin Crawford, Ani Eloyan

    Abstract: In this article, we establish the mathematical foundations for modeling the randomness of shapes and conducting statistical inference on shapes using the smooth Euler characteristic transform. Based on these foundations, we propose two chi-squared statistic-based algorithms for testing hypotheses on random shapes. Simulation studies are presented to validate our mathematical derivations and to com…

    Submitted 23 May, 2024; v1 submitted 27 April, 2022; originally announced April 2022.

    Comments: 110 pages

    Journal ref: Journal of the American Statistical Association, 2024

  5. arXiv:2104.03088  [pdf]

    stat.ML cs.LG stat.ME

    Hollow-tree Super: a directional and scalable approach for feature importance in boosted tree models

    Authors: Stephane Doyen, Hugh Taylor, Peter Nicholas, Lewis Crawford, Isabella Young, Michael Sughrue

    Abstract: Current limitations in boosted tree modelling prevent the effective scaling to datasets with a large feature number, particularly when investigating the magnitude and directionality of various features on classification. We present a novel methodology, Hollow-tree Super (HOTS), to resolve and visualize feature importance in boosted tree models involving a large number of features. Further, HOTS al…

    Submitted 7 April, 2021; originally announced April 2021.

    Comments: 28 pages, 1 table, 7 figures, PDF format - Submitted to PLOSONE pending review

    MSC Class: 62-08

  6. arXiv:2007.10389  [pdf, other]

    stat.ML cs.LG

    Generalizing Variational Autoencoders with Hierarchical Empirical Bayes

    Authors: Wei Cheng, Gregory Darnell, Sohini Ramachandran, Lorin Crawford

    Abstract: Variational Autoencoders (VAEs) have experienced recent success as data-generating models by using simple architectures that do not require significant fine-tuning of hyperparameters. However, VAEs are known to suffer from over-regularization which can lead to failure to escape local maxima. This phenomenon, known as posterior collapse, prevents learning a meaningful latent encoding of the data. R…

    Submitted 20 July, 2020; originally announced July 2020.

    Comments: 13 pages

  7. arXiv:1901.09839  [pdf, other]

    stat.ML cs.LG

    Interpreting Deep Neural Networks Through Variable Importance

    Authors: Jonathan Ish-Horowicz, Dana Udwin, Seth Flaxman, Sarah Filippi, Lorin Crawford

    Abstract: While the success of deep neural networks (DNNs) is well-established across a variety of domains, our ability to explain and interpret these methods is limited. Unlike previously proposed local methods which try to explain particular classification decisions, we focus on global interpretability and ask a universally applicable question: given a trained model, which features are the most important?…

    Submitted 28 April, 2020; v1 submitted 28 January, 2019; originally announced January 2019.

  8. arXiv:1801.07318  [pdf, other]

    stat.ME q-bio.QM stat.AP stat.ML

    Variable Prioritization in Nonlinear Black Box Methods: A Genetic Association Case Study

    Authors: Lorin Crawford, Seth R. Flaxman, Daniel E. Runcie, Mike West

    Abstract: The central aim in this paper is to address variable selection questions in nonlinear and nonparametric regression. Motivated by statistical genetics, where nonlinear interactions are of particular interest, we introduce a novel and interpretable way to summarize the relative importance of predictor variables. Methodologically, we develop the "RelATive cEntrality" (RATE) measure to prioritize cand…

    Submitted 26 August, 2018; v1 submitted 22 January, 2018; originally announced January 2018.

    Comments: 28 pages, 5 figures, 1 table; Supplementary Material

  9. Predicting Clinical Outcomes in Glioblastoma: An Application of Topological and Functional Data Analysis

    Authors: Lorin Crawford, Anthea Monod, Andrew X. Chen, Sayan Mukherjee, Raúl Rabadán

    Abstract: Glioblastoma multiforme (GBM) is an aggressive form of human brain cancer that is under active study in the field of cancer biology. Its rapid progression and the relative time cost of obtaining molecular data make other readily-available forms of data, such as images, an important resource for actionable measures in patients. Our goal is to utilize information given by medical images taken from G…

    Submitted 12 September, 2019; v1 submitted 21 November, 2016; originally announced November 2016.

    Comments: 30 pages, 9 figures, 1 table

    Journal ref: Journal of the American Statistical Association (2019)

  10. arXiv:1508.01217  [pdf, other]

    stat.ME q-bio.QM stat.AP stat.ML

    Bayesian Approximate Kernel Regression with Variable Selection

    Authors: Lorin Crawford, Kris C. Wood, Xiang Zhou, Sayan Mukherjee

    Abstract: Nonlinear kernel regression models are often used in statistics and machine learning because they are more accurate than linear models. Variable selection for kernel regression models is a challenge partly because, unlike the linear regression setting, there is no clear concept of an effect size for regression coefficients. In this paper, we propose a novel framework that provides an effect size a…

    Submitted 9 June, 2017; v1 submitted 5 August, 2015; originally announced August 2015.

    Comments: 22 pages, 3 figures, 3 tables; theory added; new simulations presented; references added