Showing 1–32 of 32 results for author: Miller, A

Searching in archive stat.
  1. arXiv:2402.14959  [pdf, other]

    stat.AP cs.CY stat.ML

    A Causal Framework to Evaluate Racial Bias in Law Enforcement Systems

    Authors: Jessy Xinyi Han, Andrew Miller, S. Craig Watkins, Christopher Winship, Fotini Christia, Devavrat Shah

    Abstract: We are interested in developing a data-driven method to evaluate race-induced biases in law enforcement systems. While the recent works have addressed this question in the context of police-civilian interactions using police stop data, they have two key limitations. First, bias can only be properly quantified if true criminality is accounted for in addition to race, but it is absent in prior works…

    Submitted 20 March, 2024; v1 submitted 22 February, 2024; originally announced February 2024.

  2. arXiv:2310.18261  [pdf, other]

    stat.ME stat.ML

    Label Shift Estimators for Non-Ignorable Missing Data

    Authors: Andrew C. Miller, Joseph Futoma

    Abstract: We consider the problem of estimating the mean of a random variable Y subject to non-ignorable missingness, i.e., where the missingness mechanism depends on Y. We connect the auxiliary proxy variable framework for non-ignorable missingness (West and Little, 2013) to the label shift setting (Saerens et al., 2002). Exploiting this connection, we construct an estimator for non-ignorable missing data…

    Submitted 27 October, 2023; originally announced October 2023.

    Comments: 8 pages, 5 figures

  3. arXiv:2307.13918  [pdf, other]

    stat.ML cs.LG q-bio.QM

    Simulation-based Inference for Cardiovascular Models

    Authors: Antoine Wehenkel, Jens Behrmann, Andrew C. Miller, Guillermo Sapiro, Ozan Sener, Marco Cuturi, Jörn-Henrik Jacobsen

    Abstract: Over the past decades, hemodynamics simulators have steadily evolved and have become tools of choice for studying cardiovascular systems in-silico. While such tools are routinely used to simulate whole-body hemodynamics from physiological parameters, solving the corresponding inverse problem of mapping waveforms back to plausible physiological parameters remains both promising and challenging. Mot…

    Submitted 29 July, 2023; v1 submitted 25 July, 2023; originally announced July 2023.

  4. arXiv:2202.06797  [pdf, other]

    astro-ph.GA stat.AP

    Mapping Interstellar Dust with Gaussian Processes

    Authors: Andrew C. Miller, Lauren Anderson, Boris Leistedt, John P. Cunningham, David W. Hogg, David M. Blei

    Abstract: Interstellar dust corrupts nearly every stellar observation, and accounting for it is crucial to measuring physical properties of stars. We model the dust distribution as a spatially varying latent field with a Gaussian process (GP) and develop a likelihood model and inference method that scales to millions of astronomical observations. Modeling interstellar dust is complicated by two factors. The…

    Submitted 14 February, 2022; originally announced February 2022.

  5. arXiv:2112.00881  [pdf, other]

    cs.LG stat.ML

    Learning Invariant Representations with Missing Data

    Authors: Mark Goldstein, Jörn-Henrik Jacobsen, Olina Chau, Adriel Saporta, Aahlad Puli, Rajesh Ranganath, Andrew C. Miller

    Abstract: Spurious correlations allow flexible models to predict well during training but poorly on related test distributions. Recent work has shown that models that satisfy particular independencies involving correlation-inducing \textit{nuisance} variables have guarantees on their test performance. Enforcing such independencies requires nuisances to be observed during training. However, nuisances, such a…

    Submitted 8 June, 2022; v1 submitted 1 December, 2021; originally announced December 2021.

    Comments: CLeaR (Causal Learning and Reasoning) 2022

  6. arXiv:2104.12231  [pdf, other]

    stat.ML cs.LG stat.AP stat.ME

    Model-based metrics: Sample-efficient estimates of predictive model subpopulation performance

    Authors: Andrew C. Miller, Leon A. Gatys, Joseph Futoma, Emily B. Fox

    Abstract: Machine learning models $-$ now commonly developed to screen, diagnose, or predict health conditions $-$ are evaluated with a variety of performance metrics. An important first step in assessing the practical utility of a model is to evaluate its average performance over an entire population of interest. In many settings, it is also critical that the model makes good predictions within predefined…

    Submitted 25 April, 2021; originally announced April 2021.

    Comments: 27 pages, 8 figures

  7. arXiv:2104.12219  [pdf, other]

    stat.ML cs.LG stat.ME

    Breiman's two cultures: You don't have to choose sides

    Authors: Andrew C. Miller, Nicholas J. Foti, Emily B. Fox

    Abstract: Breiman's classic paper casts data analysis as a choice between two cultures: data modelers and algorithmic modelers. Stated broadly, data modelers use simple, interpretable models with well-understood theoretical properties to analyze data. Algorithmic modelers prioritize predictive accuracy and use more flexible function approximations to analyze data. This dichotomy overlooks a third set of mod…

    Submitted 25 April, 2021; originally announced April 2021.

    Comments: Commentary to appear in a special issue of Observational Studies, discussing Leo Breiman's paper "Statistical Modeling: The Two Cultures" (https://doi.org/10.1214/ss/1009213726)

  8. arXiv:2103.01992  [pdf, other]

    cs.LG stat.AP stat.ME

    Improving Neural Networks for Time Series Forecasting using Data Augmentation and AutoML

    Authors: Indrajeet Y. Javeri, Mohammadhossein Toutiaee, Ismailcem B. Arpinar, Tom W. Miller, John A. Miller

    Abstract: Statistical methods such as the Box-Jenkins method for time-series forecasting have been prominent since their development in 1970. Many researchers rely on such models as they can be efficiently estimated and also provide interpretability. However, advances in machine learning research indicate that neural networks can be powerful data modeling techniques, as they can give higher accuracy for a p…

    Submitted 7 May, 2021; v1 submitted 2 March, 2021; originally announced March 2021.

  9. arXiv:2103.00393  [pdf, other]

    cs.LG stat.ML

    Hierarchical Inducing Point Gaussian Process for Inter-domain Observations

    Authors: Luhuan Wu, Andrew Miller, Lauren Anderson, Geoff Pleiss, David Blei, John Cunningham

    Abstract: We examine the general problem of inter-domain Gaussian Processes (GPs): problems where the GP realization and the noisy observations of that realization lie on different domains. When the mapping between those domains is linear, such as integration or differentiation, inference is still closed form. However, many of the scaling and approximation techniques that our community has developed do not…

    Submitted 24 June, 2021; v1 submitted 27 February, 2021; originally announced March 2021.

  10. arXiv:2012.00110  [pdf, other]

    stat.ML cs.LG stat.AP

    Representing and Denoising Wearable ECG Recordings

    Authors: Jeffrey Chan, Andrew C. Miller, Emily B. Fox

    Abstract: Modern wearable devices are embedded with a range of noninvasive biomarker sensors that hold promise for improving detection and treatment of disease. One such sensor is the single-lead electrocardiogram (ECG) which measures electrical signals in the heart. The benefits of the sheer volume of ECG measurements with rich longitudinal structure made possible by wearables come at the price of potentia…

    Submitted 30 November, 2020; originally announced December 2020.

    Comments: ML for Mobile Health Workshop, NeurIPS 2020

  11. arXiv:2009.07330  [pdf, other]

    physics.comp-ph cs.LG math.OC physics.plasm-ph stat.ML

    Training neural networks under physical constraints using a stochastic augmented Lagrangian approach

    Authors: Alp Dener, Marco Andres Miller, Randy Michael Churchill, Todd Munson, Choong-Seock Chang

    Abstract: We investigate the physics-constrained training of an encoder-decoder neural network for approximating the Fokker-Planck-Landau collision operator in the 5-dimensional kinetic fusion simulation in XGC. To train this network, we propose a stochastic augmented Lagrangian approach that utilizes pyTorch's native stochastic gradient descent method to solve the inner unconstrained minimization subproble…

    Submitted 15 September, 2020; originally announced September 2020.

  12. arXiv:2008.02852  [pdf, other]

    stat.ML cs.LG stat.AP

    Learning Insulin-Glucose Dynamics in the Wild

    Authors: Andrew C. Miller, Nicholas J. Foti, Emily Fox

    Abstract: We develop a new model of insulin-glucose dynamics for forecasting blood glucose in type 1 diabetics. We augment an existing biomedical model by introducing time-varying dynamics driven by a machine learning sequence model. Our model maintains a physiologically plausible inductive bias and clinically interpretable parameters -- e.g., insulin sensitivity -- while inheriting the flexibility of moder…

    Submitted 6 August, 2020; originally announced August 2020.

    Comments: Machine Learning for Healthcare 2020

  13. arXiv:2006.13760  [pdf, other]

    cs.LG cs.AI cs.CL cs.NE stat.ML

    The NetHack Learning Environment

    Authors: Heinrich Küttler, Nantas Nardelli, Alexander H. Miller, Roberta Raileanu, Marco Selvatici, Edward Grefenstette, Tim Rocktäschel

    Abstract: Progress in Reinforcement Learning (RL) algorithms goes hand-in-hand with the development of challenging environments that test the limits of current methods. While existing RL environments are either sufficiently complex or based on fast simulation, they are rarely both. Here, we present the NetHack Learning Environment (NLE), a scalable, procedurally generated, stochastic, rich, and challenging…

    Submitted 1 December, 2020; v1 submitted 24 June, 2020; originally announced June 2020.

    Comments: 28 pages. Accepted at NeurIPS 2020

  14. Simulation-free estimation of an individual-based SEIR model for evaluating nonpharmaceutical interventions with an application to COVID-19 in Iowa

    Authors: Daniel K. Sewell, Aaron Miller

    Abstract: The ongoing COVID-19 pandemic has overwhelmingly demonstrated the need to accurately evaluate the effects of implementing new or altering existing nonpharmaceutical interventions. Since these interventions applied at the societal level cannot be evaluated through traditional experimental means, public health officials and other decision makers must rely on statistical and mathematical epidemiologi…

    Submitted 2 November, 2020; v1 submitted 18 May, 2020; originally announced May 2020.

  15. arXiv:2005.02157  [pdf, other]

    cs.CV cs.LG stat.ML

    Stereotype-Free Classification of Fictitious Faces

    Authors: Mohammadhossein Toutiaee, Soheyla Amirian, John A. Miller, Sheng Li

    Abstract: Equal Opportunity and Fairness are receiving increasing attention in artificial intelligence. Stereotyping is another source of discrimination, which yet has been unstudied in literature. GAN-made faces would be exposed to such discrimination, if they are classified by human perception. It is possible to eliminate the human impact on fictitious faces classification task by the use of statistical a…

    Submitted 29 April, 2020; originally announced May 2020.

  16. arXiv:2003.05822  [pdf, other]

    cs.LG stat.ML

    Topological Effects on Attacks Against Vertex Classification

    Authors: Benjamin A. Miller, Mustafa Çamurcu, Alexander J. Gomez, Kevin Chan, Tina Eliassi-Rad

    Abstract: Vertex classification is vulnerable to perturbations of both graph topology and vertex attributes, as shown in recent research. As in other machine learning domains, concerns about robustness to adversarial manipulation can prevent potential users from adopting proposed methods when the consequence of action is very high. This paper considers two topological characteristics of graphs and explores…

    Submitted 12 March, 2020; originally announced March 2020.

    Comments: 17 pages, 11 figures

  17. arXiv:1910.04054  [pdf, other]

    cs.LG cs.DC cs.NI stat.ML

    MVFST-RL: An Asynchronous RL Framework for Congestion Control with Delayed Actions

    Authors: Viswanath Sivakumar, Olivier Delalleau, Tim Rocktäschel, Alexander H. Miller, Heinrich Küttler, Nantas Nardelli, Mike Rabbat, Joelle Pineau, Sebastian Riedel

    Abstract: Effective network congestion control strategies are key to keeping the Internet (or any large computer network) operational. Network congestion control has been dominated by hand-crafted heuristics for decades. Recently, Reinforcement Learning (RL) has emerged as an alternative to automatically optimize such control strategies. Research so far has primarily considered RL interfaces which block the…

    Submitted 26 May, 2021; v1 submitted 9 October, 2019; originally announced October 2019.

    Comments: Workshop on ML for Systems at NeurIPS 2019

  18. arXiv:1905.04545  [pdf, ps, other]

    cs.NE cs.LG stat.ML

    Deep Learning: a new definition of artificial neuron with double weight

    Authors: Adriano Baldeschi, Raffaella Margutti, Adam Miller

    Abstract: Deep learning is a subset of a broader family of machine learning methods based on learning data representations. These models are inspired by human biological nervous systems, even if there are various differences pertaining to the structural and functional properties of biological brains. The elementary constituents of deep learning models are neurons, which can be considered as functions that r…

    Submitted 20 May, 2019; v1 submitted 11 May, 2019; originally announced May 2019.

  19. arXiv:1812.00210  [pdf, other]

    stat.ML cs.LG

    Measuring the Stability of EHR- and EKG-based Predictive Models

    Authors: Andrew C. Miller, Ziad Obermeyer, Sendhil Mullainathan

    Abstract: Databases of electronic health records (EHRs) are increasingly used to inform clinical decisions. Machine learning methods can find patterns in EHRs that are predictive of future adverse outcomes. However, statistical models may be built upon patterns of health-seeking behavior that vary across patient subpopulations, leading to poor predictive performance when training on one patient population a…

    Submitted 1 December, 2018; originally announced December 2018.

    Comments: Machine Learning for Health (ML4H) Workshop at NeurIPS 2018 arXiv:cs/0101200

    Report number: ML4H/2018/188

  20. arXiv:1812.00209  [pdf, other]

    stat.ML cs.LG q-bio.QM

    A Probabilistic Model of Cardiac Physiology and Electrocardiograms

    Authors: Andrew C. Miller, Ziad Obermeyer, David M. Blei, John P. Cunningham, Sendhil Mullainathan

    Abstract: An electrocardiogram (EKG) is a common, non-invasive test that measures the electrical activity of a patient's heart. EKGs contain useful diagnostic information about patient health that may be absent from other electronic health record (EHR) data. As multi-dimensional waveforms, they could be modeled using generic machine learning tools, such as a linear factor model or a variational autoencoder.…

    Submitted 1 December, 2018; originally announced December 2018.

    Comments: Machine Learning for Health (ML4H) Workshop at NeurIPS 2018 arXiv:cs/0101200

    Report number: ML4H/2018/97

  21. arXiv:1803.00113  [pdf, other]

    stat.AP astro-ph.IM cs.LG stat.ML

    Approximate Inference for Constructing Astronomical Catalogs from Images

    Authors: Jeffrey Regier, Andrew C. Miller, David Schlegel, Ryan P. Adams, Jon D. McAuliffe, Prabhat

    Abstract: We present a new, fully generative model for constructing astronomical catalogs from optical telescope image sets. Each pixel intensity is treated as a random variable with parameters that depend on the latent properties of stars and galaxies. These latent properties are themselves modeled as random. We compare two procedures for posterior inference. One procedure is based on Markov chain Monte Ca…

    Submitted 9 April, 2019; v1 submitted 28 February, 2018; originally announced March 2018.

    Comments: accepted to the Annals of Applied Statistics

    MSC Class: 62P35 ACM Class: G.3

  22. arXiv:1802.02550  [pdf, other]

    stat.ML cs.CL cs.LG

    Semi-Amortized Variational Autoencoders

    Authors: Yoon Kim, Sam Wiseman, Andrew C. Miller, David Sontag, Alexander M. Rush

    Abstract: Amortized variational inference (AVI) replaces instance-specific local inference with a global inference network. While AVI has enabled efficient training of deep generative models such as variational autoencoders (VAE), recent empirical work suggests that inference networks can produce suboptimal variational parameters. We propose a hybrid approach, to use AVI to initialize the variational parame…

    Submitted 23 July, 2018; v1 submitted 7 February, 2018; originally announced February 2018.

    Comments: ICML 2018

  23. arXiv:1705.07880  [pdf, other]

    stat.ML stat.CO stat.ME

    Reducing Reparameterization Gradient Variance

    Authors: Andrew C. Miller, Nicholas J. Foti, Alexander D'Amour, Ryan P. Adams

    Abstract: Optimization with noisy gradients has become ubiquitous in statistics and machine learning. Reparameterization gradients, or gradient estimates computed via the "reparameterization trick," represent a class of noisy gradients often used in Monte Carlo variational inference (MCVI). However, when these gradient estimators are too noisy, the optimization procedure can be slow or fail to converge. One…

    Submitted 22 May, 2017; originally announced May 2017.

  24. arXiv:1611.06585  [pdf, other]

    stat.ML cs.LG stat.ME

    Variational Boosting: Iteratively Refining Posterior Approximations

    Authors: Andrew C. Miller, Nicholas Foti, Ryan P. Adams

    Abstract: We propose a black-box variational inference method to approximate intractable distributions with an increasingly rich approximating class. Our method, termed variational boosting, iteratively refines an existing variational approximation by solving a sequence of optimization problems, allowing the practitioner to trade computation time for accuracy. We show how to expand the variational approxima…

    Submitted 19 February, 2017; v1 submitted 20 November, 2016; originally announced November 2016.

    Comments: 25 pages, 9 figures, 2 tables

  25. arXiv:1610.08466  [pdf, other]

    stat.ML

    Recurrent switching linear dynamical systems

    Authors: Scott W. Linderman, Andrew C. Miller, Ryan P. Adams, David M. Blei, Liam Paninski, Matthew J. Johnson

    Abstract: Many natural systems, such as neurons firing in the brain or basketball teams traversing a court, give rise to time series data with complex, nonlinear dynamics. We can gain insight into these systems by decomposing the data into segments that are each explained by simpler dynamic units. Building on switching linear dynamical systems (SLDS), we present a new model class that not only discovers the…

    Submitted 26 October, 2016; originally announced October 2016.

    Comments: 15 pages, 6 figures

  26. arXiv:1607.08891  [pdf]

    stat.ML q-bio.NC

    Assessing Functional Neural Connectivity as an Indicator of Cognitive Performance

    Authors: Brian S. Helfer, James R. Williamson, Benjamin A. Miller, Joseph Perricone, Thomas F. Quatieri

    Abstract: Studies in recent years have demonstrated that neural organization and structure impact an individual's ability to perform a given task. Specifically, individuals with greater neural efficiency have been shown to outperform those with less organized functional structure. In this work, we compare the predictive ability of properties of neural connectivity on a working memory task. We provide two no…

    Submitted 29 July, 2016; originally announced July 2016.

    Comments: Oral presentation at MLINI 2015 - 5th NIPS Workshop on Machine Learning and Interpretation in Neuroimaging (arXiv:1605.04435)

    Report number: MLINI/2015/17

  27. arXiv:1506.01351  [pdf]

    astro-ph.IM stat.ML

    Celeste: Variational inference for a generative model of astronomical images

    Authors: Jeffrey Regier, Andrew Miller, Jon McAuliffe, Ryan Adams, Matt Hoffman, Dustin Lang, David Schlegel, Prabhat

    Abstract: We present a new, fully generative model of optical telescope image sets, along with a variational procedure for inference. Each pixel intensity is treated as a Poisson random variable, with a rate parameter dependent on latent properties of stars and galaxies. Key latent properties are themselves random, with scientific prior distributions constructed from large ancillary data sets. We check our…

    Submitted 3 June, 2015; originally announced June 2015.

    Comments: in the Proceedings of the 32nd International Conference on Machine Learning (2015)

    MSC Class: 62P35; 85A35; 68T01 ACM Class: G.3

  28. Characterizing the spatial structure of defensive skill in professional basketball

    Authors: Alexander Franks, Andrew Miller, Luke Bornn, Kirk Goldsberry

    Abstract: Although basketball is a dualistic sport, with all players competing on both offense and defense, almost all of the sport's conventional metrics are designed to summarize offensive play. As a result, player valuations are largely based on offensive performances and to a much lesser degree on defensive ones. Steals, blocks and defensive rebounds provide only a limited summary of defensive effective…

    Submitted 28 May, 2015; v1 submitted 1 May, 2014; originally announced May 2014.

    Comments: Published at http://dx.doi.org/10.1214/14-AOAS799 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Report number: IMS-AOAS-AOAS799

    Journal ref: Annals of Applied Statistics 2015, Vol. 9, No. 1, 94-121

  29. A Spectral Framework for Anomalous Subgraph Detection

    Authors: Benjamin A. Miller, Michelle S. Beard, Patrick J. Wolfe, Nadya T. Bliss

    Abstract: A wide variety of application domains are concerned with data consisting of entities and their relationships or connections, formally represented as graphs. Within these diverse application areas, a common problem of interest is the detection of a subset of entities whose connectivity is anomalous with respect to the rest of the data. While the detection of such anomalous subgraphs has received a…

    Submitted 22 October, 2014; v1 submitted 29 January, 2014; originally announced January 2014.

    Comments: In submission to the IEEE, 16 pages, 8 figures

    Journal ref: IEEE Trans. Signal Process. 63 (2015) 4191-4206

  30. arXiv:1401.0942  [pdf, other]

    stat.ML stat.AP

    Factorized Point Process Intensities: A Spatial Analysis of Professional Basketball

    Authors: Andrew Miller, Luke Bornn, Ryan Adams, Kirk Goldsberry

    Abstract: We develop a machine learning approach to represent and analyze the underlying spatial structure that governs shot selection among professional basketball players in the NBA. Typically, NBA players are discussed and compared in an heuristic, imprecise manner that relies on unmeasured intuitions about player behavior. This makes it difficult to draw comparisons between players and make accurate pla…

    Submitted 7 January, 2014; v1 submitted 5 January, 2014; originally announced January 2014.

    Comments: 13 pages, 6 figures, fixed formatting issues

  31. arXiv:1204.4180  [pdf, other]

    astro-ph.IM astro-ph.SR stat.AP

    Construction of a Calibrated Probabilistic Classification Catalog: Application to 50k Variable Sources in the All-Sky Automated Survey

    Authors: Joseph W. Richards, Dan L. Starr, Adam A. Miller, Joshua S. Bloom, Nathaniel R. Butler, Henrik Brink, Arien Crellin-Quick

    Abstract: With growing data volumes from synoptic surveys, astronomers must become more abstracted from the discovery and introspection processes. Given the scarcity of follow-up resources, there is a particularly sharp onus on the frameworks that replace these human roles to provide accurate and well-calibrated probabilistic classification catalogs. Such catalogs inform the subsequent follow-up, allowing c…

    Submitted 24 April, 2012; v1 submitted 18 April, 2012; originally announced April 2012.

    Comments: 56 pages, 15 figures, 8 tables, submitted. The Machine-learned ASAS Classification Catalog is available at http://www.bigmacc.info

  32. arXiv:1106.2832  [pdf, other]

    astro-ph.IM stat.AP

    Active Learning to Overcome Sample Selection Bias: Application to Photometric Variable Star Classification

    Authors: Joseph W. Richards, Dan L. Starr, Henrik Brink, Adam A. Miller, Joshua S. Bloom, Nathaniel R. Butler, J. Berian James, James P. Long, John Rice

    Abstract: Despite the great promise of machine-learning algorithms to classify and predict astrophysical parameters for the vast numbers of astrophysical sources and transients observed in large-scale surveys, the peculiarities of the training data often manifest as strongly biased predictions on the data of interest. Typically, training sets are derived from historical surveys of brighter, more nearby obje…

    Submitted 17 June, 2011; v1 submitted 14 June, 2011; originally announced June 2011.

    Comments: 43 pages, 11 figures, submitted to ApJ