
Showing 1–50 of 85 results for author: Taylor, J

  1. arXiv:2406.06768  [pdf, other]

    stat.ME cs.LG econ.EM q-bio.QM

    Data-Driven Switchback Experiments: Theoretical Tradeoffs and Empirical Bayes Designs

    Authors: Ruoxuan Xiong, Alex Chin, Sean J. Taylor

    Abstract: We study the design and analysis of switchback experiments conducted on a single aggregate unit. The design problem is to partition the continuous time space into intervals and switch treatments between intervals, in order to minimize the estimation error of the treatment effect. We show that the estimation error depends on four factors: carryover effects, periodicity, serially correlated outcomes…

    Submitted 10 June, 2024; originally announced June 2024.

  2. arXiv:2405.00179  [pdf, other]

    stat.ME

    A Bayesian joint longitudinal-survival model with a latent stochastic process for intensive longitudinal data

    Authors: Madeline R. Abbott, Walter H. Dempsey, Inbal Nahum-Shani, Lindsey N. Potter, David W. Wetter, Cho Y. Lam, Jeremy M. G. Taylor

    Abstract: The availability of mobile health (mHealth) technology has enabled increased collection of intensive longitudinal data (ILD). ILD have potential to capture rapid fluctuations in outcomes that may be associated with changes in the risk of an event. However, existing methods for jointly modeling longitudinal and event-time outcomes are not well-equipped to handle ILD due to the high computational co…

    Submitted 30 April, 2024; originally announced May 2024.

    Comments: Main text is 32 pages with 6 figures. Supplementary material is 21 pages

  3. arXiv:2401.12911  [pdf, other]

    stat.ME

    Pretraining and the Lasso

    Authors: Erin Craig, Mert Pilanci, Thomas Le Menestrel, Balasubramanian Narasimhan, Manuel Rivas, Roozbeh Dehghannasiri, Julia Salzman, Jonathan Taylor, Robert Tibshirani

    Abstract: Pretraining is a popular and powerful paradigm in machine learning. As an example, suppose one has a modest-sized dataset of images of cats and dogs, and plans to fit a deep neural network to classify them from the pixel features. With pretraining, we start with a neural network trained on a large corpus of images, consisting of not just cats and dogs but hundreds of other image types. Then we fix…

    Submitted 18 April, 2024; v1 submitted 23 January, 2024; originally announced January 2024.

  4. arXiv:2310.10740  [pdf, other]

    stat.ME

    Unbiased Estimation of Structured Prediction Error

    Authors: Kevin Fry, Jonathan E. Taylor

    Abstract: Many modern datasets, such as those in ecology and geology, are composed of samples with spatial structure and dependence. With such data violating the usual independent and identically distributed (IID) assumption in machine learning and classical statistics, it is unclear a priori how one should measure the performance and generalization of models. Several authors have empirically investigated c…

    Submitted 16 October, 2023; originally announced October 2023.

    Comments: 28 pages, 13 figures

  5. arXiv:2309.11472  [pdf, other]

    stat.ME

    Optimizing Dynamic Predictions from Joint Models using Super Learning

    Authors: Dimitris Rizopoulos, Jeremy M. G. Taylor

    Abstract: Joint models for longitudinal and time-to-event data are often employed to calculate dynamic individualized predictions used in numerous applications of precision medicine. Two components of joint models that influence the accuracy of these predictions are the shape of the longitudinal trajectories and the functional form linking the longitudinal outcome history to the hazard of the event. Finding…

    Submitted 1 December, 2023; v1 submitted 20 September, 2023; originally announced September 2023.

  6. arXiv:2309.07435  [pdf, other]

    stat.ME

    Uncertainty Intervals for Prediction Errors in Time Series Forecasting

    Authors: Hui Xu, Song Mei, Stephen Bates, Jonathan Taylor, Robert Tibshirani

    Abstract: Inference for prediction errors is critical in time series forecasting pipelines. However, providing statistically meaningful uncertainty intervals for prediction errors remains relatively under-explored. Practitioners often resort to forward cross-validation (FCV) for obtaining point estimators and constructing confidence intervals based on the Central Limit Theorem (CLT). The naive version assum…

    Submitted 14 September, 2023; originally announced September 2023.

    Comments: 35 pages, 17 figures

  7. arXiv:2309.02115  [pdf, ps, other]

    stat.AP

    Using Joint Models for Longitudinal and Time-to-Event Data to Investigate the Causal Effect of Salvage Therapy after Prostatectomy

    Authors: Dimitris Rizopoulos, Jeremy M. G. Taylor, Grigorios Papageorgiou, Todd M. Morgan

    Abstract: Prostate cancer patients who undergo prostatectomy are closely monitored for recurrence and metastasis using routine prostate-specific antigen (PSA) measurements. When PSA levels rise, salvage therapies are recommended to decrease the risk of metastasis. However, due to the side effects of these therapies and to avoid over-treatment, it is important to understand which patients and when to initiat…

    Submitted 5 September, 2023; originally announced September 2023.

  8. arXiv:2307.15681  [pdf, other]

    stat.ME

    A Continuous-Time Dynamic Factor Model for Intensive Longitudinal Data Arising from Mobile Health Studies

    Authors: Madeline R. Abbott, Walter H. Dempsey, Inbal Nahum-Shani, Cho Y. Lam, David W. Wetter, Jeremy M. G. Taylor

    Abstract: Intensive longitudinal data (ILD) collected in mobile health (mHealth) studies contain rich information on multiple outcomes measured frequently over time that have the potential to capture short-term and long-term dynamics. Motivated by an mHealth study of smoking cessation in which participants self-report the intensity of many emotions multiple times per day, we describe a dynamic factor model…

    Submitted 20 February, 2024; v1 submitted 28 July, 2023; originally announced July 2023.

    Comments: Main text is 20 pages with 5 figures and 1 table. Supplementary material is 26 pages

  9. arXiv:2306.04675  [pdf, other]

    cs.LG cs.CV stat.ML

    Exposing flaws of generative model evaluation metrics and their unfair treatment of diffusion models

    Authors: George Stein, Jesse C. Cresswell, Rasa Hosseinzadeh, Yi Sui, Brendan Leigh Ross, Valentin Villecroze, Zhaoyan Liu, Anthony L. Caterini, J. Eric T. Taylor, Gabriel Loaiza-Ganem

    Abstract: We systematically study a wide variety of generative models spanning semantically-diverse image datasets to understand and improve the feature extractors and metrics used to evaluate them. Using best practices in psychophysics, we measure human perception of image realism for generated samples by conducting the largest experiment evaluating generative models to date, and find that no existing metr…

    Submitted 30 October, 2023; v1 submitted 7 June, 2023; originally announced June 2023.

    Comments: NeurIPS 2023. 53 pages, 29 figures, 12 tables. Code at https://github.com/layer6ai-labs/dgm-eval, reviews at https://openreview.net/forum?id=08zf7kTOoh

    Journal ref: Thirty-seventh Conference on Neural Information Processing Systems (2023)

  10. arXiv:2305.16735  [pdf, other]

    stat.ME stat.AP

    Angular Combining of Forecasts of Probability Distributions

    Authors: James W. Taylor, Xiaochun Meng

    Abstract: When multiple forecasts are available for a probability distribution, forecast combining enables a pragmatic synthesis of the available information to extract the wisdom of the crowd. A linear opinion pool has been widely used, whereby the combining is applied to the probability predictions of the distributional forecasts. However, it has been argued that this will tend to deliver overdispersed di…

    Submitted 26 May, 2023; originally announced May 2023.

    Comments: 47 pages, 16 figures

    MSC Class: 90B50

  11. arXiv:2212.12940  [pdf, other]

    stat.ME stat.CO stat.ML

    Exact Selective Inference with Randomization

    Authors: Snigdha Panigrahi, Kevin Fry, Jonathan Taylor

    Abstract: We introduce a pivot for exact selective inference with randomization. Not only does our pivot lead to exact inference in Gaussian regression models, but it is also available in closed form. We reduce the problem of exact selective inference to a bivariate truncated Gaussian distribution. By doing so, we give up some power that is achieved with approximate maximum likelihood estimation in Panigrah…

    Submitted 22 December, 2023; v1 submitted 25 December, 2022; originally announced December 2022.

    Comments: 48 pages, 8 Figures, 2 Tables

  12. arXiv:2211.15826  [pdf, other]

    stat.ME

    Surrogacy Validation for Time-to-Event Outcomes with Illness-Death Frailty Models

    Authors: Emily K. Roberts, Michael R. Elliott, Jeremy M. G. Taylor

    Abstract: A common practice in clinical trials is to evaluate a treatment effect on an intermediate endpoint when the true outcome of interest would be difficult or costly to measure. We consider how to validate intermediate endpoints in a causally-valid way when the trial outcomes are time-to-event. Using counterfactual outcomes, those that would be observed if the counterfactual treatment had been given,…

    Submitted 28 November, 2022; originally announced November 2022.

  13. arXiv:2209.00181  [pdf, other]

    stat.ME stat.AP

    Understanding the dynamic impact of COVID-19 through competing risk modeling with bivariate varying coefficients

    Authors: Wenbo Wu, John D. Kalbfleisch, Jeremy M. G. Taylor, Jian Kang, Kevin He

    Abstract: The coronavirus disease 2019 (COVID-19) pandemic has exerted a profound impact on patients with end-stage renal disease relying on kidney dialysis to sustain their lives. Motivated by a request by the U.S. Centers for Medicare & Medicaid Services, our analysis of their postdischarge hospital readmissions and deaths in 2020 revealed that the COVID-19 effect has varied significantly with postdischar…

    Submitted 31 August, 2022; originally announced September 2022.

    Comments: 40 pages, 8 figures, 1 table

  14. arXiv:2203.14504  [pdf, other]

    stat.ME stat.ML

    Black-box Selective Inference via Bootstrapping

    Authors: Sifan Liu, Jelena Markovic-Voronov, Jonathan Taylor

    Abstract: Conditional selective inference requires an exact characterization of the selection event, which is often unavailable except for a few examples like the lasso. This work addresses this challenge by introducing a generic approach to estimate the selection event, facilitating feasible inference conditioned on the selection event. The method proceeds by repeatedly generating bootstrap data and runnin…

    Submitted 20 August, 2023; v1 submitted 28 March, 2022; originally announced March 2022.

  15. arXiv:2108.02118  [pdf, other]

    math.PR stat.ME

    The volume-of-tube method for Gaussian random fields with inhomogeneous variance

    Authors: Satoshi Kuriki, Akimichi Takemura, Jonathan E. Taylor

    Abstract: The tube method or the volume-of-tube method approximates the tail probability of the maximum of a smooth Gaussian random field with zero mean and unit variance. This method evaluates the volume of a spherical tube about the index set, and then transforms it to the tail probability. In this study, we generalize the tube method to a case in which the variance is not constant. We provide the volume…

    Submitted 9 September, 2021; v1 submitted 4 August, 2021; originally announced August 2021.

    Comments: 30 pages, 3 figures

    MSC Class: 62H10 (Primary); 60G60 (Secondary)

  16. A synthetic data integration framework to leverage external summary-level information from heterogeneous populations

    Authors: Tian Gu, Jeremy M. G. Taylor, Bhramar Mukherjee

    Abstract: There is a growing need for flexible general frameworks that integrate individual-level data with external summary information for improved statistical inference. External information relevant for a risk prediction model may come in multiple forms, through regression coefficient estimates or predicted values of the outcome variable. Different external models may use different sets of predictors an…

    Submitted 1 June, 2022; v1 submitted 12 June, 2021; originally announced June 2021.

  17. arXiv:2104.12947  [pdf, other]

    stat.ME

    Incorporating baseline covariates to validate surrogate endpoints with a constant biomarker under control arm

    Authors: Emily Roberts, Michael Elliott, Jeremy M. G. Taylor

    Abstract: A surrogate endpoint S in a clinical trial is an outcome that may be measured earlier or more easily than the true outcome of interest T. In this work, we extend causal inference approaches to validate such a surrogate using potential outcomes. The causal association paradigm assesses the relationship of the treatment effect on the surrogate with the treatment effect on the true endpoint. Using th…

    Submitted 2 February, 2022; v1 submitted 26 April, 2021; originally announced April 2021.

  18. arXiv:2103.09577  [pdf, other]

    cs.LG cs.CV stat.ML

    Theoretical bounds on data requirements for the ray-based classification

    Authors: Brian J. Weber, Sandesh S. Kalantre, Thomas McJunkin, Jacob M. Taylor, Justyna P. Zwolak

    Abstract: The problem of classifying high-dimensional shapes in real-world data grows in complexity as the dimension of the space increases. For the case of identifying convex shapes of different geometries, a new classification framework has recently been proposed in which the intersections of a set of one-dimensional representations, called rays, with the boundaries of the shape are used to identify the s…

    Submitted 26 February, 2022; v1 submitted 17 March, 2021; originally announced March 2021.

    Comments: 10 pages, 5 figures

    MSC Class: 68T20; 68Q32; 68U10

    Journal ref: SN Comput. Sci. 3, 57 (2022)

  19. arXiv:2103.02033  [pdf, other]

    stat.ME

    Multiple imputation with missing data indicators

    Authors: Lauren J Beesley, Irina Bondarenko, Michael R Elliott, Allison W Kurian, Steven J Katz, Jeremy M G Taylor

    Abstract: Multiple imputation is a well-established general technique for analyzing data with missing values. A convenient way to implement multiple imputation is sequential regression multiple imputation (SRMI), also called chained equations multiple imputation. In this approach, we impute missing values using regression models for each variable, conditional on the other variables in the data. This approac…

    Submitted 2 March, 2021; originally announced March 2021.

    Comments: See also: Supplemental Material

  20. arXiv:2101.07954  [pdf, other]

    stat.ME

    Accounting for not-at-random missingness through imputation stacking

    Authors: Lauren J Beesley, Jeremy M G Taylor

    Abstract: Not-at-random missingness presents a challenge in addressing missing data in many health research applications. In this paper, we propose a new approach to account for not-at-random missingness after multiple imputation through weighted analysis of stacked multiple imputations. The weights are easily calculated as a function of the imputed data and assumptions about the not-at-random missingness.…

    Submitted 19 January, 2021; originally announced January 2021.

    Comments: See also: Supplementary Materials

  21. arXiv:2101.02354  [pdf, other]

    stat.ME

    Kullback-Leibler-Based Discrete Failure Time Models for Integration of Published Prediction Models with New Time-To-Event Dataset

    Authors: Di Wang, Wen Ye, Randall Sung, Hui Jiang, Jeremy M. G. Taylor, Lisa Ly, Kevin He

    Abstract: Prediction of time-to-event data often suffers from rare event rates, small sample sizes, high dimensionality and low signal-to-noise ratios. Incorporating published prediction models from large-scale studies is expected to improve the performance of prognosis prediction on internal individual-level time-to-event data. However, existing integration approaches typically assume that underlying distr…

    Submitted 28 July, 2022; v1 submitted 6 January, 2021; originally announced January 2021.

  22. arXiv:2010.16001  [pdf, other]

    eess.SY cs.LG math.OC stat.ML

    Guaranteeing Safety of Learned Perception Modules via Measurement-Robust Control Barrier Functions

    Authors: Sarah Dean, Andrew J. Taylor, Ryan K. Cosner, Benjamin Recht, Aaron D. Ames

    Abstract: Modern nonlinear control theory seeks to develop feedback controllers that endow systems with properties such as safety and stability. The guarantees ensured by these controllers often rely on accurate estimates of the system state for determining control actions. In practice, measurement model uncertainty can lead to error in state estimates that degrades these guarantees. In this paper, we seek…

    Submitted 29 October, 2020; originally announced October 2020.

  23. A meta-inference framework to integrate multiple external models into a current study

    Authors: Tian Gu, Jeremy M. G. Taylor, Bhramar Mukherjee

    Abstract: It is becoming increasingly common for researchers to consider incorporating external information from large studies to improve the accuracy of statistical inference instead of relying on a modestly sized dataset collected internally. With some new predictors only available internally, we aim to build improved regression models based on individual-level data from an "internal" study while incorpor…

    Submitted 9 April, 2021; v1 submitted 19 October, 2020; originally announced October 2020.

  24. arXiv:2010.00500  [pdf, other]

    cs.LG cond-mat.mes-hall cs.CV quant-ph stat.ML

    Ray-based classification framework for high-dimensional data

    Authors: Justyna P. Zwolak, Sandesh S. Kalantre, Thomas McJunkin, Brian J. Weber, Jacob M. Taylor

    Abstract: While classification of arbitrary structures in high dimensions may require complete quantitative information, for simple geometrical structures, low-dimensional qualitative information about the boundaries defining the structures can suffice. Rather than using dense, multi-dimensional data, we propose a deep neural network (DNN) classification framework that utilizes a minimal collection of one-d…

    Submitted 26 February, 2022; v1 submitted 1 October, 2020; originally announced October 2020.

    Journal ref: Proceedings of the Machine Learning and the Physical Sciences Workshop at NeurIPS 2020, Vancouver, Canada

  25. arXiv:2008.04257  [pdf, other]

    stat.ME stat.AP

    Using Multiple Imputation to Classify Potential Outcomes Subgroups

    Authors: Yun Li, Irina Bondarenko, Michael R. Elliott, Timothy P. Hofer, Jeremy M. G. Taylor

    Abstract: With medical tests becoming increasingly available, concerns about over-testing and over-treatment dramatically increase. Hence, it is important to understand the influence of testing on treatment selection in general practice. Most statistical methods focus on average effects of testing on treatment decisions. However, this may be ill-advised, particularly for patient subgroups that tend not to b…

    Submitted 10 August, 2020; originally announced August 2020.

  26. arXiv:2007.12158  [pdf, other]

    cs.LG physics.geo-ph stat.ML

    Signal Enhancement for Magnetic Navigation Challenge Problem

    Authors: Albert R. Gnadt, Joseph Belarge, Aaron Canciani, Glenn Carl, Lauren Conger, Joseph Curro, Alan Edelman, Peter Morales, Aaron P. Nielsen, Michael F. O'Keeffe, Christopher V. Rackauckas, Jonathan Taylor, Allan B. Wollaber

    Abstract: Harnessing the magnetic field of the Earth for navigation has shown promise as a viable alternative to other navigation systems. A magnetic navigation system collects its own magnetic field data using a magnetometer and uses magnetic anomaly maps to determine the current location. The greatest challenge with magnetic navigation arises when the magnetic field measurements from the magnetometer enco…

    Submitted 6 January, 2023; v1 submitted 23 July, 2020; originally announced July 2020.

    Comments: 12 pages, 2 figures. See https://github.com/MIT-AI-Accelerator/MagNav.jl for accompanying data and code

  27. arXiv:2007.11103  [pdf]

    stat.AP

    A Comparison of Aggregation Methods for Probabilistic Forecasts of COVID-19 Mortality in the United States

    Authors: Kathryn S. Taylor, James W. Taylor

    Abstract: The COVID-19 pandemic has placed forecasting models at the forefront of health policy making. Predictions of mortality and hospitalization help governments meet planning and resource allocation challenges. In this paper, we consider the weekly forecasting of the cumulative mortality due to COVID-19 at the national and state level in the U.S. Optimal decision-making requires a forecast of a probabi…

    Submitted 20 August, 2020; v1 submitted 21 July, 2020; originally announced July 2020.

    Comments: 32 pages, 11 figures, 5 tables

  28. Array Programming with NumPy

    Authors: Charles R. Harris, K. Jarrod Millman, Stéfan J. van der Walt, Ralf Gommers, Pauli Virtanen, David Cournapeau, Eric Wieser, Julian Taylor, Sebastian Berg, Nathaniel J. Smith, Robert Kern, Matti Picus, Stephan Hoyer, Marten H. van Kerkwijk, Matthew Brett, Allan Haldane, Jaime Fernández del Río, Mark Wiebe, Pearu Peterson, Pierre Gérard-Marchant, Kevin Sheppard, Tyler Reddy, Warren Weckesser, Hameer Abbasi, Christoph Gohlke, et al. (1 additional author not shown)

    Abstract: Array programming provides a powerful, compact, expressive syntax for accessing, manipulating, and operating on data in vectors, matrices, and higher-dimensional arrays. NumPy is the primary array programming library for the Python language. It plays an essential role in research analysis pipelines in fields as diverse as physics, chemistry, astronomy, geoscience, biology, psychology, material sci…

    Submitted 17 June, 2020; originally announced June 2020.

    Journal ref: Nature 585, 357 (2020)

  29. Probabilistic Forecasting of Patient Waiting Times in an Emergency Department

    Authors: Siddharth Arora, James W. Taylor, Ho-Yin Mak

    Abstract: We study the estimation of the probability distribution of individual patient waiting times in an emergency department (ED). Our feature-rich modelling allows for dynamic updating and refinement of waiting time estimates as patient- and ED-specific information (e.g., patient condition, ED congestion levels) is revealed during the waiting process. Aspects relating to communicating forecast uncertai…

    Submitted 30 May, 2020; originally announced June 2020.

  30. arXiv:2005.13271  [pdf, other]

    stat.ME

    Analysis of time-to-event for observational studies: Guidance to the use of intensity models

    Authors: Per Kragh Andersen, Maja Pohar Perme, Hans C van Houwelingen, Richard J Cook, Pierre Joly, Torben Martinussen, Jeremy MG Taylor, Michal Abrahamowicz, Terry M Therneau

    Abstract: This paper provides guidance for researchers with some mathematical background on the conduct of time-to-event analysis in observational studies based on intensity (hazard) models. Discussions of basic concepts like time axis, event definition and censoring are given. Hazard models are introduced, with special emphasis on the Cox proportional hazards regression model. We provide check lists that m…

    Submitted 28 May, 2020; v1 submitted 27 May, 2020; originally announced May 2020.

    Comments: 28 pages, 12 figures. For associated Supplementary material, see http://publicifsv.sund.ku.dk/~pka/STRATOSTG8/

  31. arXiv:2003.06723  [pdf, other]

    stat.ME

    Inferring Treatment Effects After Testing Instrument Strength in Linear Models

    Authors: Nan Bi, Hyunseung Kang, Jonathan Taylor

    Abstract: A common practice in IV studies is to check for instrument strength, i.e. its association to the treatment, with an F-test from regression. If the F-statistic is above some threshold, usually 10, the instrument is deemed to satisfy one of the three core IV assumptions and used to test for the treatment effect. However, in many cases, the inference on the treatment effect does not take into account…

    Submitted 14 March, 2020; originally announced March 2020.

    Comments: 24 pages, 3 figures

  32. arXiv:2002.09578  [pdf, other]

    math.ST q-fin.ST stat.ME

    Scores for Multivariate Distributions and Level Sets

    Authors: Xiaochun Meng, James W. Taylor, Souhaib Ben Taieb, Siran Li

    Abstract: Forecasts of multivariate probability distributions are required for a variety of applications. Scoring rules enable the evaluation of forecast accuracy, and comparison between forecasting methods. We propose a theoretical framework for scoring rules for multivariate distributions, which encompasses the existing quadratic score and multivariate continuous ranked probability score. We demonstrate h…

    Submitted 21 June, 2023; v1 submitted 21 February, 2020; originally announced February 2020.

  33. arXiv:1911.03985  [pdf, other]

    stat.ME stat.AP

    Inference After Selecting Plausibly Valid Instruments with Application to Mendelian Randomization

    Authors: Nan Bi, Hyunseung Kang, Jonathan Taylor

    Abstract: Mendelian randomization (MR) is a popular method in genetic epidemiology to estimate the effect of an exposure on an outcome by using genetic instruments. These instruments are often selected from a combination of prior knowledge from genome wide association studies (GWAS) and data-driven instrument selection procedures or tests. Unfortunately, when testing for the exposure effect, the instrument…

    Submitted 10 November, 2019; originally announced November 2019.

  34. arXiv:1911.00515  [pdf, ps, other]

    physics.med-ph cs.CV cs.LG eess.IV q-bio.QM stat.ML

    The reliability of a deep learning model in clinical out-of-distribution MRI data: a multicohort study

    Authors: Gustav Mårtensson, Daniel Ferreira, Tobias Granberg, Lena Cavallin, Ketil Oppedal, Alessandro Padovani, Irena Rektorova, Laura Bonanni, Matteo Pardini, Milica Kramberger, John-Paul Taylor, Jakub Hort, Jón Snædal, Jaime Kulisevsky, Frederic Blanc, Angelo Antonini, Patrizia Mecocci, Bruno Vellas, Magda Tsolaki, Iwona Kłoszewska, Hilkka Soininen, Simon Lovestone, Andrew Simmons, Dag Aarsland, Eric Westman

    Abstract: Deep learning (DL) methods have in recent years yielded impressive results in medical imaging, with the potential to function as clinical aid to radiologists. However, DL models in medical imaging are often trained on public research cohorts with images acquired with a single scanner or with strict protocol harmonization, which is not representative of a clinical setting. The aim of this study was…

    Submitted 1 November, 2019; originally announced November 2019.

    Comments: 11 pages, 3 figures

  35. arXiv:1910.04625  [pdf, other]

    stat.ME

    A stacked approach for chained equations multiple imputation incorporating the substantive model

    Authors: Lauren Beesley, Jeremy M G Taylor

    Abstract: Multiple imputation by chained equations (MICE) has emerged as a popular approach for handling missing data. A central challenge for applying MICE is determining how to incorporate outcome information into covariate imputation models, particularly for complicated outcomes. Often, we have a particular analysis model in mind, and we would like to ensure congeniality between the imputation and analys…

    Submitted 10 October, 2019; originally announced October 2019.

  36. arXiv:1905.07357  [pdf, other]

    cs.LG stat.ML

    Recurrent Kalman Networks: Factorized Inference in High-Dimensional Deep Feature Spaces

    Authors: Philipp Becker, Harit Pandya, Gregor Gebhardt, Cheng Zhao, James Taylor, Gerhard Neumann

    Abstract: In order to integrate uncertainty estimates into deep time-series modelling, Kalman Filters (KFs) (Kalman et al., 1960) have been integrated with deep learning models, however, such approaches typically rely on approximate inference techniques such as variational inference which makes learning more complex and often less scalable due to approximation errors. We propose a new deep approach to Kalma…

    Submitted 17 May, 2019; originally announced May 2019.

    Comments: accepted at ICML 2019

  37. arXiv:1902.07884  [pdf, other]

    stat.ME

    Approximate selective inference via maximum likelihood

    Authors: Snigdha Panigrahi, Jonathan Taylor

    Abstract: Several strategies have been developed recently to ensure valid inference after model selection; some of these are easy to compute, while others fare better in terms of inferential power. In this paper, we consider a selective inference framework for Gaussian data. We propose a new method for inference through approximate maximum likelihood estimation. Our goal is to: (i) achieve better inferentia…

    Submitted 11 July, 2022; v1 submitted 21 February, 2019; originally announced February 2019.

    Comments: 63 Pages, 8 Figures

  38. arXiv:1902.07634  [pdf, other]

    stat.AP stat.ME

    Active Matrix Factorization for Surveys

    Authors: Chelsea Zhang, Sean J. Taylor, Curtiss Cobb, Jasjeet Sekhon

    Abstract: Amid historically low response rates, survey researchers seek ways to reduce respondent burden while measuring desired concepts with precision. We propose to ask fewer questions of respondents and impute missing responses via probabilistic matrix factorization. A variance-minimizing active learning criterion chooses the most informative questions per respondent. In simulations of our matrix sampli…

    Submitted 18 June, 2019; v1 submitted 20 February, 2019; originally announced February 2019.

  39. arXiv:1901.09973  [pdf, other]

    stat.ME

    Inference after black box selection

    Authors: Jelena Markovic, Jonathan Taylor, Jeremy Taylor

    Abstract: We consider the problem of inference for parameters selected to report only after some algorithm, the canonical example being inference for model parameters after a model selection procedure. The conditional correction for selection requires knowledge of how the selection is affected by changes in the underlying data, and current research explicitly describes this selection. In this work, we assum…

    Submitted 28 January, 2019; originally announced January 2019.

    Comments: 20 pages, 4 figures

  40. arXiv:1803.09590  [pdf]

    stat.AP

    Rule-based Autoregressive Moving Average Models for Forecasting Load on Special Days: A Case Study for France

    Authors: Siddharth Arora, James W. Taylor

    Abstract: This paper presents a case study on short-term load forecasting for France, with emphasis on special days, such as public holidays. We investigate the generalisability to French data of a recently proposed approach, which generates forecasts for normal and special days in a coherent and unified framework, by incorporating subjective judgment in univariate statistical models using a rule-based meth…

    Submitted 26 March, 2018; originally announced March 2018.

    Comments: 11 figures, 3 tables

  41. arXiv:1709.09636  [pdf, ps, other]

    cs.SI physics.soc-ph stat.ME

    Randomized experiments to detect and estimate social influence in networks

    Authors: Sean J. Taylor, Dean Eckles

    Abstract: Estimation of social influence in networks can be substantially biased in observational studies due to homophily and network correlation in exposure to exogenous events. Randomized experiments, in which the researcher intervenes in the social system and uses randomization to determine how to do so, provide a methodology for credibly estimating causal effects of social behaviors. In addition to…

    Submitted 27 September, 2017; originally announced September 2017.

    Comments: Forthcoming in Spreading Dynamics in Social Systems

  42. Setpoint Tracking with Partially Observed Loads

    Authors: Antoine Lesage-Landry, Joshua A. Taylor

    Abstract: We use online convex optimization (OCO) for setpoint tracking with uncertain, flexible loads. We consider full feedback from the loads, bandit feedback, and two intermediate types of feedback: partial bandit where a subset of the loads are individually observed and the rest are observed in aggregate, and Bernoulli feedback where in each round the aggregator receives either full or bandit feedback…

    Submitted 19 September, 2017; v1 submitted 12 September, 2017; originally announced September 2017.

    Journal ref: IEEE Transactions on Power Systems, 33 (5): 5615-5627. September 2018

  43. arXiv:1708.01977  [pdf, other]

    stat.ML cs.LG

    Why Adaptively Collected Data Have Negative Bias and How to Correct for It

    Authors: Xinkun Nie, Xiaoying Tian, Jonathan Taylor, James Zou

    Abstract: From scientific experiments to online A/B testing, the previously observed data often affects how future experiments are performed, which in turn affects which data will be collected. Such adaptivity introduces complex correlations between the data and the collection procedure. In this paper, we prove that when the data collection procedure satisfies natural conditions, then sample means of the da…

    Submitted 30 December, 2017; v1 submitted 6 August, 2017; originally announced August 2017.

    Comments: Accepted to the 21st International Conference on Artificial Intelligence and Statistics (AISTATS) 2018, Lanzarote, Spain

  44. R Package ASMap: Efficient Genetic Linkage Map Construction and Diagnosis

    Authors: Julian Taylor, David Butler

    Abstract: Although various forms of linkage map construction software are widely available, there is a distinct lack of packages for use in the R statistical computing environment. This article introduces the ASMap linkage map construction R package which contains functions that use the efficient MSTmap algorithm for clustering and optimally ordering large sets of markers. Additional to the construction fun…

    Submitted 19 May, 2017; originally announced May 2017.

    Comments: Conditionally accepted for publication in Journal of Statistical Software

  45. arXiv:1703.06559  [pdf, other]

    stat.ME

    Unifying approach to selective inference with applications to cross-validation

    Authors: Jelena Markovic, Lucy Xia, Jonathan Taylor

    Abstract: We develop tools to do valid post-selective inference for a family of model selection procedures, including choosing a model via cross-validated Lasso. The tools apply universally when the following random vectors are jointly asymptotically multivariate Gaussian: 1. the vector composed of each model's quality value evaluated under certain model selection criteria (e.g. cross-validation errors acro…

    Submitted 12 February, 2018; v1 submitted 19 March, 2017; originally announced March 2017.

  46. arXiv:1703.06176  [pdf, other]

    stat.ME

    Scalable methods for Bayesian selective inference

    Authors: Snigdha Panigrahi, Jonathan Taylor

    Abstract: Modeled along the truncated approach in Panigrahi (2016), selection-adjusted inference in a Bayesian regime is based on a selective posterior. Such a posterior is determined together by a generative model imposed on data and the selection event that enforces a truncation on the assumed law. The effective difference between the selective posterior and the usual Bayesian framework is reflected in th…

    Submitted 11 September, 2017; v1 submitted 17 March, 2017; originally announced March 2017.

    Comments: 48 pages, 6 figures

  47. arXiv:1703.06154  [pdf, other]

    stat.ME

    An MCMC-free approach to post-selective inference

    Authors: Snigdha Panigrahi, Jelena Markovic, Jonathan Taylor

    Abstract: We develop a Monte Carlo-free approach to inference post output from randomized algorithms with a convex loss and a convex penalty. The pivotal statistic based on a truncated law, called the selective pivot, usually lacks closed form expressions. Inference in these settings relies upon standard Monte Carlo sampling techniques at a reference parameter followed by an exponential tilting at the refer…

    Submitted 18 May, 2017; v1 submitted 17 March, 2017; originally announced March 2017.

  48. arXiv:1612.07811  [pdf, ps, other]

    stat.ME

    Bootstrap inference after using multiple queries for model selection

    Authors: Jelena Markovic, Jonathan Taylor

    Abstract: In this work, we provide a refinement of the selective CLT result of Tian and Taylor (2015), which allows for selective inference in non-parametric settings by adjusting for the asymptotic Gaussian limit for selection. Under some regularity assumptions on the density of the randomization, including heavier tails than Gaussian satisfied by e.g. logistic distribution, we prove the selective CLT hold…

    Submitted 27 September, 2017; v1 submitted 22 December, 2016; originally announced December 2016.

    Comments: 58 pages

  49. High-dimensional regression adjustments in randomized experiments

    Authors: Stefan Wager, Wenfei Du, Jonathan Taylor, Robert Tibshirani

    Abstract: We study the problem of treatment effect estimation in randomized experiments with high-dimensional covariate information, and show that essentially any risk-consistent regression adjustment can be used to obtain efficient estimates of the average treatment effect. Our results considerably extend the range of settings where high-dimensional regression adjustments are guaranteed to provide valid in…

    Submitted 27 October, 2016; v1 submitted 22 July, 2016; originally announced July 2016.

    Comments: To appear in the Proceedings of the National Academy of Sciences. The present draft does not reflect final copyediting by the PNAS staff

  50. arXiv:1605.08824  [pdf, other]

    stat.ME

    Integrative Methods for Post-Selection Inference Under Convex Constraints

    Authors: Snigdha Panigrahi, Jonathan Taylor, Asaf Weinstein

    Abstract: Inference after model selection has been an active research topic in the past few years, with numerous works offering different approaches to addressing the perils of the reuse of data. In particular, major progress has been made recently on large and useful classes of problems by harnessing general theory of hypothesis testing in exponential families, but these methods have their limitations. Per…

    Submitted 30 May, 2020; v1 submitted 27 May, 2016; originally announced May 2016.