-
Inferring random change point from left-censored longitudinal data by segmented mechanistic nonlinear models, with application in HIV surveillance study
Authors:
Hongbin Zhang,
McKaylee Robertson,
Sarah L. Braunstein,
Levi Waldron,
Denis Nash
Abstract:
The primary goal of public health efforts to control HIV epidemics is to diagnose and treat people with HIV infection as soon as possible after seroconversion. The timing of antiretroviral therapy (ART) initiation after HIV diagnosis is, therefore, a critical population-level indicator that can be used to measure the effectiveness of public health programs and policies at local and national levels. However, population-based data on ART initiation are unavailable because ART initiation and prescription are typically measured indirectly by public health departments (e.g., with viral suppression as a proxy). In this paper, we present a random change-point model to infer the time of ART initiation utilizing routinely reported individual-level HIV viral load from an HIV surveillance system. To deal with the left-censoring and the nonlinear trajectory of viral load data, we formulate a flexible segmented nonlinear mixed effects model and propose a stochastic version of the EM algorithm (StEM), coupled with a Gibbs sampler, for inference. We apply the method to a random subset of HIV surveillance data to infer the timing of ART initiation since diagnosis and to gain additional insights into the viral load dynamics. Simulation studies are also performed to evaluate the properties of the proposed method.
Submitted 2 August, 2022;
originally announced August 2022.
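To illustrate how left-censoring and a change point can enter a likelihood together, here is a minimal sketch. It is not the authors' model: the trajectory shape (flat before the change point, exponential decline after it), the parameter names (`tau`, `baseline`, `drop`, `rate`), and the fixed-effects-only setup are all assumptions made for illustration. Observations at or below the detection limit contribute the probability of falling below that limit rather than a density value.

```python
import math

def normal_logpdf(y, mu, sigma):
    # Log density of N(mu, sigma^2) at y.
    z = (y - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)

def normal_logcdf(x, mu, sigma):
    # log P(Y <= x) for Y ~ N(mu, sigma^2), via the error function.
    p = 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))
    return math.log(max(p, 1e-300))  # guard against log(0)

def segmented_mean(t, tau, baseline, drop, rate):
    # Hypothetical trajectory: flat until the change point tau,
    # then an exponential decline toward baseline - drop.
    if t <= tau:
        return baseline
    return baseline - drop * (1.0 - math.exp(-rate * (t - tau)))

def censored_loglik(times, values, limit, tau, baseline, drop, rate, sigma):
    # Values at or below `limit` are treated as left-censored.
    total = 0.0
    for t, y in zip(times, values):
        mu = segmented_mean(t, tau, baseline, drop, rate)
        if y <= limit:
            total += normal_logcdf(limit, mu, sigma)
        else:
            total += normal_logpdf(y, mu, sigma)
    return total
```

Evaluating `censored_loglik` over a grid of `tau` values gives a crude profile of where the change point is most plausible for one individual; the paper's random-effects structure and StEM/Gibbs inference are beyond this sketch.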
-
The importance of transparency and reproducibility in artificial intelligence research
Authors:
Benjamin Haibe-Kains,
George Alexandru Adam,
Ahmed Hosny,
Farnoosh Khodakarami,
MAQC Society Board,
Levi Waldron,
Bo Wang,
Chris McIntosh,
Anshul Kundaje,
Casey S. Greene,
Michael M. Hoffman,
Jeffrey T. Leek,
Wolfgang Huber,
Alvis Brazma,
Joelle Pineau,
Robert Tibshirani,
Trevor Hastie,
John P. A. Ioannidis,
John Quackenbush,
Hugo J. W. L. Aerts
Abstract:
In their study, McKinney et al. showed the high potential of artificial intelligence for breast cancer screening. However, the lack of detailed methods and computer code undermines its scientific value. We identify obstacles hindering transparent and reproducible AI research as faced by McKinney et al. and provide solutions with implications for the broader field.
Submitted 7 March, 2020; v1 submitted 28 February, 2020;
originally announced March 2020.
-
Bayesian nonparametric cross-study validation of prediction methods
Authors:
Lorenzo Trippa,
Levi Waldron,
Curtis Huttenhower,
Giovanni Parmigiani
Abstract:
We consider comparisons of statistical learning algorithms using multiple data sets, via leave-one-in cross-study validation: each of the algorithms is trained on one data set; the resulting model is then validated on each remaining data set. This poses two statistical challenges that need to be addressed simultaneously. The first is the assessment of study heterogeneity, with the aim of identifying a subset of studies within which algorithm comparisons can be reliably carried out. The second is the comparison of algorithms using the ensemble of data sets. We address both problems by integrating clustering and model comparison. We formulate a Bayesian model for the array of cross-study validation statistics, which defines clusters of studies with similar properties and provides the basis for meaningful algorithm comparison in the presence of study heterogeneity. We illustrate our approach through simulations involving studies with varying severity of systematic errors, and in the context of medical prognosis for patients diagnosed with cancer, using high-throughput measurements of the transcriptional activity of the tumor's genes.
Submitted 1 June, 2015;
originally announced June 2015.
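The leave-one-in scheme described in the abstract (train on one data set, validate on each of the others) can be sketched as follows. This covers only the construction of the array of cross-study validation statistics, not the Bayesian clustering model built on top of it. The function names, the toy mean-predicting learner, and the negative mean squared error score are hypothetical choices for illustration.

```python
def cross_study_matrix(studies, fit, score):
    """Return z where z[i][j] is the score of the model trained on
    study i and validated on study j (None on the diagonal)."""
    n = len(studies)
    z = [[None] * n for _ in range(n)]
    for i, train in enumerate(studies):
        model = fit(train)  # train on exactly one study
        for j, test in enumerate(studies):
            if i != j:      # validate on every other study
                z[i][j] = score(model, test)
    return z

# Toy learner (hypothetical): predict the training-set mean of y.
def fit_mean(study):
    return sum(y for _, y in study) / len(study)

# Toy score (hypothetical): negative mean squared error, higher is better.
def neg_mse(model, study):
    return -sum((y - model) ** 2 for _, y in study) / len(study)

if __name__ == "__main__":
    studies = [
        [(0, 1.0), (0, 3.0)],
        [(0, 2.0), (0, 2.0)],
        [(0, 10.0), (0, 12.0)],
    ]
    for row in cross_study_matrix(studies, fit_mean, neg_mse):
        print(row)
```

Repeating this for each algorithm yields one such matrix per algorithm; rows or columns with uniformly poor scores hint at studies that differ systematically from the rest, which is the heterogeneity the abstract sets out to model.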