-
Analysis of stepped wedge cluster randomized trials in the presence of a time-varying treatment effect
Authors:
Avi Kenny,
Emily Voldal,
Fan Xia,
Patrick J. Heagerty,
James P. Hughes
Abstract:
Stepped wedge cluster randomized controlled trials are typically analyzed using models that assume the full effect of the treatment is achieved instantaneously. We provide an analytical framework for scenarios in which the treatment effect varies as a function of exposure time (time since the start of treatment) and define the "effect curve" as the magnitude of the treatment effect on the linear predictor scale as a function of exposure time. The "time-averaged treatment effect" (TATE) and "long-term treatment effect" (LTE) are summaries of this curve. We analytically derive the expectation of the estimator resulting from a model that assumes an immediate treatment effect and show that it can be expressed as a weighted sum of the time-specific treatment effects corresponding to the observed exposure times. Surprisingly, although the weights sum to one, some of the weights can be negative. This implies that the estimator may be severely misleading and can even converge to a value of the opposite sign of the true TATE or LTE. We describe several models that can be used to simultaneously estimate the entire effect curve, the TATE, and the LTE, some of which make assumptions about the shape of the effect curve. We evaluate these models in a simulation study to examine the operating characteristics of the resulting estimators and apply them to two real datasets.
Submitted 13 November, 2021;
originally announced November 2021.
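The weighted-sum result stated in the abstract above can be written compactly. The notation is an assumption made for illustration and is not taken from the paper: $\delta(s)$ is the effect curve at exposure time $s$, $S$ is the largest observed exposure time, $\hat{\delta}_{\mathrm{IT}}$ is the estimator from the model that assumes an immediate treatment effect, and $w_s$ are the weights. The numbers in the second display are hypothetical and only show how a negative weight can flip the sign.

```latex
\[
  E\bigl[\hat{\delta}_{\mathrm{IT}}\bigr] \;=\; \sum_{s=1}^{S} w_s\,\delta(s),
  \qquad \sum_{s=1}^{S} w_s = 1,
  \qquad \text{with some } w_s < 0 \text{ possible.}
\]
% Hypothetical illustration with S = 2, delta(1) = 1, delta(2) = 3, w = (2, -1):
\[
  2 \cdot 1 + (-1) \cdot 3 \;=\; -1 .
\]
```

Because the weights need not be nonnegative, the sum is not a convex combination of the $\delta(s)$. In the hypothetical case above, every time-specific effect is positive and yet the weighted sum is negative.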
-
Predictive case control designs for modification learning
Authors:
W Katherine Tan,
Patrick J Heagerty
Abstract:
Prediction models for clinical outcomes may be developed using a source dataset and additionally applied to new settings. Towards model external validation and model updating in the new setting, one procedure is model modification learning that involves the dual goals of recalibrating overall predictions as well as revising individual feature effects. Modification learning generally requires the collection of an adequate sample of true outcome labels from the new setting, which is frequently an expensive and time-consuming process, as it involves abstraction by human clinical experts. To reduce the abstraction burden for such new data collection, we propose a class of designs based on original model scores and their associated outcome predictions. We provide mathematical justification that the general predictive score sampling class results in valid samples for analysis. Then, we focus attention specifically on a stratified sampling procedure that we call predictive case control (PCC) sampling, which allows the dual modification learning goals to be achieved at a smaller sample size compared to simple random sampling (SRS). PCC sampling intentionally over-represents subjects with informative scores, where we suggest using the D-optimality and Binary Entropy information functions to summarize sample information. For design evaluation within the PCC class, we provide a computational framework to estimate and visualize empirical response surfaces of the proposed information functions. We demonstrate the benefit of using PCC designs for modification learning, relative to SRS, through Monte Carlo simulation. Finally, using radiology report data from the Lumbar Imaging with Reporting of Epidemiology (LIRE) study, we illustrate the application of PCC for new outcome label abstraction and subsequent modification learning across imaging modalities.
Submitted 29 November, 2020;
originally announced November 2020.
-
Surrogate-guided sampling designs for classification of rare outcomes from electronic medical records data
Authors:
W. Katherine Tan,
Patrick J. Heagerty
Abstract:
Scalable and accurate identification of specific clinical outcomes has been enabled by machine-learning applied to electronic medical record (EMR) systems. The development of classification models requires the collection of a complete labeled data set, where true clinical outcomes are obtained by human expert manual review. For example, the development of natural language processing algorithms requires the abstraction of clinical text data to obtain outcome information necessary for training models. However, if the outcome is rare then simple random sampling results in very few cases and insufficient information to develop accurate classifiers. Since large scale detailed abstraction is often expensive, time-consuming, and not feasible, more efficient strategies are needed. Under such resource constrained settings, we propose a class of enrichment sampling designs, where selection for abstraction is stratified by auxiliary variables related to the true outcome of interest. Stratified sampling on highly specific variables results in targeted samples that are more enriched with cases, which we show translates to increased model discrimination and better statistical learning performance. We provide mathematical details, and simulation evidence that links sampling designs to their resulting prediction model performance. We discuss the impact of our proposed sampling on both model training and validation. Finally, we illustrate the proposed designs for outcome label collection and subsequent machine-learning, using radiology report text data from the Lumbar Imaging with Reporting of Epidemiology (LIRE) study.
Submitted 5 November, 2020; v1 submitted 31 March, 2019;
originally announced April 2019.
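The enrichment idea in the abstract above (stratifying selection on an auxiliary variable related to the rare outcome) can be illustrated with a small expected-count calculation. This is a minimal sketch, not the authors' design: the function name, the single binary surrogate, and all numeric values are hypothetical.

```python
def expected_cases(n_population, prevalence, sensitivity, specificity,
                   n_pos, n_neg):
    """Expected number of true cases under simple random sampling (SRS)
    versus sampling stratified on a binary surrogate.

    All inputs are hypothetical illustration values, not study data.
    n_pos / n_neg are the numbers sampled from the surrogate-positive
    and surrogate-negative strata.
    """
    cases = n_population * prevalence
    noncases = n_population - cases

    # Split the population by surrogate status.
    pos_cases = sensitivity * cases
    pos_noncases = (1.0 - specificity) * noncases
    neg_cases = cases - pos_cases
    neg_noncases = noncases - pos_noncases

    # Case fraction within each stratum.
    rate_pos = pos_cases / (pos_cases + pos_noncases)
    rate_neg = neg_cases / (neg_cases + neg_noncases)

    srs = (n_pos + n_neg) * prevalence           # same total sample size
    enriched = n_pos * rate_pos + n_neg * rate_neg
    return srs, enriched


if __name__ == "__main__":
    srs, enriched = expected_cases(
        n_population=10_000, prevalence=0.02,
        sensitivity=0.80, specificity=0.95,
        n_pos=100, n_neg=100,
    )
    print(f"SRS: {srs:.1f} expected cases; stratified: {enriched:.1f}")
```

With these hypothetical inputs, a sample of 200 is expected to contain about 4 cases under SRS and about 25 when half of the sample is drawn from the surrogate-positive stratum. An analysis of the stratified sample would still need to account for the unequal sampling probabilities.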
-
A tutorial on evaluating time-varying discrimination accuracy for survival models used in dynamic decision-making
Authors:
Aasthaa Bansal,
Patrick J. Heagerty
Abstract:
Many medical decisions involve the use of dynamic information collected on individual patients toward predicting likely transitions in their future health status. If accurate predictions are developed, then a prognostic model can identify patients at greatest risk for future adverse events, and may be used clinically to define populations appropriate for targeted intervention. In practice, a prognostic model is often used to guide decisions at multiple time points over the course of disease, and classification performance, i.e. sensitivity and specificity, for distinguishing high-risk versus low-risk individuals may vary over time as an individual's disease status and prognostic information change. In this tutorial, we detail contemporary statistical methods that can characterize the time-varying accuracy of prognostic survival models when used for dynamic decision-making. Although statistical methods for evaluating prognostic models with simple binary outcomes are well established, methods appropriate for survival outcomes are less well known and require time-dependent extensions of sensitivity and specificity to fully characterize longitudinal biomarkers or models. The methods we review are particularly important in that they allow for appropriate handling of censored outcomes commonly encountered with event-time data. We highlight the importance of determining whether clinical interest is in predicting cumulative (or prevalent) cases over a fixed future time interval versus predicting incident cases over a range of follow-up time, and whether patient information is static or updated over time. We discuss implementation of time-dependent ROC approaches using relevant R statistical software packages. The statistical summaries are illustrated using a liver prognostic model to guide transplantation in primary biliary cirrhosis.
Submitted 21 February, 2018; v1 submitted 29 June, 2017;
originally announced June 2017.
-
A Novel Tool to Evaluate the Accuracy of Predicting Survival in Cystic Fibrosis
Authors:
Aasthaa Bansal,
Nicole Mayer-Hamblett,
Christopher H. Goss,
Patrick J. Heagerty
Abstract:
Background: Effective allocation of limited donor lungs in cystic fibrosis (CF) requires accurate survival predictions, so that high-risk patients may be prioritized for transplantation. In practice, decisions about allocation are made dynamically, using routinely updated assessments. We present a novel tool for evaluating risk prediction models that, unlike traditional methods, captures the dynamic nature of decision-making. Methods: Predicted risk is used as a score to rank incident deaths versus patients who survive, with the goal of ranking the deaths higher. The mean rank across deaths at a given time measures time-specific predictive accuracy; when assessed over time, it reflects time-varying accuracy. Results: Applying this approach to CF Registry data on patients followed from 1993-2011, we show that traditional methods do not capture the performance of models used dynamically in the clinical setting. Previously proposed multivariate risk scores perform no better than forced expiratory volume in 1 second as a percentage of predicted normal (FEV1%) alone. Despite its value for survival prediction, FEV1% has a low sensitivity of 45% over time (for fixed specificity of 95%), leaving room for improvement in prediction. Finally, prediction accuracy with annually-updated FEV1% shows minor differences compared to FEV1% updated every 2 years, which may have clinical implications regarding the optimal frequency of updating clinical information. Conclusions: It is imperative to continue to develop models that accurately predict survival in CF. Our proposed approach can serve as the basis for evaluating the predictive ability of these models by better accounting for their dynamic clinical use.
Submitted 29 June, 2017;
originally announced June 2017.
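The "mean rank across deaths at a given time" described in the abstract above can be sketched as follows. This is one plausible reading, stated as an assumption: each incident death is given a percentile rank among the patients who survive that time point (ties count one half), and the ranks are averaged over deaths. The function name and the example scores are hypothetical, and the paper's exact definition may differ.

```python
def mean_rank_of_deaths(scores, died):
    """Mean percentile rank of incident deaths among survivors at one time.

    scores: predicted risk for each patient still at risk at this time.
    died:   True if the patient is an incident death at this time.
    Returns a value in [0, 1]; 1.0 means every death outranks every survivor.
    """
    deaths = [s for s, d in zip(scores, died) if d]
    survivors = [s for s, d in zip(scores, died) if not d]
    if not deaths or not survivors:
        raise ValueError("need at least one death and one survivor")

    def percentile_rank(x):
        # Fraction of survivors with a lower score; ties count one half.
        below = sum(1 for s in survivors if s < x)
        ties = sum(1 for s in survivors if s == x)
        return (below + 0.5 * ties) / len(survivors)

    return sum(percentile_rank(x) for x in deaths) / len(deaths)


if __name__ == "__main__":
    # Hypothetical risk set: one death with the highest predicted risk.
    print(mean_rank_of_deaths([0.9, 0.2, 0.4, 0.7],
                              [True, False, False, False]))
```

Evaluating this quantity at each observed death time and plotting it against time would give a picture of time-varying accuracy in the sense the abstract describes.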
-
Biased sampling designs to improve research efficiency: Factors influencing pulmonary function over time in children with asthma
Authors:
Jonathan S. Schildcrout,
Paul J. Rathouz,
Leila R. Zelnick,
Shawn P. Garbett,
Patrick J. Heagerty
Abstract:
Substudies of the Childhood Asthma Management Program [Control. Clin. Trials 20 (1999) 91-120; N. Engl. J. Med. 343 (2000) 1054-1063] seek to identify patient characteristics associated with asthma symptoms and lung function. To determine if genetic measures are associated with trajectories of lung function as measured by forced vital capacity (FVC), children in the primary cohort study retrospectively had candidate loci evaluated. Given participant burden and constraints on financial resources, it is often desirable to target a subsample for ascertainment of costly measures. Methods that can leverage the longitudinal outcome on the full cohort to selectively measure informative individuals have been promising, but have been restricted in their use to analysis of the targeted subsample. In this paper we detail two multiple imputation analysis strategies that exploit outcome and partially observed covariate data on the nonsampled subjects, and we characterize alternative design and analysis combinations that could be used for future studies of pulmonary function and other outcomes. Candidate predictor (e.g., IL10 cytokine polymorphisms) associations obtained from targeted sampling designs can be estimated with very high efficiency compared to standard designs. Further, even though multiple imputation can dramatically improve estimation efficiency for covariates available on all subjects (e.g., gender and baseline age), relatively modest efficiency gains were observed in parameters associated with predictors that are exclusive to the targeted sample. Our results suggest that future studies of longitudinal trajectories can be efficiently conducted by use of outcome-dependent designs and associated full cohort analysis.
Submitted 16 September, 2015;
originally announced September 2015.
-
Evaluating epoetin dosing strategies using observational longitudinal data
Authors:
Cecilia A. Cotton,
Patrick J. Heagerty
Abstract:
Epoetin is commonly used to treat anemia in chronic kidney disease and End Stage Renal Disease subjects undergoing dialysis; however, there is considerable uncertainty about what level of hemoglobin or hematocrit should be targeted in these subjects. In order to address this question, we treat epoetin dosing guidelines as a type of dynamic treatment regimen. Specifically, we present a methodology for comparing the effects of alternative treatment regimens on survival using observational data. In randomized trials patients can be assigned to follow a specific management guideline, but in observational studies subjects can have treatment paths that appear to be adherent to multiple regimens at the same time. We present a cloning strategy in which each subject contributes follow-up data to each treatment regimen to which they are continuously adherent and is artificially censored at first nonadherence. We detail an inverse probability weighted log-rank test with a valid asymptotic variance estimate that can be used to test survival distributions under two regimens. To compare multiple regimens, we propose several marginal structural Cox proportional hazards models with robust variance estimation to account for the creation of clones. The methods are illustrated through simulations and applied to an analysis comparing epoetin dosing regimens in a cohort of 33,873 adult hemodialysis patients from the United States Renal Data System.
Submitted 3 February, 2015;
originally announced February 2015.