Combining an experimental study with external data: study designs and identification strategies

Authors: Lawson Ung, Guanbo Wang, Sebastien Haneuse, Miguel A. Hernan, Issa J. Dahabreh

Abstract: There is increasing interest in combining information from experimental studies, including randomized and single-group trials, with information from external experimental or observational data sources. Such efforts are usually motivated by the desire to compare treatments evaluated in different studies -- for instance, through the introduction of external treatment groups -- or to estimate treatment effects with greater precision. Proposals to combine experimental studies with external data were made at least as early as the 1970s, but in recent years have come under increasing consideration by regulatory agencies involved in drug and device evaluation, particularly with the increasing availability of rich observational data. In this paper, we describe basic templates of study designs and data structures for combining information from experimental studies with external data, and use the potential (counterfactual) outcomes framework to elaborate identification strategies for potential outcome means and average treatment effects in these designs. In formalizing designs and identification strategies for combining information from experimental studies with external data, we hope to provide a conceptual foundation to support the systematic use and evaluation of such efforts.

Submitted 5 June, 2024; originally announced June 2024.

Comments: First submission

arXiv:2302.00322 [pdf, other]

doi 10.1016/j.asr.2022.11.041

First measurements of periodicities and anisotropies of cosmic ray flux observed with a water-Cherenkov detector at the Marambio Antarctic base

Authors: Santos Noelia, Dasso Sergio, Gulisano Adriana María, Areso Omar, Pereira Matías, Asorey Hernán, Rubinstein Lucas

Abstract: A new water-Cherenkov radiation detector, located at the Argentine Marambio Antarctic Base (64.24S-56.62W), has been monitoring the variability of galactic cosmic ray (GCR) flux since 2019. One of the main aims is to provide experimental data necessary to study interplanetary transport of GCRs during transient events at different space/time scales. In this paper we present the detector and analyze observations made during one full year. After the analysis and correction of the GCR flux variability due to the atmospheric conditions (pressure and temperature), a study of the periodicities is performed in order to analyze modulations due to heliospheric phenomena. We can observe two periods: (a) 1 day, associated with the Earth's rotation combined with the spatial anisotropy of the GCR flux; and (b) $\sim$ 30 days due to solar impact of stable solar structures combined with the rotation of the Sun. From a superposed epoch analysis, and considering the geomagnetic effects, the mean diurnal amplitude is $\sim$ 0.08% and the maximum flux is observed in $\sim$ 15 hr local time (LT) direction in the interplanetary space. In such a way, we determine the capability of Neurus to observe anisotropies and other interplanetary modulations on the GCR flux arriving at the Earth.

Submitted 1 February, 2023; originally announced February 2023.

Comments: to be published in Advances in Space Research

arXiv:2211.04876 [pdf, other]

Generalizing and transporting inferences about the effects of treatment assignment subject to non-adherence

Authors: Issa J. Dahabreh, Sarah E. Robertson, Miguel A. Hernán

Abstract: We discuss the identifiability of causal estimands for generalizability and transportability analyses, both under perfect and imperfect adherence to treatment assignment. We consider a setting where the trial data contain information on baseline covariates, assignment at baseline, intervention at baseline (point treatment), and outcomes; and where the data from non-randomized individuals only contain information on baseline covariates. In this setting, we review identification results under perfect adherence and study two examples in which non-adherence severely limits the ability to transport inferences about the effects of treatment assignment to the target population. In the first example, trial participation has a direct effect on treatment receipt and, through treatment receipt, on the outcome (a "trial engagement effect" via adherence). In the second example, participation in the trial has unmeasured common causes with treatment receipt. In both examples, the effect of assignment on the outcome in the target population is not identifiable. In the first example, however, the effect of joint interventions to scale-up trial activities that affect adherence and assign treatment is identifiable. We conclude that generalizability and transportability analyses should consider trial engagement effects via adherence and selection for participation on the basis of unmeasured factors that influence adherence.

Submitted 9 November, 2022; originally announced November 2022.

arXiv:2207.09982 [pdf, other]

Global sensitivity analysis for studies extending inferences from a randomized trial to a target population

Authors: Issa J. Dahabreh, James M. Robins, Sebastien J-P. A. Haneuse, Sarah E. Robertson, Jon A. Steingrimsson, Miguel A. Hernán

Abstract: When individuals participating in a randomized trial differ with respect to the distribution of effect modifiers compared with the target population where the trial results will be used, treatment effect estimates from the trial may not directly apply to the target population. Methods for extending -- generalizing or transporting -- causal inferences from the trial to the target population rely on conditional exchangeability assumptions between randomized and non-randomized individuals. The validity of these assumptions is often uncertain or controversial and investigators need to examine how violation of the assumptions would impact study conclusions. We describe methods for global sensitivity analysis that directly parameterize violations of the assumptions in terms of potential (counterfactual) outcome distributions. Our approach does not require detailed knowledge about the distribution of specific unmeasured effect modifiers or their relationship with the observed variables. We illustrate the methods using data from a trial nested within a cohort of trial-eligible individuals to compare coronary artery surgery plus medical therapy versus medical therapy alone for stable ischemic heart disease.

Submitted 20 July, 2022; originally announced July 2022.

Comments: first submission

arXiv:2203.14857 [pdf, ps, other]

Randomized trials and their observational emulations: a framework for benchmarking and joint analysis

Authors: Issa J. Dahabreh, Jon A. Steingrimsson, James M. Robins, Miguel A. Hernán

Abstract: A randomized trial and an analysis of observational data designed to emulate the trial sample observations separately, but have the same eligibility criteria, collect information on some shared baseline covariates, and compare the effects of the same treatments on the same outcomes. Treatment effect estimates from the trial and its emulation can be compared to benchmark observational analysis methods. In a simplified setting with complete adherence to the assigned treatment strategy and no loss-to-follow-up, we show that benchmarking relies on an exchangeability condition between the populations underlying the trial and its emulation, to account for differences in the distribution of covariates between them. When this exchangeability condition holds, and the usual conditions needed for the estimates from the trial and its emulation to have a causal interpretation also hold, we derive restrictions on the law of the observed data. When the data are compatible with the restrictions, joint analysis of the trial and its emulation is possible. When the data are incompatible with the restrictions, a discrepancy between (1) estimates based on extending inferences from the trial to the population underlying the emulation and (2) the emulation itself may reflect either inability to benchmark (e.g., due to selective participation into the trial) or a failure of the emulation (e.g., due to unmeasured confounding), but we cannot use the data to determine which is the case. Our analysis reveals how benchmarking attempts combine causal assumptions, data analysis methods, and substantive knowledge to examine the validity of observational analysis methods.

Submitted 28 March, 2022; originally announced March 2022.

arXiv:2103.03857 [pdf, other]

doi 10.1097/EDE.0000000000001431

Revisiting the g-null paradox

Authors: Sean McGrath, Jessica G. Young, Miguel A. Hernán

Abstract: The parametric g-formula is an approach to estimating causal effects of sustained treatment strategies from observational data. An often cited limitation of the parametric g-formula is the g-null paradox: a phenomenon in which model misspecification in the parametric g-formula is guaranteed under the conditions that motivate its use (i.e., when identifiability conditions hold and measured time-varying confounders are affected by past treatment). Many users of the parametric g-formula know they must acknowledge the g-null paradox as a limitation when reporting results but still require clarity on its meaning and implications. Here we revisit the g-null paradox to clarify its role in causal inference studies. In doing so, we present analytic examples and a simulation-based illustration of the bias of parametric g-formula estimates under the conditions associated with this paradox. Our results highlight the importance of avoiding overly parsimonious models for the components of the g-formula when using this method.

Submitted 5 March, 2021; originally announced March 2021.

Journal ref: Epidemiology 33 (2022) 114-120

arXiv:2005.00221 [pdf, other]

A Formal Causal Interpretation of the Case-Crossover Design

Authors: Zach Shahn, Miguel A. Hernan, James M. Robins

Abstract: The case-crossover design (Maclure, 1991) is widely used in epidemiology and other fields to study causal effects of transient treatments on acute outcomes. However, its validity and causal interpretation have only been justified under informal conditions. Here, we place the design in a formal counterfactual framework for the first time. Doing so helps to clarify its assumptions and interpretation. In particular, when the treatment effect is non-null, we identify a previously unnoticed bias arising from common causes of the outcome at different person-times. We analytically characterize the direction and size of this bias and demonstrate its potential importance with a simulation. We also use our derivation of the limit of the case-crossover estimator to analyze its sensitivity to treatment effect heterogeneity, a violation of one of the informal criteria for validity. The upshot of this work for practitioners is that, while the case-crossover design can be useful for testing the causal null hypothesis in the presence of baseline confounders, extra caution is warranted when using the case-crossover design for point estimation of causal effects.

Submitted 19 November, 2021; v1 submitted 1 May, 2020; originally announced May 2020.

arXiv:2004.14824 [pdf, other]

Generalized interpretation and identification of separable effects in competing event settings

Authors: Mats J. Stensrud, Miguel A. Hernán, Eric J. Tchetgen Tchetgen, James M. Robins, Vanessa Didelez, Jessica G. Young

Abstract: In competing event settings, a counterfactual contrast of cause-specific cumulative incidences quantifies the total causal effect of a treatment on the event of interest. However, effects of treatment on the competing event may indirectly contribute to this total effect, complicating its interpretation. We previously proposed the separable effects (Stensrud et al, 2019) to define direct and indirect effects of the treatment on the event of interest. This definition presupposes a treatment decomposition into two components acting along two separate causal pathways, one exclusively outside of the competing event and the other exclusively through it. Unlike previous definitions of direct and indirect effects, the separable effects can be subject to empirical scrutiny in a study where separate interventions on the treatment components are available. Here we extend and generalize the notion of the separable effects in several ways, allowing for interpretation, identification and estimation under considerably weaker assumptions. We propose and discuss a definition of separable effects that is applicable to general time-varying structures, where the separable effects can still be meaningfully interpreted, even when they cannot be regarded as direct and indirect effects. We further derive weaker conditions for identification of separable effects in observational studies where decomposed treatments are not yet available; in particular, these conditions allow for time-varying common causes of the event of interest, the competing events and loss to follow-up. For these general settings, we propose semi-parametric weighted estimators that are straightforward to implement. As an illustration, we apply the estimators to study the separable effects of intensive blood pressure therapy on acute kidney injury, using data from a randomized clinical trial.

Submitted 4 May, 2020; v1 submitted 30 April, 2020; originally announced April 2020.

arXiv:2002.11846 [pdf, other]

Causal inference with limited resources: proportionally-representative interventions

Authors: Aaron L. Sarvet, Kerollos N. Wanis, Jessica Young, Roberto Hernandez-Alejandro, Miguel A. Hernán, Mats J. Stensrud

Abstract: Investigators often evaluate treatment effects by considering settings in which all individuals are assigned a treatment of interest, assuming that an unlimited number of treatment units are available. However, many real-life treatments are of limited supply and cannot be provided to all individuals in the population. For example, patients on the liver transplant waiting list cannot be assigned a liver transplant immediately at the time they reach highest priority because a suitable organ is not likely to be immediately available. In these cases, investigators may still be interested in the effects of treatment strategies in which a finite number of organs are available at a given time, that is, treatment regimes that satisfy resource constraints. Here, we describe an estimand that can be used to define causal effects of treatment strategies that satisfy resource constraints: proportionally-representative interventions for limited resources. We derive a simple class of inverse probability weighted estimators, and apply one such estimator to evaluate the effect of restricting or expanding utilization of "increased risk" liver organs to treat patients with end-stage liver disease. Our method is designed to evaluate policy-relevant interventions in the setting of finite treatment resources.

Submitted 28 February, 2020; v1 submitted 26 February, 2020; originally announced February 2020.

arXiv:1911.06030 [pdf]

Guidelines for estimating causal effects in pragmatic randomized trials

Authors: Eleanor J. Murray, Sonja A. Swanson, Miguel A. Hernán

Abstract: Pragmatic randomized trials are designed to provide evidence for clinical decision-making rather than regulatory approval. Common features of these trials include the inclusion of heterogeneous or diverse patient populations in a wide range of care settings, the use of active treatment strategies as comparators, unblinded treatment assignment, and the study of long-term, clinically relevant outcomes. These features can greatly increase the usefulness of the trial results for patients, clinicians, and other stakeholders. However, these features also introduce an increased risk of non-adherence, which reduces the value of the intention-to-treat effect as a patient-centered measure of causal effect. In these settings, the per-protocol effect provides useful complementary information for decision making. Unfortunately, there is little guidance for valid estimation of the per-protocol effect. Here, we present our full guidelines for analyses of pragmatic trials that will result in more informative causal inferences for both the intention-to-treat effect and the per-protocol effect.

Submitted 19 November, 2019; v1 submitted 14 November, 2019; originally announced November 2019.

arXiv:1908.09230 [pdf, ps, other]

Efficient and robust methods for causally interpretable meta-analysis: transporting inferences from multiple randomized trials to a target population

Authors: Issa J. Dahabreh, Sarah E. Robertson, Lucia C. Petito, Miguel A. Hernán, Jon A. Steingrimsson

Abstract: We present methods for causally interpretable meta-analyses that combine information from multiple randomized trials to estimate potential (counterfactual) outcome means and average treatment effects in a target population. We consider identifiability conditions, derive implications of the conditions for the law of the observed data, and obtain identification results for transporting causal inferences from a collection of independent randomized trials to a new target population in which experimental data may not be available. We propose an estimator for the potential (counterfactual) outcome mean in the target population under each treatment studied in the trials. The estimator uses covariate, treatment, and outcome data from the collection of trials, but only covariate data from the target population sample. We show that it is doubly robust, in the sense that it is consistent and asymptotically normal when at least one of the models it relies on is correctly specified. We study the finite sample properties of the estimator in simulation studies and demonstrate its implementation using data from a multi-center randomized trial.

Submitted 4 February, 2022; v1 submitted 24 August, 2019; originally announced August 2019.

arXiv:1908.07072 [pdf, other]

doi 10.1016/j.patter.2020.100008

gfoRmula: An R package for estimating effects of general time-varying treatment interventions via the parametric g-formula

Authors: Victoria Lin, Sean McGrath, Zilu Zhang, Lucia C. Petito, Roger W. Logan, Miguel A. Hernán, Jessica G. Young

Abstract: Researchers are often interested in using longitudinal data to estimate the causal effects of hypothetical time-varying treatment interventions on the mean or risk of a future outcome. Standard regression/conditioning methods for confounding control generally fail to recover causal effects when time-varying confounders are themselves affected by past treatment. In such settings, estimators derived from Robins's g-formula may recover time-varying treatment effects provided sufficient covariates are measured to control confounding by unmeasured risk factors. The package gfoRmula implements in R one such estimator: the parametric g-formula. This estimator easily adapts to binary or continuous time-varying treatments as well as contrasts defined by static or dynamic, deterministic or random treatment interventions, as well as interventions that depend on the natural value of treatment. The package accommodates survival outcomes as well as binary or continuous end of follow-up outcomes. For survival outcomes, the package has different options for handling competing events. This paper describes the gfoRmula package, along with motivating background, features, and examples.

Submitted 29 October, 2019; v1 submitted 19 August, 2019; originally announced August 2019.

Comments: V. Lin and S. McGrath made equal contributions. M.A. Hernan and J.G. Young made equal contributions

Journal ref: Patterns 1 (2020) 100008

arXiv:1906.10792 [pdf, ps, other]

Generalizing causal inferences from randomized trials: counterfactual and graphical identification

Authors: Issa J. Dahabreh, James M. Robins, Sebastien J-P. A. Haneuse, Miguel A. Hernán

Abstract: When engagement with a randomized trial is driven by factors that affect the outcome or when trial engagement directly affects the outcome independent of treatment, the average treatment effect among trial participants is unlikely to generalize to a target population. In this paper, we use counterfactual and graphical causal models to examine under what conditions we can generalize causal inferences from a randomized trial to the target population of trial-eligible individuals. We offer an interpretation of generalizability analyses using the notion of a hypothetical intervention to "scale-up" trial engagement to the target population. We consider the interpretation of generalizability analyses when trial engagement does or does not directly affect the outcome, highlight connections with censoring in longitudinal studies, and discuss identification of the distribution of counterfactual outcomes via g-formula computation and inverse probability weighting. Last, we show how the methods can be extended to address time-varying treatments, non-adherence, and censoring.

Submitted 25 June, 2019; originally announced June 2019.

Comments: first upload

arXiv:1905.10684 [pdf, other]

Sensitivity analysis using bias functions for studies extending inferences from a randomized trial to a target population

Authors: Issa J. Dahabreh, James M. Robins, Sebastien J-P. A. Haneuse, Iman Saeed, Sarah E. Robertson, Elisabeth A. Stuart, Miguel A. Hernán

Abstract: Extending (generalizing or transporting) causal inferences from a randomized trial to a target population requires "generalizability" or "transportability" assumptions, which state that randomized and non-randomized individuals are exchangeable conditional on baseline covariates. These assumptions are made on the basis of background knowledge, which is often uncertain or controversial, and need to be subjected to sensitivity analysis. We present simple methods for sensitivity analyses that do not require detailed background knowledge about specific unknown or unmeasured determinants of the outcome or modifiers of the treatment effect. Instead, our methods directly parameterize violations of the assumptions using bias functions. We show how the methods can be applied to non-nested trial designs, where the trial data are combined with a separately obtained sample of non-randomized individuals, as well as to nested trial designs, where a clinical trial is embedded within a cohort sampled from the target population. We illustrate the methods using data from a clinical trial comparing treatments for chronic hepatitis C infection.

Submitted 25 May, 2019; originally announced May 2019.

Comments: first submission

arXiv:1905.07764 [pdf, other]

Study designs for extending causal inferences from a randomized trial to a target population

Authors: Issa J. Dahabreh, Sebastien J-P. A. Haneuse, James M. Robins, Sarah E. Robertson, Ashley L. Buchanan, Elisabeth A. Stuart, Miguel A. Hernán

Abstract: We examine study designs for extending (generalizing or transporting) causal inferences from a randomized trial to a target population. Specifically, we consider nested trial designs, where randomized individuals are nested within a sample from the target population, and non-nested trial designs, including composite dataset designs, where a randomized trial is combined with a separately obtained sample of non-randomized individuals from the target population. We show that the causal quantities that can be identified in each study design depend on what is known about the probability of sampling non-randomized individuals. For each study design, we examine identification of potential outcome means via the g-formula and inverse probability weighting. Last, we explore the implications of the sampling properties underlying the designs for the identification and estimation of the probability of trial participation.

Submitted 19 May, 2019; originally announced May 2019.

Comments: first submission

arXiv:1903.11455 [pdf, other]

Towards causally interpretable meta-analysis: transporting inferences from multiple studies to a target population

Authors: Issa J. Dahabreh, Lucia C. Petito, Sarah E. Robertson, Miguel A. Hernán, Jon A. Steingrimsson

Abstract: We take steps towards causally interpretable meta-analysis by describing methods for transporting causal inferences from a collection of randomized trials to a new target population, one-trial-at-a-time and pooling all trials. We discuss identifiability conditions for average treatment effects in the target population and provide identification results. We show that assuming inferences are transportable from all trials in the collection to the same target population has implications for the law underlying the observed data. We propose average treatment effect estimators that rely on different working models and provide code for their implementation in statistical software. We discuss how to use the data to examine whether transported inferences are homogeneous across the collection of trials, sketch approaches for sensitivity analysis to violations of the identifiability conditions, and describe extensions to address non-adherence in the trials. Last, we illustrate the proposed methods using data from the HALT-C multi-center trial.

Submitted 8 February, 2020; v1 submitted 27 March, 2019; originally announced March 2019.

arXiv:1903.06488 [pdf, other]

A Note on Estimating Optimal Dynamic Treatment Strategies Under Resource Constraints Using Dynamic Marginal Structural Models

Authors: Ellen C Caniglia, Eleanor J Murray, Miguel A Hernan, Zach Shahn

Abstract: Existing strategies for determining the optimal treatment or monitoring strategy typically assume unlimited access to resources. However, when a health system has resource constraints, such as limited funds, access to medication, or monitoring capabilities, medical decisions must balance impacts on both individual and population health outcomes. That is, decisions should account for competition between individuals in resource usage. One simple solution is to estimate the (counterfactual) resource usage under the possible interventions and choose the optimal strategy for which resource usage is within acceptable limits. We propose a method to identify the optimal dynamic intervention strategy that leads to the best expected health outcome accounting for a health system's resource constraints. We then apply this method to determine the optimal dynamic monitoring strategy for people living with HIV when resource limits on monitoring exist using observational data from the HIV-CAUSAL Collaboration.

Submitted 14 February, 2019; originally announced March 2019.

arXiv:1902.06080 [pdf, other]

Generalizing trial findings using nested trial designs with sub-sampling of non-randomized individuals

Authors: Issa J. Dahabreh, Miguel A. Hernan, Sarah E. Robertson, Ashley Buchanan, Jon A. Steingrimsson

Abstract: To generalize inferences from a randomized trial to the target population of all trial-eligible individuals, investigators can use nested trial designs, where the randomized individuals are nested within a cohort of trial-eligible individuals, including those who are not offered or refuse randomization. In these designs, data on baseline covariates are collected from the entire cohort, and treatment and outcome data need only be collected from randomized individuals. In this paper, we describe nested trial designs that improve research economy by collecting additional baseline covariate data after sub-sampling non-randomized individuals (i.e., a two-stage design), using sampling probabilities that may depend on the initial set of baseline covariates available from all individuals in the cohort. We propose an estimator for the potential outcome mean in the target population of all trial-eligible individuals and show that our estimator is doubly robust, in the sense that it is consistent when either the model for the conditional outcome mean among randomized individuals or the model for the probability of trial participation is correctly specified. We assess the impact of sub-sampling on the asymptotic variance of our estimator and examine the estimator's finite-sample performance in a simulation study. We illustrate the methods using data from the Coronary Artery Surgery Study (CASS).

Submitted 7 March, 2019; v1 submitted 16 February, 2019; originally announced February 2019.

Comments: added acknowledgements, fixed some wording imperfections

arXiv:1901.09472 [pdf, other]

Separable Effects for Causal Inference in the Presence of Competing Events

Authors: Mats J. Stensrud, Jessica G. Young, Vanessa Didelez, James M. Robins, Miguel A. Hernán

Abstract: In time-to-event settings, the presence of competing events complicates the definition of causal effects. Here we propose the new separable effects to study the causal effect of a treatment on an event of interest. The separable direct effect is the treatment effect on the event of interest not mediated by its effect on the competing event. The separable indirect effect is the treatment effect on the event of interest only through its effect on the competing event. Similar to Robins and Richardson's extended graphical approach for mediation analysis, the separable effects can only be identified under the assumption that the treatment can be decomposed into two distinct components that exert their effects through distinct causal pathways. Unlike existing definitions of causal effects in the presence of competing events, our estimands do not require cross-world contrasts or hypothetical interventions to prevent death. As an illustration, we apply our approach to a randomized clinical trial on estrogen therapy in individuals with prostate cancer.

Submitted 13 February, 2020; v1 submitted 27 January, 2019; originally announced January 2019.

arXiv:1806.06136 [pdf, ps, other]

A causal framework for classical statistical estimands in failure time settings with competing events

Authors: Jessica G. Young, Mats J. Stensrud, Eric J. Tchetgen Tchetgen, Miguel A. Hernán

Abstract: In failure-time settings, a competing risk event is any event that makes it impossible for the event of interest to occur. For example, cardiovascular disease death is a competing event for prostate cancer death because an individual cannot die of prostate cancer once he has died of cardiovascular disease. Various statistical estimands have been defined as possible targets of inference in the classical competing risks literature. Many reviews have described these statistical estimands and their estimating procedures with recommendations about their use. However, this previous work has not used a formal framework for characterizing causal effects and their identifying conditions, which makes it difficult to interpret effect estimates and assess recommendations regarding analytic choices. Here we use a counterfactual framework to explicitly define each of these classical estimands. We clarify that, depending on whether competing events are defined as censoring events, contrasts of risks can define a total effect of the treatment on the event of interest, or a direct effect of the treatment on the event of interest not mediated through the competing event. In contrast, regardless of whether competing events are defined as censoring events, counterfactual hazard contrasts cannot generally be interpreted as causal effects. We illustrate how identifying assumptions for all of these counterfactual estimands can be represented in causal diagrams in which competing events are depicted as time-varying covariates. We present an application of these ideas to data from a randomized trial designed to estimate the effect of estrogen therapy on prostate cancer mortality.

Submitted 6 November, 2019; v1 submitted 15 June, 2018; originally announced June 2018.

arXiv:1805.00550 [pdf, other]

Extending inferences from a randomized trial to a new target population

Authors: Issa J. Dahabreh, Sarah E. Robertson, Jon A. Steingrimsson, Elizabeth A. Stuart, Miguel A. Hernan

Abstract: When treatment effect modifiers influence the decision to participate in a randomized trial, the average treatment effect in the population represented by the randomized individuals will differ from the effect in other populations. In this tutorial, we consider methods for extending causal inferences about time-fixed treatments from a trial to a new target population of non-participants, using data from a completed randomized trial and baseline covariate data from a sample from the target population. We examine methods based on modeling the expectation of the outcome, the probability of participation, or both (doubly robust). We compare the methods in a simulation study and show how they can be implemented in software. We apply the methods to a randomized trial nested within a cohort of trial-eligible patients to compare coronary artery surgery plus medical therapy versus medical therapy alone for patients with chronic coronary artery disease. We conclude by discussing issues that arise when using the methods in applied analyses.

Submitted 28 October, 2019; v1 submitted 1 May, 2018; originally announced May 2018.

Comments: changes in section 5

arXiv:1804.10846 [pdf]

doi 10.1080/09332480.2019.1579578

Data science is science's second chance to get causal inference right: A classification of data science tasks

Authors: Miguel A. Hernán, John Hsu, Brian Healy

Abstract: Causal inference from observational data is the goal of many data analyses in the health and social sciences. However, academic statistics has often frowned upon data analyses with a causal objective. The introduction of the term "data science" provides a historic opportunity to redefine data analysis in such a way that it naturally accommodates causal inference from observational data. Like others before, we organize the scientific contributions of data science into three classes of tasks: Description, prediction, and counterfactual prediction (which includes causal inference). An explicit classification of data science tasks is necessary to discuss the data, assumptions, and analytics required to successfully accomplish each task. We argue that a failure to adequately describe the role of subject-matter expert knowledge in data analysis is a source of widespread misunderstandings about data science. Specifically, causal analyses typically require not only good data and algorithms, but also domain expert knowledge. We discuss the implications for the use of data science to guide decision-making in the real world and to train data scientists.

Submitted 6 April, 2019; v1 submitted 28 April, 2018; originally announced April 2018.

Journal ref: Chance 32(1):42-49 (2019)

arXiv:1410.0477 [pdf, ps, other]

doi 10.1214/14-STS491

Think Globally, Act Globally: An Epidemiologist's Perspective on Instrumental Variable Estimation

Authors: Sonja A. Swanson, Miguel A. Hernán

Abstract: Discussion of "Instrumental Variables: An Econometrician's Perspective" by Guido W. Imbens [arXiv:1410.0163].

Submitted 2 October, 2014; originally announced October 2014.

Comments: Published at http://dx.doi.org/10.1214/14-STS491 in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org)

Report number: IMS-STS-STS491

Journal ref: Statistical Science 2014, Vol. 29, No. 3, 371-374

Showing 1–23 of 23 results for author: Hernán, A