-
Empirically assessing the plausibility of unconfoundedness in observational studies
Authors:
Fernando Pires Hartwig,
Kate Tilling,
George Davey Smith
Abstract:
The possibility of unmeasured confounding is one of the main limitations for causal inference from observational studies. There are different methods for partially empirically assessing the plausibility of unconfoundedness. However, most currently available methods require (at least partial) assumptions about the confounding structure, which may be difficult to know in practice. In this paper we describe a simple strategy for empirically assessing the plausibility of conditional unconfoundedness (i.e., whether the candidate set of covariates suffices for confounding adjustment) which does not require any assumptions about the confounding structure, requiring instead assumptions related to temporal ordering between covariates, exposure and outcome (which can be guaranteed by design), measurement error and selection into the study. The proposed method essentially relies on testing the association between a subset of covariates (those associated with the exposure given all other covariates) and the outcome conditional on the remaining covariates and the exposure. We describe the assumptions underlying the method, provide proofs, use simulations to corroborate the theory and illustrate the method with an applied example assessing the causal effect of length-for-age measured in childhood on intelligence quotient measured in adulthood using data from the 1982 Pelotas (Brazil) birth cohort. We also discuss the implications of measurement error and some important limitations.
Submitted 15 February, 2024;
originally announced February 2024.
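The conditional-association check described in the abstract above can be illustrated with a small simulation. This is a minimal sketch, not the authors' procedure: the linear Gaussian data-generating model, the coefficient values, the variable names (`z`, `c`, `u`) and the helper functions are all hypothetical, and it reads a nonzero coefficient as a sign of residual confounding, which is an interpretation assumed here rather than stated in the abstract. A covariate `z` affects the exposure `x`; the outcome `y` is regressed on `x`, the remaining covariate `c` and `z`, with and without an unmeasured confounder `u`.

```python
import random


def ols(columns, y):
    """Least-squares coefficients of y on the given columns (no intercept added)."""
    k = len(columns)
    # Augmented normal equations: (A^T A | A^T y).
    m = [
        [sum(a * b for a, b in zip(columns[i], columns[j])) for j in range(k)]
        + [sum(a * b for a, b in zip(columns[i], y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for i in range(k):
        p = max(range(i, k), key=lambda r: abs(m[r][i]))
        m[i], m[p] = m[p], m[i]
        for r in range(k):
            if r != i:
                f = m[r][i] / m[i][i]
                m[r] = [vr - f * vi for vr, vi in zip(m[r], m[i])]
    return [m[i][k] / m[i][i] for i in range(k)]


def z_coefficient(confounded, n=20000, seed=1):
    """Coefficient of z when regressing y on (1, x, c, z) in simulated data."""
    rng = random.Random(seed)
    a = 1.0 if confounded else 0.0  # hypothetical strength of the unmeasured confounder u
    ones, xs, cs, zs, ys = [], [], [], [], []
    for _ in range(n):
        z, c, u = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        x = z + c + a * u + rng.gauss(0, 1)        # z is associated with x given c
        y = 0.5 * x + c + a * u + rng.gauss(0, 1)  # z has no direct effect on y
        ones.append(1.0)
        xs.append(x)
        cs.append(c)
        zs.append(z)
        ys.append(y)
    return ols([ones, xs, cs, zs], ys)[3]


coef_unconfounded = z_coefficient(confounded=False)
coef_confounded = z_coefficient(confounded=True)
print("z coefficient, c sufficient for adjustment:", round(coef_unconfounded, 3))
print("z coefficient, unmeasured confounder u:   ", round(coef_confounded, 3))
```

Under this assumed model the coefficient of `z` is close to zero when `c` suffices for adjustment and clearly negative (about -0.5 in expectation) when `u` is present, because conditioning on `x` links `z` to `u`. A real analysis would use a formal test and the assumptions on temporal ordering, measurement error and selection listed in the abstract.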
-
Relationship between Collider Bias and Interactions on the Log-Additive Scale
Authors:
Apostolos Gkatzionis,
Shaun R. Seaman,
Rachael A. Hughes,
Kate Tilling
Abstract:
Collider bias occurs when conditioning on a common effect (collider) of two variables $X$ and $Y$. In this manuscript, we quantify the collider bias in the estimated association between exposure $X$ and outcome $Y$ induced by selecting on one value of a binary collider $S$ of the exposure and the outcome. In the case of logistic regression, it is known that the magnitude of the collider bias in the exposure-outcome regression coefficient is proportional to the strength of interaction $\delta_3$ between $X$ and $Y$ in a log-additive model for the collider: $\mathbb{P}(S = 1 \mid X, Y) = \exp \left\{ \delta_0 + \delta_1 X + \delta_2 Y + \delta_3 X Y \right\}$. We show that this result also holds under a linear or Poisson regression model for the exposure-outcome association. We then illustrate by simulation that even if a log-additive model with interactions is not the true model for the collider, the interaction term in such a model is still informative about the magnitude of collider bias. Finally, we discuss the implications of these findings for methods that attempt to adjust for collider bias, such as inverse probability weighting, which is often implemented without including interactions between variables in the weighting model.
Submitted 7 August, 2023; v1 submitted 1 August, 2023;
originally announced August 2023.
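The logistic-regression case mentioned in the abstract above can be checked exactly for binary $X$ and $Y$. The sketch below is an illustration under assumed numbers: the joint distribution `joint` and the selection parameters are hypothetical, and the helper names are not from the paper. Each cell of the 2x2 table is weighted by the log-additive selection probability, and the log odds ratio among the selected is compared with the population value.

```python
import math


def odds_ratio(table):
    """Odds ratio of a 2x2 table indexed as table[x][y]; weights need not sum to 1."""
    return (table[1][1] * table[0][0]) / (table[1][0] * table[0][1])


def selected_table(joint, d0, d1, d2, d3):
    """Weight each cell by P(S=1 | X=x, Y=y) = exp(d0 + d1*x + d2*y + d3*x*y)."""
    out = [[0.0, 0.0], [0.0, 0.0]]
    for x in (0, 1):
        for y in (0, 1):
            prob = math.exp(d0 + d1 * x + d2 * y + d3 * x * y)
            if not 0.0 < prob <= 1.0:
                raise ValueError("selection probability outside (0, 1]")
            out[x][y] = joint[x][y] * prob
    return out


# Hypothetical population distribution P(X=x, Y=y).
joint = [[0.4, 0.1], [0.2, 0.3]]
log_or_population = math.log(odds_ratio(joint))

# Hypothetical selection models: without and with an X-Y interaction.
no_interaction = selected_table(joint, -2.0, 0.5, 0.7, 0.0)
with_interaction = selected_table(joint, -2.0, 0.5, 0.7, 0.4)

bias_no_interaction = math.log(odds_ratio(no_interaction)) - log_or_population
bias_with_interaction = math.log(odds_ratio(with_interaction)) - log_or_population
print("bias without interaction:", bias_no_interaction)
print("bias with interaction:   ", bias_with_interaction)
```

In this binary setting the main-effect terms cancel in the odds ratio, so the shift in the log odds ratio among the selected equals the interaction parameter (0.4 here) and vanishes when the interaction is zero. The linear and Poisson cases treated in the paper are not covered by this sketch.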
-
Using Instruments for Selection to Adjust for Selection Bias in Mendelian Randomization
Authors:
Apostolos Gkatzionis,
Eric J. Tchetgen Tchetgen,
Jon Heron,
Kate Northstone,
Kate Tilling
Abstract:
Selection bias is a common concern in epidemiologic studies. In the literature, selection bias is often viewed as a missing data problem. Popular approaches to adjust for bias due to missing data, such as inverse probability weighting, rely on the assumption that data are missing at random and can yield biased results if this assumption is violated. In observational studies with outcome data missing not at random, Heckman's sample selection model can be used to adjust for bias due to missing data. In this paper, we review Heckman's method and a similar approach proposed by Tchetgen Tchetgen and Wirth (2017). We then discuss how to apply these methods to Mendelian randomization analyses using individual-level data, with missing data for either the exposure or outcome or both. We explore whether genetic variants associated with participation can be used as instruments for selection. We then describe how to obtain missingness-adjusted Wald ratio, two-stage least squares and inverse variance weighted estimates. The two methods are evaluated and compared in simulations, with results suggesting that they can both mitigate selection bias but may yield parameter estimates with large standard errors in some settings. In an illustrative real-data application, we investigate the effects of body mass index on smoking using data from the Avon Longitudinal Study of Parents and Children.
Submitted 13 April, 2024; v1 submitted 4 August, 2022;
originally announced August 2022.
-
Mixed-effects location scale models for joint modelling school value-added effects on the mean and variance of student achievement
Authors:
George Leckie,
Richard Parker,
Harvey Goldstein,
Kate Tilling
Abstract:
School value-added models are widely applied to study, monitor, and hold schools to account for school differences in student learning. The traditional model is a mixed-effects linear regression of student current achievement on student prior achievement, background characteristics, and a school random intercept effect. The latter is referred to as the school value-added score and measures the mean student covariate-adjusted achievement in each school. In this article, we argue that further insights may be gained by additionally studying the variance in this quantity in each school. These include the ability to identify both individual schools and school types that exhibit unusually high or low variability in student achievement, even after accounting for differences in student intakes. We explore and illustrate how this can be done via fitting mixed-effects location scale versions of the traditional school value-added model. We discuss the implications of our work for research and school accountability systems.
Submitted 30 September, 2023; v1 submitted 5 October, 2021;
originally announced October 2021.
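For orientation, a generic two-level mixed-effects location scale model consistent with the description in the abstract above can be written as follows. The notation is an assumption of this note, not taken from the paper: $y_{ij}$ is the current achievement of student $i$ in school $j$, $\mathbf{x}_{ij}$ and $\mathbf{w}_{ij}$ are covariate vectors (prior achievement and background characteristics), $u_j$ is the school location (value-added) effect and $v_j$ is a school scale effect.

```latex
\begin{align*}
y_{ij} &= \mathbf{x}_{ij}^{\top}\boldsymbol{\beta} + u_j + e_{ij},
  \qquad e_{ij} \sim N\!\left(0, \sigma^2_{e,ij}\right), \\
\log \sigma^2_{e,ij} &= \mathbf{w}_{ij}^{\top}\boldsymbol{\alpha} + v_j,
  \qquad (u_j, v_j)^{\top} \sim N(\mathbf{0}, \boldsymbol{\Sigma}).
\end{align*}
```

Restricting $\mathbf{w}_{ij}$ to an intercept and setting $v_j = 0$ gives a constant residual variance, which corresponds to the traditional random-intercept value-added model described in the abstract.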
-
Conceptualising Natural and Quasi Experiments in Public Health
Authors:
Frank de Vocht,
Srinivasa Vittal Katikireddi,
Cheryl McQuire,
Kate Tilling,
Matthew Hickman,
Peter Craig
Abstract:
Background: Natural or quasi experiments are appealing for public health research because they enable the evaluation of events or interventions that are difficult or impossible to manipulate experimentally, such as many policy and health system reforms. However, there remains ambiguity in the literature about their definition and how they differ from randomised controlled experiments and from other observational designs.
Methods: We conceptualise natural experiments in the context of public health evaluations, align the study design to the Target Trial Framework, and provide recommendations for improving their design and reporting.
Results: Natural experiment studies combine features of experiments and non-experiments. They differ from RCTs in that exposure allocation is not controlled by researchers, while they differ from other observational designs in that they evaluate the impact of event or exposure changes. As a result, they are, in theory, less susceptible to bias than other observational study designs. Importantly, the strength of causal inferences relies on the plausibility that the exposure allocation can be considered "as-if randomised". The Target Trial Framework provides a systematic basis for assessing the plausibility of such claims, and enables a structured method for assessing other design elements.
Conclusions: Natural experiment studies should be considered a distinct study design rather than a set of tools for analyses of non-randomised interventions. Alignment of natural experiments to the Target Trial Framework will clarify the strength of evidence underpinning claims about the effectiveness of public health interventions.
Submitted 6 August, 2020;
originally announced August 2020.
-
Framework for the Treatment And Reporting of Missing data in Observational Studies: The TARMOS framework
Authors:
Katherine J Lee,
Kate Tilling,
Rosie P Cornish,
Roderick JA Little,
Melanie L Bell,
Els Goetghebeur,
Joseph W Hogan,
James R Carpenter
Abstract:
Missing data are ubiquitous in medical research. Although there is increasing guidance on how to handle missing data, practice is changing slowly and misapprehensions abound, particularly in observational research. We present a practical framework for handling and reporting the analysis of incomplete data in observational studies, which we illustrate using a case study from the Avon Longitudinal Study of Parents and Children. The framework consists of three steps: 1) Develop an analysis plan specifying the analysis model and how missing data are going to be addressed. An important consideration is whether a complete records analysis is likely to be valid, whether multiple imputation or an alternative approach is likely to offer benefits, and whether a sensitivity analysis regarding the missingness mechanism is required. 2) Explore the data, checking the methods outlined in the analysis plan are appropriate, and conduct the pre-planned analysis. 3) Report the results, including a description of the missing data, details on how the missing data were addressed, and the results from all analyses, interpreted in light of the missing data and the clinical relevance. This framework seeks to support researchers in thinking systematically about missing data, and transparently reporting the potential effect on the study results.
Submitted 29 April, 2020;
originally announced April 2020.
-
Sample-constrained partial identification with application to selection bias
Authors:
Matthew Tudball,
Rachael Hughes,
Kate Tilling,
Jack Bowden,
Qingyuan Zhao
Abstract:
Many partial identification problems can be characterized by the optimal value of a function over a set where both the function and set need to be estimated by empirical data. Despite some progress for convex problems, statistical inference in this general setting remains to be developed. To address this, we derive an asymptotically valid confidence interval for the optimal value through an appropriate relaxation of the estimated set. We then apply this general result to the problem of selection bias in population-based cohort studies. We show that existing sensitivity analyses, which are often conservative and difficult to implement, can be formulated in our framework and made significantly more informative via auxiliary information on the population. We conduct a simulation study to evaluate the finite sample performance of our inference procedure and conclude with a substantive motivating example on the causal effect of education on income in the highly-selected UK Biobank cohort. We demonstrate that our method can produce informative bounds using plausible population-level auxiliary constraints. We implement this method in the R package selectioninterval.
Submitted 4 October, 2021; v1 submitted 24 June, 2019;
originally announced June 2019.