-
Towards more scientific meta-analyses
Authors:
Lily H. Zhang,
Menelaos Konstantinidis,
Marie-Abèle Bind,
Donald B. Rubin
Abstract:
Meta-analysis can be a critical part of the research process, often serving as the primary analysis on which the practitioners, policymakers, and individuals base their decisions. However, current literature synthesis approaches to meta-analysis typically estimate a different quantity than what is implicitly intended; concretely, standard approaches estimate the average effect of a treatment for a population of imperfect studies, rather than the true scientific effect that would be measured in a population of hypothetical perfect studies. We advocate for an alternative method, called response-surface meta-analysis, which models the relationship between the quality of the study design as predictor variables and its reported estimated effect size as the outcome variable in order to estimate the effect size obtained by the hypothetical ideal study. The idea was first introduced by Rubin several decades ago, and here we provide a practical implementation. First, we reintroduce the idea of response-surface meta-analysis, highlighting its focus on a scientifically-motivated estimand while proposing a straightforward implementation. Then we compare the approach to traditional meta-analysis techniques used in practice. We then implement response-surface meta-analysis and contrast its results with existing literature-synthesis approaches on both simulated data and a real-world example published by the Cochrane Collaboration. We conclude by detailing the primary challenges in the implementation of response-surface meta-analysis and offer some suggestions to tackle these challenges.
Submitted 25 August, 2023;
originally announced August 2023.
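As a rough illustration of the response-surface idea described in the abstract above, the following sketch regresses reported effect sizes on a single study-quality score and reads off the fitted value at the score of a hypothetical ideal study. The single quality score on a 0-to-1 scale, the linear surface, the inverse-variance weights, the function names, and the toy numbers are all assumptions made for this example; none of them is taken from the paper.

```python
def fit_response_surface(quality, effect, variance):
    """Weighted least-squares fit of effect = intercept + slope * quality.

    Weights are inverse variances (an assumption of this sketch).
    """
    weights = [1.0 / v for v in variance]
    total = sum(weights)
    q_bar = sum(w * q for w, q in zip(weights, quality)) / total
    y_bar = sum(w * y for w, y in zip(weights, effect)) / total
    s_qq = sum(w * (q - q_bar) ** 2 for w, q in zip(weights, quality))
    s_qy = sum(
        w * (q - q_bar) * (y - y_bar)
        for w, q, y in zip(weights, quality, effect)
    )
    slope = s_qy / s_qq
    intercept = y_bar - slope * q_bar
    return intercept, slope


def pooled_average(effect, variance):
    """Inverse-variance weighted mean, i.e. an average over the imperfect studies."""
    weights = [1.0 / v for v in variance]
    return sum(w * y for w, y in zip(weights, effect)) / sum(weights)


# Made-up toy data: three studies, quality 1.0 would be the ideal study.
quality = [0.2, 0.5, 0.8]
effect = [0.80, 0.65, 0.50]
variance = [0.04, 0.04, 0.04]

intercept, slope = fit_response_surface(quality, effect, variance)
ideal_estimate = intercept + slope * 1.0  # extrapolate to the ideal study
pooled = pooled_average(effect, variance)

print("pooled average over studies:", round(pooled, 3))
print("estimate at ideal quality:  ", round(ideal_estimate, 3))
```

In this toy setting the pooled average and the extrapolated estimate differ because lower-quality studies report larger effects, which is the kind of gap between the two estimands that the abstract points to.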
-
Estimating the Local Air Pollution Impacts of Maritime Traffic: A Principled Approach for Observational Data
Authors:
Léo Zabrocki,
Marion Leroutier,
Marie-Abèle Bind
Abstract:
We propose a new approach to estimate the causal effects of maritime traffic when natural or policy experiments are not available. We apply this method to the case of Marseille, a large Mediterranean port city, where air pollution emitted by cruise vessels is a growing concern. Using a recent matching algorithm designed for time series data, we create hypothetical randomized experiments to estimate the change in local air pollution caused by a short-term increase in cruise traffic. We then rely on randomization inference to compute nonparametric 95% uncertainty intervals. We find that cruise vessels' arrivals have large impacts on city-level hourly concentrations of nitrogen dioxide, particulate matter, and sulfur dioxide. At the daily level, however, road traffic seems to have a much larger impact than cruise traffic. Our procedure also helps assess in a transparent manner the identification challenges specific to this type of high-frequency time series data.
Submitted 5 August, 2022; v1 submitted 9 May, 2021;
originally announced May 2021.
-
Conceptualizing experimental controls using the potential outcomes framework
Authors:
Kristen B. Hunter,
Kristen Koenig,
Marie-Abèle Bind
Abstract:
The goal of a well-controlled study is to remove unwanted variation when estimating the causal effect of the intervention of interest. Experiments conducted in the basic sciences frequently achieve this goal using experimental controls, such as "negative" and "positive" controls, which are measurements designed to detect systematic sources of unwanted variation. Here, we introduce clear, mathematically precise definitions of experimental controls using potential outcomes. Our definitions provide a unifying statistical framework for fundamental concepts of experimental design from the biological and other basic sciences. These controls are defined in terms of whether assumptions are being made about a specific treatment level, outcome, or contrast between outcomes. We discuss experimental controls as tools for researchers to wield in designing experiments and detecting potential design flaws, including using controls to diagnose unintended factors that influence the outcome of interest, assess measurement error, and identify important subpopulations. We believe that experimental controls are powerful tools for reproducible research that are possibly underutilized by statisticians, epidemiologists, and social science researchers.
Submitted 20 April, 2021;
originally announced April 2021.
-
Addressing Spatially Structured Interference in Causal Analysis Using Propensity Scores
Authors:
Keith W. Zirkle,
Marie-Abele Bind,
Jenise L. Swall,
David C. Wheeler
Abstract:
Environmental epidemiologists are increasingly interested in establishing causality between exposures and health outcomes. A popular model for causal inference is the Rubin Causal Model (RCM), which typically seeks to estimate the average difference in study units' potential outcomes. An important assumption under RCM is no interference; that is, the potential outcomes of one unit are not affected by the exposure status of other units. The no interference assumption is violated if we expect spillover or diffusion of exposure effects based on units' proximity to other units, in which case several other causal estimands arise. Air pollution epidemiology typically violates this assumption when we expect upwind events to affect downwind or nearby locations. This paper adapts causal assumptions from social network research to address interference and allow estimation of both direct and spillover causal effects. We use propensity score-based methods to estimate these effects when considering the effects of the Environmental Protection Agency's 2005 nonattainment designations for particulate matter with aerodynamic diameter less than 2.5 micrometers (PM2.5) on lung cancer incidence using county-level data obtained from the Surveillance, Epidemiology, and End Results (SEER) Program. We compare these methods in a rigorous simulation study that considers spatially autocorrelated variables, interference, and missing confounders. We find that pruning and matching based on the propensity score produces the highest probability coverage of the true causal effects and lower mean squared error. When applied to the research question, we found protective direct and spillover causal effects.
Submitted 22 January, 2021;
originally announced January 2021.
-
Bayesian causal inference for count potential outcomes
Authors:
Young Lee,
Wicher P. Bergsma,
Marie-Abele C. Bind
Abstract:
The literature for count modeling provides useful tools to conduct causal inference when outcomes take non-negative integer values. Applied to the potential outcomes framework, we link the Bayesian causal inference literature to statistical models for count data. We discuss the general architectural considerations for constructing the predictive posterior of the missing potential outcomes. Special considerations for estimating average treatment effects are discussed, some generalizing certain relationships and some not yet encountered in the causal inference literature.
Submitted 7 August, 2020;
originally announced August 2020.
-
Causal Inference for Multiple Treatments using Fractional Factorial Designs
Authors:
Nicole E. Pashley,
Marie-Abele C. Bind
Abstract:
We consider the design and analysis of multi-factor experiments using fractional factorial and incomplete designs within the potential outcome framework. These designs are particularly useful when limited resources make running a full factorial design infeasible. We connect our design-based methods to standard regression methods. We further motivate the usefulness of these designs in multi-factor observational studies, where certain treatment combinations may be so rare that there are no measured outcomes in the observed data corresponding to them. Therefore, conceptualizing a hypothetical fractional factorial experiment instead of a full factorial experiment allows for appropriate analysis in those settings. We illustrate our approach using biomedical data from the 2003-2004 cycle of the National Health and Nutrition Examination Survey to examine the effects of four common pesticides on body mass index.
Submitted 28 January, 2022; v1 submitted 18 May, 2019;
originally announced May 2019.
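To make the notion of a fractional factorial design in the abstract above concrete, here is a minimal sketch that builds a half fraction of a two-level, four-factor design. The defining relation I = ABCD, the -1/+1 coding, and the function name are assumptions chosen for illustration; the abstract does not say which fraction or coding the paper uses.

```python
from itertools import product


def half_fraction_four_factors():
    """Return the 8 runs of a 2^(4-1) design with defining relation I = ABCD.

    Levels are coded -1 (low) and +1 (high). The fourth factor is set to the
    product of the first three, which halves the 16 runs of the full factorial.
    """
    runs = []
    for a, b, c in product((-1, 1), repeat=3):
        d = a * b * c  # generator D = ABC, an assumption of this sketch
        runs.append((a, b, c, d))
    return runs


design = half_fraction_four_factors()
for run in design:
    print(run)
```

Each factor appears at its high and low level equally often across the eight runs, so main effects can still be contrasted even though half of the treatment combinations are never run.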
-
The Role of Body Mass Index at Diagnosis on Black-White Disparities in Colorectal Cancer Survival: A Density Regression Mediation Approach
Authors:
Katrina L. Devick,
Linda Valeri,
Jarvis Chen,
Alejandro Jara,
Marie-Abèle Bind,
Brent A. Coull
Abstract:
The study of racial/ethnic inequalities in health is important to reduce the uneven burden of disease. In the case of colorectal cancer (CRC), disparities in survival among non-Hispanic Whites and Blacks are well documented, and mechanisms leading to these disparities need to be studied formally. It has also been established that body mass index (BMI) is a risk factor for developing CRC, and recent literature shows BMI at diagnosis of CRC is associated with survival. Since BMI varies by racial/ethnic group, a question that arises is whether disparities in BMI are partially responsible for observed racial/ethnic disparities in CRC survival. This paper presents new methodology to quantify the impact of the hypothetical intervention that matches the BMI distribution in the Black population to a potentially complex distributional form observed in the White population on racial/ethnic disparities in survival. We perform a simulation that shows our proposed Bayesian density regression approach performs as well as or better than current methodology allowing for a shift in the mean of the distribution only, and that standard practice of categorizing BMI leads to large biases. When applied to motivating data from the Cancer Care Outcomes Research and Surveillance (CanCORS) Consortium, our approach suggests the proposed intervention is potentially beneficial for elderly and low income Black patients, yet harmful for young and high income Black populations.
Submitted 16 November, 2018;
originally announced December 2018.
-
Bridging observational studies and randomized experiments by embedding the former in the latter
Authors:
Marie-Abele C. Bind,
Donald B. Rubin
Abstract:
The health effects of environmental exposures have been studied for decades, typically using standard regression models to assess exposure-outcome associations found in observational non-experimental data. We propose and illustrate a different approach to examine causal effects of environmental exposures on health outcomes from observational data. Our strategy attempts to structure the observational data to approximate data from a hypothetical, but realistic, randomized experiment. This approach, based on insights from classical experimental design, involves four stages, and relies on modern computing to implement the effort in two of the four stages. More specifically, our strategy involves: 1) a conceptual stage that involves the precise formulation of the causal question in terms of a hypothetical randomized experiment where the exposure is assigned to units; 2) a design stage that attempts to reconstruct (or approximate) a randomized experiment before any outcome data are observed; 3) a statistical analysis comparing the outcomes of interest in the exposed and non-exposed units of the hypothetical randomized experiment; and 4) a summary stage providing conclusions about statistical evidence for the sizes of possible causal effects of the exposure on outcomes. We illustrate our approach using an example examining the effect of parental smoking on children's lung function collected in families living in East Boston in the 1970s. To complement the traditional purely model-based approaches, our strategy, which includes outcome-free matched sampling, provides workable tools to quantify possible detrimental exposure effects on human health outcomes, especially because it also includes transparent diagnostics to assess the assumptions of the four-stage statistical approach being applied.
Submitted 18 September, 2017;
originally announced September 2017.
-
Randomization-based Inference for Bernoulli-Trial Experiments and Implications for Observational Studies
Authors:
Zach Branson,
Marie-Abele Bind
Abstract:
We present a randomization-based inferential framework for experiments characterized by a strongly ignorable assignment mechanism where units have independent probabilities of receiving treatment. Previous works on randomization tests often assume these probabilities are equal within blocks of units. We consider the general case where they differ across units and show how to perform randomization tests and obtain point estimates and confidence intervals. Furthermore, we develop a rejection-sampling algorithm to conduct randomization-based inference conditional on ancillary statistics, covariate balance, or other statistics of interest. Through simulation we demonstrate how our algorithm can yield powerful randomization tests and thus precise inference. Our work also has implications for observational studies, which commonly assume a strongly ignorable assignment mechanism. Most methodologies for observational studies make additional modeling or asymptotic assumptions, while our framework only assumes the strongly ignorable assignment mechanism, and thus can be considered a minimal-assumption approach.
Submitted 9 December, 2017; v1 submitted 13 July, 2017;
originally announced July 2017.
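The kind of randomization test described in the abstract above can be sketched as follows for a Bernoulli-trial assignment mechanism with unit-specific treatment probabilities. This is a minimal illustration under stated assumptions, not the authors' algorithm: the difference-in-means statistic, the sharp null of no effect for any unit, the function names, and the toy data are all choices made here. The only conditioning shown is a simple rejection step that discards redrawn assignments with an empty treatment or control group.

```python
import random


def diff_in_means(y, w):
    """Difference in mean outcomes, treated minus control; None if a group is empty."""
    treated = [yi for yi, wi in zip(y, w) if wi == 1]
    control = [yi for yi, wi in zip(y, w) if wi == 0]
    if not treated or not control:
        return None
    return sum(treated) / len(treated) - sum(control) / len(control)


def bernoulli_randomization_test(y, w, probs, n_draws=2000, seed=0,
                                 max_attempts=1_000_000):
    """Monte Carlo two-sided p-value under the sharp null of no effect.

    Under the sharp null the outcomes y are fixed, so only the assignment is
    redrawn: unit i is treated independently with probability probs[i].
    """
    rng = random.Random(seed)
    observed = diff_in_means(y, w)
    if observed is None:
        raise ValueError("observed assignment needs treated and control units")
    kept = 0
    extreme = 0
    for _ in range(max_attempts):
        if kept == n_draws:
            break
        draw = [1 if rng.random() < p else 0 for p in probs]
        stat = diff_in_means(y, draw)
        if stat is None:
            continue  # rejection step: keep only draws with both groups present
        kept += 1
        if abs(stat) >= abs(observed):
            extreme += 1
    if kept == 0:
        raise ValueError("no acceptable assignment was drawn")
    return observed, extreme / kept


# Made-up toy data: six units with unequal treatment probabilities.
y = [5, 6, 7, 1, 2, 3]
w = [1, 1, 1, 0, 0, 0]
probs = [0.7, 0.6, 0.5, 0.5, 0.4, 0.3]

observed, p_value = bernoulli_randomization_test(y, w, probs)
print("observed difference in means:", observed)
print("Monte Carlo p-value:", p_value)
```

Replacing the empty-group check with a check on covariate balance or another statistic would condition the reference distribution on that statistic instead, which is the general direction the abstract describes.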