-
Minimax optimal subgroup identification
Authors:
Matteo Bonvini,
Edward H. Kennedy,
Luke J. Keele
Abstract:
Quantifying treatment effect heterogeneity is a crucial task in many areas of causal inference, e.g. optimal treatment allocation and estimation of subgroup effects. We study the problem of estimating the level sets of the conditional average treatment effect (CATE), identified under the no-unmeasured-confounders assumption. Given a user-specified threshold, the goal is to estimate the set of all units for whom the treatment effect exceeds that threshold. For example, if the cutoff is zero, the estimand is the set of all units who would benefit from receiving treatment. Assigning treatment just to this set represents the optimal treatment rule that maximises the mean population outcome. Similarly, cutoffs greater than zero represent optimal rules under resource constraints. The level set estimator that we study follows the plug-in principle and consists of simply thresholding a good estimator of the CATE. While many CATE estimators have been recently proposed and analysed, how their properties relate to those of the corresponding level set estimators remains unclear. Our first goal is thus to fill this gap by deriving the asymptotic properties of level set estimators depending on which estimator of the CATE is used. Next, we identify a minimax optimal estimator in a model where the CATE, the propensity score and the outcome model are Hölder-smooth of varying orders. We consider data generating processes that satisfy a margin condition governing the probability of observing units for whom the CATE is close to the threshold. We investigate the performance of the estimators in simulations and illustrate our methods on a dataset used to study the effects on mortality of laparoscopic vs open surgery in the treatment of various conditions of the colon.
Submitted 30 June, 2023;
originally announced June 2023.
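The plug-in principle described in the abstract above (threshold an estimate of the CATE) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names, the simulated data, and the CATE estimator itself, which is a simple difference of two linear regressions and not the minimax optimal estimator studied in the paper.

```python
import numpy as np

def fit_linear(X, y):
    """Least-squares fit with an intercept; returns the coefficient vector."""
    design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

def predict_linear(coef, X):
    """Predictions from coefficients produced by fit_linear."""
    design = np.column_stack([np.ones(len(X)), X])
    return design @ coef

def plug_in_level_set(X, treated, y, threshold=0.0):
    """Return a boolean mask of units whose estimated CATE exceeds `threshold`.

    Illustrative CATE estimator (an assumption, not the paper's):
    fit separate linear outcome models in the treated and control arms
    and take the difference of their predictions.
    """
    treated = np.asarray(treated, dtype=bool)
    coef_treated = fit_linear(X[treated], y[treated])
    coef_control = fit_linear(X[~treated], y[~treated])
    cate_hat = predict_linear(coef_treated, X) - predict_linear(coef_control, X)
    return cate_hat > threshold, cate_hat

# Simulated data (hypothetical): the true CATE equals the single covariate x,
# so the true level set at threshold 0 is {x > 0}.
rng = np.random.default_rng(0)
n = 2000
X = rng.uniform(-1, 1, size=(n, 1))
treated = rng.random(n) < 0.5
true_cate = X[:, 0]
y = treated * true_cate + rng.normal(scale=0.1, size=n)

in_set, cate_hat = plug_in_level_set(X, treated, y, threshold=0.0)
agreement = np.mean(in_set == (true_cate > 0))
print(f"share of units classified consistently with the true level set: {agreement:.3f}")
```

Misclassified units in this toy setting lie where the true CATE is close to the threshold, which is the region that the margin condition mentioned in the abstract controls.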
-
A note on post-treatment selection in studying racial discrimination in policing
Authors:
Qingyuan Zhao,
Luke J Keele,
Dylan S Small,
Marshall M Joffe
Abstract:
We discuss some causal estimands used to study racial discrimination in policing. A central challenge is that not all police-civilian encounters are recorded in administrative datasets and available to researchers. One possible solution is to consider the average causal effect of race conditional on the civilian already being detained by the police. We find that such an estimand can be quite different from the more familiar ones in causal inference and needs to be interpreted with caution. We propose using an estimand that is new in this context, the causal risk ratio, which has a more transparent interpretation and requires weaker identification assumptions. We demonstrate this through a reanalysis of the NYPD Stop-and-Frisk dataset. Our reanalysis shows that the naive estimator that ignores the post-treatment selection in administrative records may severely underestimate the disparity in police violence between minorities and whites in these and similar data.
Submitted 14 June, 2021; v1 submitted 10 September, 2020;
originally announced September 2020.
-
Biased Encouragements and Heterogeneous Effects in an Instrumental Variable Study of Emergency General Surgical Outcomes
Authors:
Colin B. Fogarty,
Kwonsang Lee,
Rachel R. Kelz,
Luke J. Keele
Abstract:
We investigate the efficacy of surgical versus non-surgical management for two gastrointestinal conditions, colitis and diverticulitis, using observational data. We deploy an instrumental variable design with surgeons' tendencies to operate as an instrument. Assuming instrument validity, we find that non-surgical alternatives can reduce both hospital length of stay and the risk of complications, with estimated effects larger for septic patients than for non-septic patients. The validity of our instrument is plausible but not ironclad, necessitating a sensitivity analysis. Existing sensitivity analyses for IV designs assume effect homogeneity, which is unlikely to hold here because of patient-specific physiology. We develop a new sensitivity analysis that accommodates arbitrary effect heterogeneity and exploits components explainable by observed features. We find that the results for non-septic patients prove more robust to hidden bias despite having smaller estimated effects. For non-septic patients, two individuals with identical observed characteristics would have to differ in their odds of assignment to a high-tendency-to-operate surgeon by a factor of 2.34 to overturn our finding of a benefit for non-surgical management in reducing length of stay. For septic patients, this value is only 1.64. Simulations illustrate that this phenomenon may be explained by differences in within-group heterogeneity.
Submitted 9 December, 2020; v1 submitted 20 September, 2019;
originally announced September 2019.
-
Patterns of Effects and Sensitivity Analysis for Differences-in-Differences
Authors:
Luke J. Keele,
Dylan S. Small,
Jesse Y. Hsu,
Colin B. Fogarty
Abstract:
Applied analysts often use the differences-in-differences (DID) method to estimate the causal effect of policy interventions with observational data. The method is widely used, as the required before and after comparison of a treated and control group is commonly encountered in practice. DID removes bias from unobserved time-invariant confounders, but bias from time-varying confounders may still be present. Hence, like any observational comparison, DID studies remain susceptible to bias from hidden confounders. Here, we develop a method of sensitivity analysis that allows investigators to quantify the amount of bias necessary to change a study's conclusions. Our method operates within a matched design that removes bias from observed baseline covariates. We develop methods for both binary and continuous outcomes. We then apply our methods to two different empirical examples from the social sciences. In the first application, we study the effect of changes to disability payments in Germany. In the second, we re-examine whether election day registration increased turnout in Wisconsin.
Submitted 1 February, 2019; v1 submitted 7 January, 2019;
originally announced January 2019.
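The before and after comparison of a treated and a control group that the abstract above refers to can be sketched as a basic two-group, two-period DID contrast. This is only the textbook difference of mean changes, under the assumption of a simple 2x2 layout; it is not the matched design or the sensitivity analysis developed in the paper, and the function name and toy numbers are hypothetical.

```python
import numpy as np

def did_estimate(y, treated, post):
    """Basic 2x2 differences-in-differences estimate.

    (mean change over time in the treated group)
    minus (mean change over time in the control group).
    """
    y = np.asarray(y, dtype=float)
    treated = np.asarray(treated, dtype=bool)
    post = np.asarray(post, dtype=bool)
    change_treated = y[treated & post].mean() - y[treated & ~post].mean()
    change_control = y[~treated & post].mean() - y[~treated & ~post].mean()
    return change_treated - change_control

# Toy data (hypothetical): one observation per group-period cell.
y = [1.0, 2.0, 3.0, 7.0]
treated = [0, 0, 1, 1]
post = [0, 1, 0, 1]

# Control changes by 2 - 1 = 1, treated changes by 7 - 3 = 4, so DID = 3.
print(did_estimate(y, treated, post))
```

Subtracting the control group's change removes any confounder that shifts both groups by the same constant amount, which is why time-invariant confounding drops out while time-varying confounding does not.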
-
Survivor-complier effects in the presence of selection on treatment, with application to a study of prompt ICU admission
Authors:
Edward H. Kennedy,
Steve Harris,
Luke J. Keele
Abstract:
Pre-treatment selection or censoring ('selection on treatment') can occur when two treatment levels are compared ignoring the third option of neither treatment, in 'censoring by death' settings where treatment is only defined for those who survive long enough to receive it, or in general in studies where the treatment is only defined for a subset of the population. Unfortunately, the standard instrumental variable (IV) estimand is not defined in the presence of such selection, so we consider estimating a new survivor-complier causal effect. Although this effect is generally not identified under standard IV assumptions, it is possible to construct sharp bounds. We derive these bounds and give a corresponding data-driven sensitivity analysis, along with nonparametric yet efficient estimation methods. Importantly, our approach allows for high-dimensional confounding adjustment, and valid inference even after employing machine learning. Incorporating covariates can tighten bounds dramatically, especially when they are strong predictors of the selection process. We apply the methods in a UK cohort study of critical care patients to examine the mortality effects of prompt admission to the intensive care unit, using ICU bed availability as an instrument.
Submitted 18 September, 2017; v1 submitted 19 April, 2017;
originally announced April 2017.