-
Treatment heterogeneity with right-censored outcomes using grf
Authors:
Erik Sverdrup,
Stefan Wager
Abstract:
This article walks through how to estimate conditional average treatment effects (CATEs) with right-censored time-to-event outcomes using the function causal_survival_forest (Cui et al., 2023) in the R package grf (Athey et al., 2019, Tibshirani et al., 2024), illustrated with data from the National Job Training Partnership Act.
Submitted 25 February, 2024; v1 submitted 4 December, 2023;
originally announced December 2023.
-
Qini Curves for Multi-Armed Treatment Rules
Authors:
Erik Sverdrup,
Han Wu,
Susan Athey,
Stefan Wager
Abstract:
Qini curves have emerged as an attractive and popular approach for evaluating the benefit of data-driven targeting rules for treatment allocation. We propose a generalization of the Qini curve to multiple costly treatment arms that quantifies the value of optimally selecting among both units and treatment arms at different budget levels. We develop an efficient algorithm for computing these curves and propose bootstrap-based confidence intervals that are exact in large samples for any point on the curve. These confidence intervals can be used to conduct hypothesis tests comparing the value of treatment targeting using an optimal combination of arms with using just a subset of arms, or with a non-targeting assignment rule ignoring covariates, at different budget levels. We demonstrate the statistical performance in a simulation experiment and an application to treatment targeting for election turnout.
Submitted 23 April, 2024; v1 submitted 20 June, 2023;
originally announced June 2023.
-
Proximal Causal Learning of Conditional Average Treatment Effects
Authors:
Erik Sverdrup,
Yifan Cui
Abstract:
Efficiently and flexibly estimating treatment effect heterogeneity is an important task in a wide variety of settings ranging from medicine to marketing, and there are a considerable number of promising conditional average treatment effect estimators currently available. These, however, typically rely on the assumption that the measured covariates are enough to justify conditional exchangeability. We propose the P-learner, motivated by the R- and DR-learner, a tailored two-stage loss function for learning heterogeneous treatment effects in settings where exchangeability given observed covariates is an implausible assumption, and we wish to rely on proxy variables for causal inference. Our proposed estimator can be implemented by off-the-shelf loss-minimizing machine learning methods, which in the case of kernel regression satisfies an oracle bound on the estimated error as long as the nuisance components are estimated reasonably well.
Submitted 9 May, 2023; v1 submitted 25 January, 2023;
originally announced January 2023.
-
Treatment Heterogeneity for Survival Outcomes
Authors:
Yizhe Xu,
Nikolaos Ignatiadis,
Erik Sverdrup,
Scott Fleming,
Stefan Wager,
Nigam Shah
Abstract:
Estimation of conditional average treatment effects (CATEs) plays an essential role in modern medicine by informing treatment decision-making at a patient level. Several metalearners have been proposed recently to estimate CATEs in an effective and flexible way by re-purposing predictive machine learning models for causal estimation. In this chapter, we summarize the literature on metalearners and provide concrete guidance for their application for treatment heterogeneity estimation from randomized controlled trials' data with survival outcomes. The guidance we provide is supported by a comprehensive simulation study in which we vary the complexity of the underlying baseline risk and CATE functions, the magnitude of the heterogeneity in the treatment effect, the censoring mechanism, and the balance in treatment assignment. To demonstrate the applicability of our findings, we reanalyze the data from the Systolic Blood Pressure Intervention Trial (SPRINT) and the Action to Control Cardiovascular Risk in Diabetes (ACCORD) study. While recent literature reports the existence of heterogeneous effects of intensive blood pressure treatment with multiple treatment effect modifiers, our results suggest that many of these modifiers may be spurious discoveries. This chapter is accompanied by survlearners, an R package that provides well-documented implementations of the CATE estimation strategies described in this work, to allow easy use of our recommendations as well as the reproduction of our numerical study.
Submitted 6 September, 2022; v1 submitted 15 July, 2022;
originally announced July 2022.
-
What Makes Forest-Based Heterogeneous Treatment Effect Estimators Work?
Authors:
Susanne Dandl,
Torsten Hothorn,
Heidi Seibold,
Erik Sverdrup,
Stefan Wager,
Achim Zeileis
Abstract:
Estimation of heterogeneous treatment effects (HTE) is of prime importance in many disciplines, ranging from personalized medicine to economics among many others. Random forests have been shown to be a flexible and powerful approach to HTE estimation in both randomized trials and observational studies. In particular "causal forests", introduced by Athey, Tibshirani and Wager (2019), along with the R implementation in package grf were rapidly adopted. A related approach, called "model-based forests", that is geared towards randomized trials and simultaneously captures effects of both prognostic and predictive variables, was introduced by Seibold, Zeileis and Hothorn (2018) along with a modular implementation in the R package model4you.
Here, we present a unifying view that goes beyond the theoretical motivations and investigates which computational elements make causal forests so successful and how these can be blended with the strengths of model-based forests. To do so, we show that both methods can be understood in terms of the same parameters and model assumptions for an additive model under L2 loss. This theoretical insight allows us to implement several flavors of "model-based causal forests" and dissect their different elements in silico.
The original causal forests and model-based forests are compared with the new blended versions in a benchmark study exploring both randomized trials and observational settings. In the randomized setting, both approaches performed similarly. If confounding was present in the data generating process, we found local centering of the treatment indicator with the corresponding propensities to be the main driver for good performance. Local centering of the outcome was less important, and might be replaced or enhanced by simultaneous split selection with respect to both prognostic and predictive effects.
Submitted 20 December, 2023; v1 submitted 21 June, 2022;
originally announced June 2022.
-
Estimating heterogeneous treatment effects with right-censored data via causal survival forests
Authors:
Yifan Cui,
Michael R. Kosorok,
Erik Sverdrup,
Stefan Wager,
Ruoqing Zhu
Abstract:
Forest-based methods have recently gained in popularity for non-parametric treatment effect estimation. Building on this line of work, we introduce causal survival forests, which can be used to estimate heterogeneous treatment effects in a survival and observational setting where outcomes may be right-censored. Our approach relies on orthogonal estimating equations to robustly adjust for both censoring and selection effects under unconfoundedness. In our experiments, we find our approach to perform well relative to a number of baselines.
Submitted 28 February, 2023; v1 submitted 27 January, 2020;
originally announced January 2020.
-
Doubly robust treatment effect estimation with missing attributes
Authors:
Imke Mayer,
Erik Sverdrup,
Tobias Gauss,
Jean-Denis Moyer,
Stefan Wager,
Julie Josse
Abstract:
Missing attributes are ubiquitous in causal inference, as they are in most applied statistical work. In this paper, we consider various sets of assumptions under which causal inference is possible despite missing attributes and discuss corresponding approaches to average treatment effect estimation, including generalized propensity score methods and multiple imputation. Across an extensive simulation study, we show that no single method systematically out-performs others. We find, however, that doubly robust modifications of standard methods for average treatment effect estimation with missing data repeatedly perform better than their non-doubly robust baselines; for example, doubly robust generalized propensity score methods beat inverse-weighting with the generalized propensity score. This finding is reinforced in an analysis of an observational study on the effect on mortality of tranexamic acid administration among patients with traumatic brain injury in the context of critical care management. Here, doubly robust estimators recover confidence intervals that are consistent with evidence from randomized trials, whereas non-doubly robust estimators do not.
Submitted 22 May, 2020; v1 submitted 23 October, 2019;
originally announced October 2019.