-
LaLonde (1986) after Nearly Four Decades: Lessons Learned
Authors:
Guido Imbens,
Yiqing Xu
Abstract:
In 1986, Robert LaLonde published an article that compared nonexperimental estimates to experimental benchmarks (LaLonde 1986). He concluded that the nonexperimental methods at the time could not systematically replicate experimental benchmarks, casting doubt on the credibility of these methods. Following LaLonde's critical assessment, there have been significant methodological advances and practical changes, including (i) an emphasis on estimators based on unconfoundedness, (ii) a focus on the importance of overlap in covariate distributions, (iii) the introduction of propensity score-based methods leading to doubly robust estimators, (iv) a greater emphasis on validation exercises to bolster research credibility, and (v) methods for estimating and exploiting treatment effect heterogeneity. To demonstrate the practical lessons from these advances, we reexamine the LaLonde data and the Imbens-Rubin-Sacerdote lottery data. We show that modern methods, when applied in contexts with sufficient covariate overlap, yield robust estimates for the adjusted differences between the treatment and control groups. However, this does not mean that these estimates are valid. To assess their credibility, validation exercises (such as placebo tests) are essential, whereas goodness of fit tests alone are inadequate. Our findings highlight the importance of closely examining the assignment process, carefully inspecting overlap, and conducting validation exercises when analyzing causal effects with nonexperimental data.
Submitted 8 June, 2024; v1 submitted 2 June, 2024;
originally announced June 2024.
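
The abstract above mentions propensity score-based methods leading to doubly robust estimators. As a minimal sketch (not the authors' code), the function below computes the standard augmented inverse propensity weighted (AIPW) estimate of an average treatment effect. It assumes the propensity scores and the two outcome-model predictions have already been fitted elsewhere; the name `aipw_ate` and all argument names are hypothetical.

```python
def aipw_ate(y, w, e_hat, mu1_hat, mu0_hat):
    """AIPW estimate of the average treatment effect.

    y        observed outcomes
    w        treatment indicators (1 = treated, 0 = control)
    e_hat    estimated propensity scores P(W = 1 | X)   (assumed given)
    mu1_hat  predicted outcomes under treatment         (assumed given)
    mu0_hat  predicted outcomes under control           (assumed given)
    """
    n = len(y)
    if n == 0:
        raise ValueError("need at least one unit")
    total = 0.0
    for yi, wi, ei, m1, m0 in zip(y, w, e_hat, mu1_hat, mu0_hat):
        # Overlap: every unit needs a propensity score strictly inside (0, 1).
        if not 0.0 < ei < 1.0:
            raise ValueError("propensity scores must lie strictly in (0, 1)")
        # Outcome-model contrast plus inverse-propensity-weighted residuals.
        total += (m1 - m0) + wi * (yi - m1) / ei - (1 - wi) * (yi - m0) / (1.0 - ei)
    return total / n


# Toy data, made up for illustration only.
y = [3.0, 1.0, 4.0, 2.0]
w = [1, 0, 1, 0]
e_hat = [0.5, 0.5, 0.5, 0.5]

# Outcome model that fits the observed outcomes exactly: residual terms vanish.
ate_with_model = aipw_ate(y, w, e_hat, [3.0, 2.0, 4.0, 3.0], [2.0, 1.0, 3.0, 2.0])

# Uninformative outcome model (all zeros): the estimate reduces to pure IPW.
ate_ipw_only = aipw_ate(y, w, e_hat, [0.0] * 4, [0.0] * 4)
```

The estimator is called doubly robust because it remains consistent if either the propensity model or the outcome model is correctly specified. The explicit check on `e_hat` reflects the abstract's emphasis on inspecting overlap before trusting any adjusted estimate.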
-
Multiple Randomization Designs: Estimation and Inference with Interference
Authors:
Lorenzo Masoero,
Suhas Vijaykumar,
Thomas Richardson,
James McQueen,
Ido Rosen,
Brian Burdick,
Pat Bajari,
Guido Imbens
Abstract:
Classical designs of randomized experiments, going back to Fisher and Neyman in the 1930s, still dominate practice even in online experimentation. However, such designs are of limited value for answering standard questions in settings, common in marketplaces, where multiple populations of agents interact strategically, leading to complex patterns of spillover effects. In this paper, we discuss new experimental designs and corresponding estimands to account for and capture these complex spillovers. We derive the finite-sample properties of tractable estimators for main effects, direct effects, and spillovers, and present associated central limit theorems.
Submitted 2 January, 2024;
originally announced January 2024.
-
Identification and Inference for Synthetic Controls with Confounding
Authors:
Guido W. Imbens,
Davide Viviano
Abstract:
This paper studies inference on treatment effects in panel data settings with unobserved confounding. We model outcome variables through a factor model with random factors and loadings. Such factors and loadings may act as unobserved confounders: when the treatment is implemented depends on time-varying factors, and who receives the treatment depends on unit-level confounders. We study the identification of treatment effects and illustrate the presence of a trade-off between time and unit-level confounding. We provide asymptotic results for inference for several Synthetic Control estimators and show that different sources of randomness should be considered for inference, depending on the nature of confounding. We conclude with a comparison of Synthetic Control estimators with alternatives for factor models.
Submitted 1 December, 2023;
originally announced December 2023.
-
Causal Models for Longitudinal and Panel Data: A Survey
Authors:
Dmitry Arkhangelsky,
Guido Imbens
Abstract:
In this survey we discuss the recent causal panel data literature. This recent literature has focused on credibly estimating causal effects of binary interventions in settings with longitudinal data, emphasizing practical advice for empirical researchers. It pays particular attention to heterogeneity in the causal effects, often in situations where few units are treated and with particular structures on the assignment pattern. The literature has extended earlier work on difference-in-differences or two-way-fixed-effect estimators. It has more generally incorporated factor models or interactive fixed effects. It has also developed novel methods using synthetic control approaches.
Submitted 25 June, 2024; v1 submitted 26 November, 2023;
originally announced November 2023.
-
Causal clustering: design of cluster experiments under network interference
Authors:
Davide Viviano,
Lihua Lei,
Guido Imbens,
Brian Karrer,
Okke Schrijvers,
Liang Shi
Abstract:
This paper studies the design of cluster experiments to estimate the global treatment effect in the presence of network spillovers. We provide a framework to choose the clustering that minimizes the worst-case mean-squared error of the estimated global effect. We show that optimal clustering solves a novel penalized min-cut optimization problem computed via off-the-shelf semi-definite programming algorithms. Our analysis also characterizes simple conditions to choose between any two cluster designs, including choosing between a cluster or individual-level randomization. We illustrate the method's properties using unique network data from the universe of Facebook's users and existing data from a field experiment.
Submitted 13 January, 2024; v1 submitted 23 October, 2023;
originally announced October 2023.
-
Estimating the Value of Evidence-Based Decision Making
Authors:
Alberto Abadie,
Anish Agarwal,
Guido Imbens,
Siwei Jia,
James McQueen,
Serguei Stepaniants
Abstract:
Business/policy decisions are often based on evidence from randomized experiments and observational studies. In this article we propose an empirical framework to estimate the value of evidence-based decision making (EBDM) and the return on the investment in statistical precision.
Submitted 9 September, 2023; v1 submitted 21 June, 2023;
originally announced June 2023.
-
Double and Single Descent in Causal Inference with an Application to High-Dimensional Synthetic Control
Authors:
Jann Spiess,
Guido Imbens,
Amar Venugopal
Abstract:
Motivated by a recent literature on the double-descent phenomenon in machine learning, we consider highly over-parameterized models in causal inference, including synthetic control with many control units. In such models, there may be so many free parameters that the model fits the training data perfectly. We first investigate high-dimensional linear regression for imputing wage data and estimating average treatment effects, where we find that models with many more covariates than sample size can outperform simple ones. We then document the performance of high-dimensional synthetic control estimators with many control units. We find that adding control units can help improve imputation performance even beyond the point where the pre-treatment fit is perfect. We provide a unified theoretical perspective on the performance of these high-dimensional models. Specifically, we show that more complex models can be interpreted as model-averaging estimators over simpler ones, which we link to an improvement in average performance. This perspective yields concrete insights into the use of synthetic control when control units are many relative to the number of pre-treatment periods.
Submitted 12 October, 2023; v1 submitted 1 May, 2023;
originally announced May 2023.
-
Synthetic Difference In Differences Estimation
Authors:
Damian Clarke,
Daniel Pailañir,
Susan Athey,
Guido Imbens
Abstract:
In this paper, we describe a computational implementation of the Synthetic difference-in-differences (SDID) estimator of Arkhangelsky et al. (2021) for Stata. Synthetic difference-in-differences can be used in a wide class of circumstances where treatment effects on some particular policy or event are desired, and repeated observations on treated and untreated units are available over time. We lay out the theory underlying SDID, both when there is a single treatment adoption date and when adoption is staggered over time, and discuss estimation and inference in each of these cases. We introduce the sdid command which implements these methods in Stata, and provide a number of examples of use, discussing estimation, inference, and visualization of results.
Submitted 13 February, 2023; v1 submitted 27 January, 2023;
originally announced January 2023.
-
Long-term Causal Inference Under Persistent Confounding via Data Combination
Authors:
Guido Imbens,
Nathan Kallus,
Xiaojie Mao,
Yuhao Wang
Abstract:
We study the identification and estimation of long-term treatment effects when both experimental and observational data are available. Since the long-term outcome is observed only after a long delay, it is not measured in the experimental data, but only recorded in the observational data. However, both types of data include observations of some short-term outcomes. In this paper, we uniquely tackle the challenge of persistent unmeasured confounders, i.e., some unmeasured confounders that can simultaneously affect the treatment, short-term outcomes and the long-term outcome, noting that they invalidate identification strategies in previous literature. To address this challenge, we exploit the sequential structure of multiple short-term outcomes, and develop three novel identification strategies for the average long-term treatment effect. We further propose three corresponding estimators and prove their asymptotic consistency and asymptotic normality. We finally apply our methods to estimate the effect of a job training program on long-term employment using semi-synthetic data. We numerically show that our proposals outperform existing methods that fail to handle persistent confounders.
Submitted 14 May, 2024; v1 submitted 15 February, 2022;
originally announced February 2022.
-
Multiple Randomization Designs
Authors:
Patrick Bajari,
Brian Burdick,
Guido W. Imbens,
Lorenzo Masoero,
James McQueen,
Thomas Richardson,
Ido M. Rosen
Abstract:
In this study we introduce a new class of experimental designs. In a classical randomized controlled trial (RCT), or A/B test, a randomly selected subset of a population of units (e.g., individuals, plots of land, or experiences) is assigned to a treatment (treatment A), and the remainder of the population is assigned to the control treatment (treatment B). The difference in average outcome by treatment group is an estimate of the average effect of the treatment. However, motivating our study, the setting for modern experiments is often different, with the outcomes and treatment assignments indexed by multiple populations. For example, outcomes may be indexed by buyers and sellers, by content creators and subscribers, by drivers and riders, or by travelers and airlines and travel agents, with treatments potentially varying across these indices. Spillovers or interference can arise from interactions between units across populations. For example, sellers' behavior may depend on buyers' treatment assignment, or vice versa. This can invalidate the simple comparison of means as an estimator for the average effect of the treatment in classical RCTs. We propose new experiment designs for settings in which multiple populations interact. We show how these designs allow us to study questions about interference that cannot be answered by classical randomized experiments. Finally, we develop new statistical methods for analyzing these Multiple Randomization Designs.
Submitted 26 December, 2021;
originally announced December 2021.
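
The abstract above describes the classical A/B-test estimator as the difference in average outcome by treatment group. A minimal sketch of that comparison follows; the function name `difference_in_means` and the toy data are assumptions for illustration, and the sketch covers only the single-population case without interference, not the multiple randomization designs the paper proposes.

```python
from statistics import fmean


def difference_in_means(outcomes, treated):
    """Mean outcome of treated units minus mean outcome of control units."""
    treated_y = [y for y, t in zip(outcomes, treated) if t]
    control_y = [y for y, t in zip(outcomes, treated) if not t]
    if not treated_y or not control_y:
        raise ValueError("need at least one treated and one control unit")
    return fmean(treated_y) - fmean(control_y)


# Toy data, made up for illustration only.
outcomes = [5.0, 7.0, 6.0, 2.0, 3.0, 4.0]
treated = [1, 1, 1, 0, 0, 0]
estimate = difference_in_means(outcomes, treated)  # 6.0 - 3.0
```

When one unit's outcome depends on another unit's assignment, as with buyers and sellers in the abstract's example, this simple comparison no longer estimates the average effect of the treatment.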
-
Covariate Balancing Sensitivity Analysis for Extrapolating Randomized Trials across Locations
Authors:
Xinkun Nie,
Guido Imbens,
Stefan Wager
Abstract:
The ability to generalize experimental results from randomized control trials (RCTs) across locations is crucial for informing policy decisions in targeted regions. Such generalization is often hindered by the lack of identifiability due to unmeasured effect modifiers that compromise direct transport of treatment effect estimates from one location to another. We build upon sensitivity analysis in observational studies and propose an optimization procedure that allows us to get bounds on the treatment effects in targeted regions. Furthermore, we construct more informative bounds by balancing on the moments of covariates. In simulation experiments, we show that the covariate balancing approach is promising in getting sharper identification intervals.
Submitted 9 December, 2021;
originally announced December 2021.
-
Synthetic Design: An Optimization Approach to Experimental Design with Synthetic Controls
Authors:
Nick Doudchenko,
Khashayar Khosravi,
Jean Pouget-Abadie,
Sebastien Lahaie,
Miles Lubin,
Vahab Mirrokni,
Jann Spiess,
Guido Imbens
Abstract:
We investigate the optimal design of experimental studies that have pre-treatment outcome data available. The average treatment effect is estimated as the difference between the weighted average outcomes of the treated and control units. A number of commonly used approaches fit this formulation, including the difference-in-means estimator and a variety of synthetic-control techniques. We propose several methods for choosing the set of treated units in conjunction with the weights. Observing the NP-hardness of the problem, we introduce a mixed-integer programming formulation which selects both the treatment and control sets and unit weightings. We prove that these proposed approaches lead to qualitatively different experimental units being selected for treatment. We use simulations based on publicly available data from the US Bureau of Labor Statistics that show improvements in terms of mean squared error and statistical power when compared to simple and commonly used alternatives such as randomized trials.
Submitted 1 December, 2021;
originally announced December 2021.
-
Semiparametric Estimation of Treatment Effects in Randomized Experiments
Authors:
Susan Athey,
Peter J. Bickel,
Aiyou Chen,
Guido W. Imbens,
Michael Pollmann
Abstract:
We develop new semiparametric methods for estimating treatment effects. We focus on settings where the outcome distributions may be thick tailed, where treatment effects may be small, where sample sizes are large and where assignment is completely random. This setting is of particular interest in recent online experimentation. We propose using parametric models for the treatment effects, leading to semiparametric models for the outcome distributions. We derive the semiparametric efficiency bound for the treatment effects for this setting, and propose efficient estimators. In the leading case with constant quantile treatment effects one of the proposed efficient estimators has an interesting interpretation as a weighted average of quantile treatment effects, with the weights proportional to minus the second derivative of the log of the density of the potential outcomes. Our analysis also suggests an extension of Huber's model and trimmed mean to include asymmetry.
Submitted 22 August, 2023; v1 submitted 6 September, 2021;
originally announced September 2021.
-
Controlling for Unmeasured Confounding in Panel Data Using Minimal Bridge Functions: From Two-Way Fixed Effects to Factor Models
Authors:
Guido Imbens,
Nathan Kallus,
Xiaojie Mao
Abstract:
We develop a new approach for identifying and estimating average causal effects in panel data under a linear factor model with unmeasured confounders. Compared to other methods tackling factor models such as synthetic controls and matrix completion, our method does not require the number of time periods to grow infinitely. Instead, we draw inspiration from the two-way fixed effect model as a special case of the linear factor model, where a simple difference-in-differences transformation identifies the effect. We show that analogous, albeit more complex, transformations exist in the more general linear factor model, providing a new means to identify the effect in that model. In fact many such transformations exist, called bridge functions, all identifying the same causal effect estimand. This poses a unique challenge for estimation and inference, which we solve by targeting the minimal bridge function using a regularized estimation approach. We prove that our resulting average causal effect estimator is root-N consistent and asymptotically normal, and we provide asymptotically valid confidence intervals. Finally, we provide extensions for the case of a linear factor model with time-varying unmeasured confounders.
Submitted 9 August, 2021;
originally announced August 2021.
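
The abstract above notes that under the two-way fixed effect model a simple difference-in-differences transformation identifies the effect. The following worked equation sketches why, in the simplest two-group, two-period case. The notation ($\alpha_i$, $\beta_t$, $\tau$, $W_{it}$) is an assumption chosen for illustration and is not taken from the paper.

```latex
% Assumed model: unit effects \alpha_i, time effects \beta_t, constant effect \tau,
% treatment indicator W_{it}, and mean-zero noise \varepsilon_{it}.
\[
  Y_{it} = \alpha_i + \beta_t + \tau W_{it} + \varepsilon_{it},
  \qquad \mathbb{E}[\varepsilon_{it} \mid \alpha_i, W_{i1}, W_{i2}] = 0 .
\]
% Treated units have W_{i1} = 0, W_{i2} = 1; control units have W_{i1} = W_{i2} = 0.
\begin{align*}
  \mathbb{E}[Y_{i2} - Y_{i1} \mid \text{treated}] &= (\beta_2 - \beta_1) + \tau, \\
  \mathbb{E}[Y_{i2} - Y_{i1} \mid \text{control}] &= \beta_2 - \beta_1, \\
  \mathbb{E}[Y_{i2} - Y_{i1} \mid \text{treated}]
    - \mathbb{E}[Y_{i2} - Y_{i1} \mid \text{control}] &= \tau .
\end{align*}
```

Differencing over time removes the unit effects $\alpha_i$, and differencing across groups removes the time effects $\beta_t$, leaving only $\tau$.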
-
Design-Robust Two-Way-Fixed-Effects Regression For Panel Data
Authors:
Dmitry Arkhangelsky,
Guido W. Imbens,
Lihua Lei,
Xiaoman Luo
Abstract:
We propose a new estimator for average causal effects of a binary treatment with panel data in settings with general treatment patterns. Our approach augments the popular two-way-fixed-effects specification with unit-specific weights that arise from a model for the assignment mechanism. We show how to construct these weights in various settings, including the staggered adoption setting, where units opt into the treatment sequentially but permanently. The resulting estimator converges to an average (over units and time) treatment effect under the correct specification of the assignment model, even if the fixed effect model is misspecified. We show that our estimator is more robust than the conventional two-way estimator: it remains consistent if either the assignment mechanism or the two-way regression model is correctly specified. In addition, the proposed estimator performs better than the two-way-fixed-effect estimator if the outcome model and assignment mechanism are locally misspecified. This strong double robustness property underlines and quantifies the benefits of modeling the assignment process and motivates using our estimator in practice. We also discuss an extension of our estimator to handle dynamic treatment effects.
Submitted 4 March, 2024; v1 submitted 29 July, 2021;
originally announced July 2021.
-
Semiparametric Estimation of Treatment Effects in Observational Studies with Heterogeneous Partial Interference
Authors:
Zhaonan Qu,
Ruoxuan Xiong,
Jizhou Liu,
Guido Imbens
Abstract:
In many observational studies in social science and medicine, subjects or units are connected, and one unit's treatment and attributes may affect another's treatment and outcome, violating the stable unit treatment value assumption (SUTVA) and resulting in interference. To enable feasible estimation and inference, many previous works assume exchangeability of interfering units (neighbors). However, in many applications with distinctive units, interference is heterogeneous and needs to be modeled explicitly. In this paper, we focus on the partial interference setting, and only restrict units to be exchangeable conditional on observable characteristics. Under this framework, we propose generalized augmented inverse propensity weighted (AIPW) estimators for general causal estimands that include heterogeneous direct and spillover effects. We show that they are semiparametric efficient and robust to heterogeneous interference as well as model misspecifications. We apply our methods to the Add Health dataset to study the direct effects of alcohol consumption on academic performance and the spillover effects of parental incarceration on adolescent well-being.
Submitted 22 June, 2024; v1 submitted 26 July, 2021;
originally announced July 2021.
-
A Design-Based Perspective on Synthetic Control Methods
Authors:
Lea Bottmer,
Guido Imbens,
Jann Spiess,
Merrill Warnick
Abstract:
Since their introduction in Abadie and Gardeazabal (2003), Synthetic Control (SC) methods have quickly become one of the leading methods for estimating causal effects in observational studies in settings with panel data. Formal discussions often motivate SC methods by the assumption that the potential outcomes were generated by a factor model. Here we study SC methods from a design-based perspective, assuming a model for the selection of the treated unit(s) and period(s). We show that the standard SC estimator is generally biased under random assignment. We propose a Modified Unbiased Synthetic Control (MUSC) estimator that guarantees unbiasedness under random assignment and derive its exact, randomization-based, finite-sample variance. We also propose an unbiased estimator for this variance. We document in settings with real data that under random assignment, SC-type estimators can have root mean-squared errors that are substantially lower than that of other common estimators. We show that such an improvement is weakly guaranteed if the treated period is similar to the other periods, for example, if the treated period was randomly selected. While our results only directly apply in settings where treatment is assigned randomly, we believe that they can complement model-based approaches even for observational studies.
Submitted 19 July, 2023; v1 submitted 22 January, 2021;
originally announced January 2021.
-
Combining Experimental and Observational Data to Estimate Treatment Effects on Long Term Outcomes
Authors:
Susan Athey,
Raj Chetty,
Guido Imbens
Abstract:
There has been an increase in interest in experimental evaluations to estimate causal effects, partly because their internal validity tends to be high. At the same time, as part of the big data revolution, large, detailed, and representative, administrative data sets have become more widely available. However, the credibility of estimates of causal effects based on such data sets alone can be low.
In this paper, we develop statistical methods for systematically combining experimental and observational data to obtain credible estimates of the causal effect of a binary treatment on a primary outcome that we only observe in the observational sample. Both the observational and experimental samples contain data about a treatment, observable individual characteristics, and a secondary (often short term) outcome. To estimate the effect of a treatment on the primary outcome while addressing the potential confounding in the observational sample, we propose a method that makes use of estimates of the relationship between the treatment and the secondary outcome from the experimental sample. If assignment to the treatment in the observational sample were unconfounded, we would expect the treatment effects on the secondary outcome in the two samples to be similar. We interpret differences in the estimated causal effects on the secondary outcome between the two samples as evidence of unobserved confounders in the observational sample, and develop control function methods for using those differences to adjust the estimates of the treatment effects on the primary outcome.
We illustrate these ideas by combining data on class size and third grade test scores from the Project STAR experiment with observational data on class size and both third and eighth grade test scores from the New York school system.
Submitted 17 June, 2020;
originally announced June 2020.
-
Bayesian Meta-Prior Learning Using Empirical Bayes
Authors:
Sareh Nabi,
Houssam Nassif,
Joseph Hong,
Hamed Mamani,
Guido Imbens
Abstract:
Adding domain knowledge to a learning system is known to improve results. In multi-parameter Bayesian frameworks, such knowledge is incorporated as a prior. On the other hand, various model parameters can have different learning rates in real-world problems, especially with skewed data. Two often-faced challenges in Operation Management and Management Science applications are the absence of informative priors, and the inability to control parameter learning rates. In this study, we propose a hierarchical Empirical Bayes approach that addresses both challenges, and that can generalize to any Bayesian framework. Our method learns empirical meta-priors from the data itself and uses them to decouple the learning rates of first-order and second-order features (or any other given feature grouping) in a Generalized Linear Model. As the first-order features are likely to have a more pronounced effect on the outcome, focusing on learning first-order weights first is likely to improve performance and convergence time. Our Empirical Bayes method clamps features in each group together and uses the deployed model's observed data to empirically compute a hierarchical prior in hindsight. We report theoretical results for the unbiasedness, strong consistency, and optimal frequentist cumulative regret properties of our meta-prior variance estimator. We apply our method to a standard supervised learning optimization problem, as well as an online combinatorial optimization problem in a contextual bandit setting implemented in an Amazon production system. Both during simulations and live experiments, our method shows marked improvements, especially in cases of small traffic. Our findings are promising, as optimizing over sparse data is often a challenge.
Submitted 12 July, 2021; v1 submitted 4 February, 2020;
originally announced February 2020.
-
Optimal Experimental Design for Staggered Rollouts
Authors:
Ruoxuan Xiong,
Susan Athey,
Mohsen Bayati,
Guido Imbens
Abstract:
In this paper, we study the design and analysis of experiments conducted on a set of units over multiple time periods where the starting time of the treatment may vary by unit. The design problem involves selecting an initial treatment time for each unit in order to most precisely estimate both the instantaneous and cumulative effects of the treatment. We first consider non-adaptive experiments, where all treatment assignment decisions are made prior to the start of the experiment. For this case, we show that the optimization problem is generally NP-hard, and we propose a near-optimal solution. Under this solution, the fraction entering treatment each period is initially low, then high, and finally low again. Next, we study an adaptive experimental design problem, where both the decision to continue the experiment and treatment assignment decisions are updated after each period's data is collected. For the adaptive case, we propose a new algorithm, the Precision-Guided Adaptive Experiment (PGAE) algorithm, that addresses the challenges at both the design stage and at the stage of estimating treatment effects, ensuring valid post-experiment inference accounting for the adaptive nature of the design. Using realistic settings, we demonstrate that our proposed solutions can reduce the opportunity cost of the experiments by over 50%, compared to static design benchmarks.
Submitted 25 September, 2023; v1 submitted 9 November, 2019;
originally announced November 2019.
-
Doubly Robust Identification for Causal Panel Data Models
Authors:
Dmitry Arkhangelsky,
Guido W. Imbens
Abstract:
We study identification and estimation of causal effects in settings with panel data. Traditionally researchers follow model-based identification strategies relying on assumptions governing the relation between the potential outcomes and the observed and unobserved confounders. We focus on a different, complementary approach to identification where assumptions are made about the connection between the treatment assignment and the unobserved confounders. Such strategies are common in cross-section settings but rarely used with panel data. We introduce different sets of assumptions that follow the two paths to identification and develop a doubly robust approach. We propose estimation methods that build on these identification strategies.
Submitted 17 February, 2022; v1 submitted 20 September, 2019;
originally announced September 2019.
-
Using Wasserstein Generative Adversarial Networks for the Design of Monte Carlo Simulations
Authors:
Susan Athey,
Guido Imbens,
Jonas Metzger,
Evan Munro
Abstract:
When researchers develop new econometric methods it is common practice to compare the performance of the new methods to those of existing methods in Monte Carlo studies. The credibility of such Monte Carlo studies is often limited because of the freedom the researcher has in choosing the design. In recent years a new class of generative models emerged in the machine learning literature, termed Generative Adversarial Networks (GANs) that can be used to systematically generate artificial data that closely mimics real economic datasets, while limiting the degrees of freedom for the researcher and optionally satisfying privacy guarantees with respect to their training data. In addition if an applied researcher is concerned with the performance of a particular statistical method on a specific data set (beyond its theoretical properties in large samples), she may wish to assess the performance, e.g., the coverage rate of confidence intervals or the bias of the estimator, using simulated data which resembles her setting. To illustrate these methods we apply Wasserstein GANs (WGANs) to compare a number of different estimators for average treatment effects under unconfoundedness in three distinct settings (corresponding to three real data sets) and present a methodology for assessing the robustness of the results. In this example, we find that (i) there is not one estimator that outperforms the others in all three settings, so researchers should tailor their analytic approach to a given setting, and (ii) systematic simulation studies can be helpful for selecting among competing methods in this situation.
Submitted 21 July, 2020; v1 submitted 5 September, 2019;
originally announced September 2019.
-
Potential Outcome and Directed Acyclic Graph Approaches to Causality: Relevance for Empirical Practice in Economics
Authors:
Guido W. Imbens
Abstract:
In this essay I discuss potential outcome and graphical approaches to causality, and their relevance for empirical work in economics. I review some of the work on directed acyclic graphs, including the recent "The Book of Why," by Pearl and MacKenzie. I also discuss the potential outcome framework developed by Rubin and coauthors, building on work by Neyman. I then discuss the relative merits of these approaches for empirical work in economics, focusing on the questions each answers well, and why much of the work in economics is closer in spirit to the potential outcome framework.
Submitted 22 March, 2020; v1 submitted 16 July, 2019;
originally announced July 2019.
-
Ensemble Methods for Causal Effects in Panel Data Settings
Authors:
Susan Athey,
Mohsen Bayati,
Guido Imbens,
Zhaonan Qu
Abstract:
This paper studies a panel data setting where the goal is to estimate causal effects of an intervention by predicting the counterfactual values of outcomes for treated units, had they not received the treatment. Several approaches have been proposed for this problem, including regression methods, synthetic control methods and matrix completion methods. This paper considers an ensemble approach, and shows that it performs better than any of the individual methods in several economic datasets. Matrix completion methods are often given the most weight by the ensemble, but this clearly depends on the setting. We argue that ensemble methods present a fruitful direction for further research in the causal panel data setting.
Submitted 24 March, 2019;
originally announced March 2019.
-
Machine Learning Methods Economists Should Know About
Authors:
Susan Athey,
Guido Imbens
Abstract:
We discuss the relevance of the recent Machine Learning (ML) literature for economics and econometrics. First we discuss the differences in goals, methods and settings between the ML literature and the traditional econometrics and statistics literatures. Then we discuss some specific methods from the machine learning literature that we view as important for empirical researchers in economics. These include supervised learning methods for regression and classification, unsupervised learning methods, as well as matrix completion methods. Finally, we highlight newly developed methods at the intersection of ML and econometrics, methods that typically perform better than either off-the-shelf ML or more traditional econometric methods when applied to particular classes of problems, problems that include causal inference for average treatment effects, optimal policy estimation, and estimation of the counterfactual effect of price changes in consumer choice models.
Submitted 24 March, 2019;
originally announced March 2019.
-
Synthetic Difference in Differences
Authors:
Dmitry Arkhangelsky,
Susan Athey,
David A. Hirshberg,
Guido W. Imbens,
Stefan Wager
Abstract:
We present a new estimator for causal effects with panel data that builds on insights behind the widely used difference in differences and synthetic control methods. Relative to these methods we find, both theoretically and empirically, that this "synthetic difference in differences" estimator has desirable robustness properties, and that it performs well in settings where the conventional estimators are commonly used in practice. We study the asymptotic behavior of the estimator when the systematic part of the outcome model includes latent unit factors interacted with latent time factors, and we present conditions for consistency and asymptotic normality.
Submitted 2 July, 2021; v1 submitted 24 December, 2018;
originally announced December 2018.
-
Balanced Linear Contextual Bandits
Authors:
Maria Dimakopoulou,
Zhengyuan Zhou,
Susan Athey,
Guido Imbens
Abstract:
Contextual bandit algorithms are sensitive to the estimation method of the outcome model as well as the exploration method used, particularly in the presence of rich heterogeneity or complex outcome models, which can lead to difficult estimation problems along the path of learning. We develop algorithms for contextual bandits with linear payoffs that integrate balancing methods from the causal inference literature in their estimation to make it less prone to problems of estimation bias. We provide the first regret bound analyses for linear contextual bandits with balancing and show that our algorithms match the state of the art theoretical guarantees. We demonstrate the strong practical advantage of balanced contextual bandits on a large number of supervised learning datasets and on a synthetic example that simulates model misspecification and prejudice in the initial training data.
Submitted 14 December, 2018;
originally announced December 2018.
-
Design-based Analysis in Difference-In-Differences Settings with Staggered Adoption
Authors:
Susan Athey,
Guido Imbens
Abstract:
In this paper we study estimation of and inference for average treatment effects in a setting with panel data. We focus on the setting where units, e.g., individuals, firms, or states, adopt the policy or treatment of interest at a particular point in time, and then remain exposed to this treatment at all times afterwards. We take a design perspective where we investigate the properties of estimators and procedures given assumptions on the assignment process. We show that under random assignment of the adoption date the standard Difference-In-Differences estimator is an unbiased estimator of a particular weighted average causal effect. We characterize the properties of this estimand, and show that the standard variance estimator is conservative.
Submitted 1 September, 2018; v1 submitted 15 August, 2018;
originally announced August 2018.
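As a rough illustration of the estimator named in the abstract above, the sketch below computes the standard Difference-In-Differences (two-way fixed effects) coefficient on a small balanced panel with staggered adoption. The function name `twfe_did`, the toy panel, and the constant effect `tau` are assumptions made for this example and do not come from the paper; the design-based variance results are not reproduced here.

```python
def twfe_did(y, w):
    """Two-way fixed effects coefficient for a balanced panel.

    y[i][s] and w[i][s] hold the outcome and the treatment indicator
    for unit i in period s. In a balanced panel, removing unit and
    period effects is equivalent to double demeaning both variables.
    """
    n_units, n_periods = len(y), len(y[0])

    def double_demean(m):
        row_means = [sum(row) / n_periods for row in m]
        col_means = [sum(m[i][s] for i in range(n_units)) / n_units
                     for s in range(n_periods)]
        grand_mean = sum(row_means) / n_units
        return [[m[i][s] - row_means[i] - col_means[s] + grand_mean
                 for s in range(n_periods)] for i in range(n_units)]

    y_dd, w_dd = double_demean(y), double_demean(w)
    numerator = sum(w_dd[i][s] * y_dd[i][s]
                    for i in range(n_units) for s in range(n_periods))
    denominator = sum(w_dd[i][s] ** 2
                      for i in range(n_units) for s in range(n_periods))
    return numerator / denominator


# Hypothetical toy panel: three units, four periods, staggered adoption.
# adoption_period[i] is the first treated period (None = never treated).
adoption_period = [2, 3, None]
unit_effects = [1.0, 2.0, 3.0]
period_effects = [0.0, 0.5, 1.0, 1.5]
tau = 2.0  # constant treatment effect assumed for the illustration

w = [[1.0 if (a is not None and s >= a) else 0.0 for s in range(4)]
     for a in adoption_period]
y = [[unit_effects[i] + period_effects[s] + tau * w[i][s] for s in range(4)]
     for i in range(3)]

estimate = twfe_did(y, w)
```

With additive unit and period effects and a constant treatment effect, double demeaning removes both sets of effects, so the coefficient recovers `tau` in this noise-free toy panel. With heterogeneous effects the same coefficient would instead be a weighted average of unit-period effects.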
-
A Causal Bootstrap
Authors:
Guido Imbens,
Konrad Menzel
Abstract:
The bootstrap, introduced by Efron (1982), has become a very popular method for estimating variances and constructing confidence intervals. A key insight is that one can approximate the properties of estimators by using the empirical distribution function of the sample as an approximation for the true distribution function. This approach views the uncertainty in the estimator as coming exclusively from sampling uncertainty. We argue that for causal estimands the uncertainty arises entirely, or partially, from a different source, corresponding to the stochastic nature of the treatment received. We develop a bootstrap procedure that accounts for this uncertainty, and compare its properties to those of the classical bootstrap.
Submitted 25 January, 2019; v1 submitted 7 July, 2018;
originally announced July 2018.
-
Fixed Effects and the Generalized Mundlak Estimator
Authors:
Dmitry Arkhangelsky,
Guido Imbens
Abstract:
We develop a new approach for estimating average treatment effects in observational studies with unobserved group-level heterogeneity. We consider a general model with group-level unconfoundedness and provide conditions under which aggregate balancing statistics -- group-level averages of functions of treatments and covariates -- are sufficient to eliminate differences between groups. Building on these results, we reinterpret commonly used linear fixed-effect regression estimators by writing them in the Mundlak form as linear regression estimators without fixed effects but including group averages. We use this representation to develop Generalized Mundlak Estimators (GMEs) that capture group differences through group averages of (functions of) the unit-level variables and adjust for these group differences in flexible and robust ways in the spirit of the modern causal literature.
Submitted 30 August, 2023; v1 submitted 5 July, 2018;
originally announced July 2018.
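The Mundlak representation mentioned in the abstract above can be checked numerically in the simplest linear case: the within-group (fixed effects) slope equals the slope on the unit-level regressor in a regression that drops the fixed effects but includes the group average of that regressor. The sketch below is a minimal check under that linear setup; the data and the names `within_beta` and `mundlak_beta` are made up for illustration, and it does not implement the Generalized Mundlak Estimators proposed in the paper.

```python
from collections import defaultdict

# Hypothetical unbalanced grouped data (not from the paper).
groups = ["a", "a", "a", "b", "b", "c", "c", "c", "c"]
x = [1.0, 2.0, 4.0, 3.0, 7.0, 0.0, 2.0, 5.0, 9.0]
y = [2.0, 1.5, 5.0, 4.0, 9.5, 1.0, 0.5, 6.0, 8.0]


def group_means(values, labels):
    """Return, for each observation, the mean of `values` within its group."""
    totals, counts = defaultdict(float), defaultdict(int)
    for v, g in zip(values, labels):
        totals[g] += v
        counts[g] += 1
    return [totals[g] / counts[g] for g in labels]


def within_beta(x, y, labels):
    """Fixed effects (within) slope: regress group-demeaned y on group-demeaned x."""
    x_bar, y_bar = group_means(x, labels), group_means(y, labels)
    num = sum((xi - xb) * (yi - yb) for xi, xb, yi, yb in zip(x, x_bar, y, y_bar))
    den = sum((xi - xb) ** 2 for xi, xb in zip(x, x_bar))
    return num / den


def mundlak_beta(x, y, labels):
    """Slope on x from OLS of y on an intercept, x, and the group mean of x."""
    m = group_means(x, labels)
    n = len(x)
    mx, mm, my = sum(x) / n, sum(m) / n, sum(y) / n
    # Center every variable to absorb the intercept, then solve the
    # two-regressor normal equations by Cramer's rule.
    xc = [v - mx for v in x]
    mc = [v - mm for v in m]
    yc = [v - my for v in y]
    s_xx = sum(a * a for a in xc)
    s_mm = sum(a * a for a in mc)
    s_xm = sum(a * b for a, b in zip(xc, mc))
    s_xy = sum(a * b for a, b in zip(xc, yc))
    s_my = sum(a * b for a, b in zip(mc, yc))
    det = s_xx * s_mm - s_xm ** 2
    return (s_xy * s_mm - s_my * s_xm) / det


beta_fe = within_beta(x, y, groups)
beta_mundlak = mundlak_beta(x, y, groups)
```

The two slopes agree up to floating-point error because the within-group deviation of `x` is orthogonal to anything that is constant within groups, including the group mean.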
-
Estimation Considerations in Contextual Bandits
Authors:
Maria Dimakopoulou,
Zhengyuan Zhou,
Susan Athey,
Guido Imbens
Abstract:
Contextual bandit algorithms are sensitive to the estimation method of the outcome model as well as the exploration method used, particularly in the presence of rich heterogeneity or complex outcome models, which can lead to difficult estimation problems along the path of learning. We study a consideration for the exploration vs. exploitation framework that does not arise in multi-armed bandits but is crucial in contextual bandits: the way exploration and exploitation are conducted in the present affects the bias and variance in the potential outcome model estimation in subsequent stages of learning. We develop parametric and non-parametric contextual bandits that integrate balancing methods from the causal inference literature in their estimation to make it less prone to problems of estimation bias. We provide the first regret bound analyses for contextual bandits with balancing in the domain of linear contextual bandits that match the state of the art regret bounds. We demonstrate the strong practical advantage of balanced contextual bandits on a large number of supervised learning datasets and on a synthetic example that simulates model mis-specification and prejudice in the initial training data. Additionally, we develop contextual bandits with simpler assignment policies by leveraging sparse model estimation methods from the econometrics literature and demonstrate empirically that in the early stages they can improve the rate of learning and decrease regret.
Submitted 16 December, 2018; v1 submitted 19 November, 2017;
originally announced November 2017.
-
Matrix Completion Methods for Causal Panel Data Models
Authors:
Susan Athey,
Mohsen Bayati,
Nikolay Doudchenko,
Guido Imbens,
Khashayar Khosravi
Abstract:
In this paper we study methods for estimating causal effects in settings with panel data, where some units are exposed to a treatment during some periods and the goal is estimating counterfactual (untreated) outcomes for the treated unit/period combinations. We propose a class of matrix completion estimators that uses the observed elements of the matrix of control outcomes corresponding to untreated unit/periods to impute the "missing" elements of the control outcome matrix, corresponding to treated units/periods. This leads to a matrix that well-approximates the original (incomplete) matrix, but has lower complexity according to the nuclear norm for matrices. We generalize results from the matrix completion literature by allowing the patterns of missing data to have a time series dependency structure that is common in social science applications. We present novel insights concerning the connections between the matrix completion literature, the literature on interactive fixed effects models and the literatures on program evaluation under unconfoundedness and synthetic control methods. We show that all these estimators can be viewed as focusing on the same objective function. They differ solely in the way they deal with identification, in some cases solely through regularization (our proposed nuclear norm matrix completion estimator) and in other cases primarily through imposing hard restrictions (the unconfoundedness and synthetic control approaches). The proposed method outperforms unconfoundedness-based or synthetic control estimators in simulations based on real data.
Submitted 21 April, 2022; v1 submitted 27 October, 2017;
originally announced October 2017.
-
When Should You Adjust Standard Errors for Clustering?
Authors:
Alberto Abadie,
Susan Athey,
Guido Imbens,
Jeffrey Wooldridge
Abstract:
In empirical work it is common to estimate parameters of models and report associated standard errors that account for "clustering" of units, where clusters are defined by factors such as geography. Clustering adjustments are typically motivated by the concern that unobserved components of outcomes for units within clusters are correlated. However, this motivation does not provide guidance about questions such as: (i) Why should we adjust standard errors for clustering in some situations but not others? How can we justify the common practice of clustering in observational studies but not randomized experiments, or clustering by state but not by gender? (ii) Why is conventional clustering a potentially conservative "all-or-nothing" adjustment, and are there alternative methods that respond to data and are less conservative? (iii) In what settings does the choice of whether and how to cluster make a difference? We address these questions using a framework of sampling and design inference. We argue that clustering can be needed to address sampling issues if sampling follows a two stage process where in the first stage, a subset of clusters are sampled from a population of clusters, and in the second stage, units are sampled from the sampled clusters. Then, clustered standard errors account for the existence of clusters in the population that we do not see in the sample. Clustering can be needed to account for design issues if treatment assignment is correlated with membership in a cluster. We propose new variance estimators to deal with intermediate settings where conventional cluster standard errors are unnecessarily conservative and robust standard errors are too small.
Submitted 19 September, 2022; v1 submitted 8 October, 2017;
originally announced October 2017.
-
Sampling-based vs. Design-based Uncertainty in Regression Analysis
Authors:
Alberto Abadie,
Susan Athey,
Guido W. Imbens,
Jeffrey M. Wooldridge
Abstract:
Consider a researcher estimating the parameters of a regression function based on data for all 50 states in the United States or on data for all visits to a website. What is the interpretation of the estimated parameters and the standard errors? In practice, researchers typically assume that the sample is randomly drawn from a large population of interest and report standard errors that are designed to capture sampling variation. This is common even in applications where it is difficult to articulate what that population of interest is, and how it differs from the sample. In this article, we explore an alternative approach to inference, which is partly design-based. In a design-based setting, the values of some of the regressors can be manipulated, perhaps through a policy intervention. Design-based uncertainty emanates from lack of knowledge about the values that the regression outcome would have taken under alternative interventions. We derive standard errors that account for design-based uncertainty instead of, or in addition to, sampling-based uncertainty. We show that our standard errors in general are smaller than the usual infinite-population sampling-based standard errors and provide conditions under which they coincide.
Submitted 21 June, 2019; v1 submitted 6 June, 2017;
originally announced June 2017.
-
Optimized Regression Discontinuity Designs
Authors:
Guido Imbens,
Stefan Wager
Abstract:
The increasing popularity of regression discontinuity methods for causal inference in observational studies has led to a proliferation of different estimating strategies, most of which involve first fitting non-parametric regression models on both sides of a treatment assignment boundary and then reporting plug-in estimates for the effect of interest. In applications, however, it is often difficult to tune the non-parametric regressions in a way that is well calibrated for the specific target of inference; for example, the model with the best global in-sample fit may provide poor estimates of the discontinuity parameter. In this paper, we propose an alternative method for estimation and statistical inference in regression discontinuity designs that uses numerical convex optimization to directly obtain the finite-sample-minimax linear estimator for the regression discontinuity parameter, subject to bounds on the second derivative of the conditional response function. Given a bound on the second derivative, our proposed method is fully data-driven, and provides uniform confidence intervals for the regression discontinuity parameter with both discrete and continuous running variables. The method also naturally extends to the case of multiple running variables.
Submitted 7 June, 2018; v1 submitted 3 May, 2017;
originally announced May 2017.
-
Estimating Average Treatment Effects: Supplementary Analyses and Remaining Challenges
Authors:
Susan Athey,
Guido Imbens,
Thai Pham,
Stefan Wager
Abstract:
There is a large literature on semiparametric estimation of average treatment effects under unconfounded treatment assignment in settings with a fixed number of covariates. More recently attention has focused on settings with a large number of covariates. In this paper we extend lessons from the earlier literature to this new setting. We propose that in addition to reporting point estimates and standard errors, researchers report results from a number of supplementary analyses to assist in assessing the credibility of their estimates.
Submitted 4 February, 2017;
originally announced February 2017.
-
Balancing, Regression, Difference-In-Differences and Synthetic Control Methods: A Synthesis
Authors:
Nikolay Doudchenko,
Guido W. Imbens
Abstract:
In a seminal paper Abadie, Diamond, and Hainmueller [2010] (ADH), see also Abadie and Gardeazabal [2003], Abadie et al. [2014], develop the synthetic control procedure for estimating the effect of a treatment, in the presence of a single treated unit and a number of control units, with pre-treatment outcomes observed for all units. The method constructs a set of weights such that selected covariates and pre-treatment outcomes of the treated unit are approximately matched by a weighted average of control units (the synthetic control). The weights are restricted to be nonnegative and sum to one, which is important because it allows the procedure to obtain unique weights even when the number of lagged outcomes is modest relative to the number of control units, a common setting in applications. In the current paper we propose a generalization that allows the weights to be negative, and their sum to differ from one, and that allows for a permanent additive difference between the treated unit and the controls, similar to difference-in-difference procedures. The weights directly minimize the distance between the lagged outcomes for the treated and the control units, using regularization methods to deal with a potentially large number of possible control units.
Submitted 19 September, 2017; v1 submitted 25 October, 2016;
originally announced October 2016.
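The restricted weights described in the abstract above (nonnegative and summing to one, chosen so that a weighted average of control units matches the treated unit's pre-treatment outcomes) can be sketched with projected gradient descent onto the probability simplex. This is only an illustrative solver for that constrained least-squares step, matching on lagged outcomes alone; the function names, the toy data, and the choice of optimizer are assumptions and are not taken from ADH or from the paper, whose proposed generalization relaxes these restrictions.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1} (sort-based rule)."""
    u = sorted(v, reverse=True)
    cumulative, theta = 0.0, 0.0
    for j, u_j in enumerate(u, start=1):
        cumulative += u_j
        candidate = (cumulative - 1.0) / j
        if u_j - candidate > 0:
            theta = candidate  # the last index passing the test sets the threshold
    return [max(x - theta, 0.0) for x in v]


def synthetic_control_weights(controls, treated, iterations=5000):
    """Minimize the sum over periods of (treated[t] - sum_j w[j] * controls[t][j])**2
    subject to w lying on the simplex.

    controls[t][j] is the pre-treatment outcome of control unit j in period t.
    """
    n_periods, n_controls = len(controls), len(controls[0])
    # Conservative step size: the squared Frobenius norm bounds the largest
    # eigenvalue of X'X, and the gradient is Lipschitz with constant 2 * that.
    frobenius_sq = sum(c * c for row in controls for c in row)
    step = 1.0 / (2.0 * frobenius_sq)
    w = [1.0 / n_controls] * n_controls
    for _ in range(iterations):
        residual = [treated[t] - sum(w[j] * controls[t][j] for j in range(n_controls))
                    for t in range(n_periods)]
        gradient = [-2.0 * sum(residual[t] * controls[t][j] for t in range(n_periods))
                    for j in range(n_controls)]
        w = project_to_simplex([w[j] - step * gradient[j] for j in range(n_controls)])
    return w


# Hypothetical pre-treatment outcomes: four periods (rows), three controls (columns).
controls = [[1.0, 4.0, 1.0],
            [2.0, 3.0, 0.0],
            [3.0, 2.0, 1.0],
            [4.0, 1.0, 0.0]]
# Treated unit built as 0.3 * control 0 + 0.7 * control 1, so an exact fit exists.
treated = [3.1, 2.7, 2.3, 1.9]

weights = synthetic_control_weights(controls, treated)
```

In this toy panel the treated unit lies inside the convex hull of the controls, so the solver should return weights close to 0.3, 0.7, and 0. When the treated unit lies outside the hull, the same constraints force an imperfect fit, which is one motivation for relaxing them.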
-
Peer Encouragement Designs in Causal Inference with Partial Interference and Identification of Local Average Network Effects
Authors:
Hyunseung Kang,
Guido Imbens
Abstract:
In non-network settings, encouragement designs have been widely used to analyze causal effects of a treatment, policy, or intervention on an outcome of interest when randomizing the treatment was considered impractical or when compliance to treatment cannot be perfectly enforced. Unfortunately, such questions related to treatment compliance have received less attention in network settings and the most well-studied experimental design in networks, the two-stage randomization design, requires perfect compliance with treatment. The paper proposes a new experimental design called peer encouragement design to study network treatment effects when enforcing treatment randomization is not feasible. The key idea in peer encouragement design is the idea of personalized encouragement, which allows point-identification of familiar estimands in the encouragement design literature. The paper also defines new causal estimands, local average network effects, that can be identified under the new design and analyzes the effect of non-compliance behavior in randomized experiments on networks.
Submitted 14 September, 2016;
originally announced September 2016.
-
The State of Applied Econometrics - Causality and Policy Evaluation
Authors:
Susan Athey,
Guido Imbens
Abstract:
In this paper we discuss recent developments in econometrics that we view as important for empirical researchers working on policy evaluation questions. We focus on three main areas, where in each case we highlight recommendations for applied work. First, we discuss new research on identification strategies in program evaluation, with particular focus on synthetic control methods, regression discontinuity, external validity, and the causal interpretation of regression methods. Second, we discuss various forms of supplementary analyses to make the identification strategies more credible. These include placebo analyses as well as sensitivity and robustness analyses. Third, we discuss recent advances in machine learning methods for causal effects. These advances include methods to adjust for differences between treated and control units in high-dimensional settings, and methods for identifying and estimating heterogeneous treatment effects.
Submitted 3 July, 2016;
originally announced July 2016.
-
The Econometrics of Randomized Experiments
Authors:
Susan Athey,
Guido Imbens
Abstract:
In this review, we present econometric and statistical methods for analyzing randomized experiments. For basic experiments we stress randomization-based inference as opposed to sampling-based inference. In randomization-based inference, uncertainty in estimates arises naturally from the random assignment of the treatments, rather than from hypothesized sampling from a large population. We show how this perspective relates to regression analyses for randomized experiments. We discuss the analyses of stratified, paired, and clustered randomized experiments, and we stress the general efficiency gains from stratification. We also discuss complications in randomized experiments such as non-compliance. In the presence of non-compliance we contrast intention-to-treat analyses with instrumental variables analyses allowing for general treatment effect heterogeneity. We consider in detail estimation and inference for heterogeneous treatment effects in settings with (possibly many) covariates. These methods allow researchers to explore heterogeneity by identifying subpopulations with different treatment effects while maintaining the ability to construct valid confidence intervals. We also discuss optimal assignment to treatment based on covariates in such settings. Finally, we discuss estimation and inference in experiments in settings with interactions between units, both in general network settings and in settings where the population is partitioned into groups with all interactions contained within these groups.
Submitted 3 July, 2016;
originally announced July 2016.
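
To make the idea of randomization-based inference in the abstract above concrete, here is a minimal sketch of a Fisher-style exact p-value in a completely randomized experiment. Under the sharp null of no effect for any unit, every outcome is unchanged by the assignment, so the observed statistic can be compared with its value under every possible assignment. The function name, the choice of test statistic (absolute difference in means), and the data are assumptions made for illustration only.

```python
from itertools import combinations


def fisher_exact_p_value(outcomes, treated_idx):
    """Exact p-value for the sharp null of no treatment effect, using the
    absolute difference in means as the test statistic and enumerating all
    assignments with the same number of treated units."""
    n = len(outcomes)
    n_treated = len(treated_idx)

    def diff_in_means(idx):
        idx = set(idx)
        treated = [outcomes[i] for i in range(n) if i in idx]
        control = [outcomes[i] for i in range(n) if i not in idx]
        return sum(treated) / len(treated) - sum(control) / len(control)

    observed = abs(diff_in_means(treated_idx))
    # Under the sharp null the outcomes are fixed, so only the labels vary.
    draws = [abs(diff_in_means(c)) for c in combinations(range(n), n_treated)]
    extreme = sum(1 for d in draws if d >= observed - 1e-12)
    return extreme / len(draws)


# Made-up data: eight units, the first four treated.
y = [5, 7, 6, 9, 1, 2, 3, 2]
p_value = fisher_exact_p_value(y, treated_idx=[0, 1, 2, 3])
print(p_value)
```

Full enumeration is only feasible for small samples; with more units, one would typically draw a random subset of assignments instead, which gives an approximation rather than an exact p-value.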
-
Approximate Residual Balancing: De-Biased Inference of Average Treatment Effects in High Dimensions
Authors:
Susan Athey,
Guido W. Imbens,
Stefan Wager
Abstract:
There are many settings where researchers are interested in estimating average treatment effects and are willing to rely on the unconfoundedness assumption, which requires that the treatment assignment be as good as random conditional on pre-treatment variables. The unconfoundedness assumption is often more plausible if a large number of pre-treatment variables are included in the analysis, but this can worsen the performance of standard approaches to treatment effect estimation. In this paper, we develop a method for de-biasing penalized regression adjustments to allow sparse regression methods like the lasso to be used for $\sqrt{n}$-consistent inference of average treatment effects in high-dimensional linear models. Given linearity, we do not need to assume that the treatment propensities are estimable, or that the average treatment effect is a sparse contrast of the outcome model parameters. Rather, in addition to standard assumptions used to make lasso regression on the outcome model consistent under 1-norm error, we only require overlap, i.e., that the propensity score be uniformly bounded away from 0 and 1. Procedurally, our method combines balancing weights with a regularized regression adjustment.
Submitted 31 January, 2018; v1 submitted 25 April, 2016;
originally announced April 2016.
-
Estimating Treatment Effects using Multiple Surrogates: The Role of the Surrogate Score and the Surrogate Index
Authors:
Susan Athey,
Raj Chetty,
Guido Imbens,
Hyunseung Kang
Abstract:
Estimating the long-term effects of treatments is of interest in many fields. A common challenge in estimating such treatment effects is that long-term outcomes are unobserved in the time frame needed to make policy decisions. One approach to overcome this missing data problem is to analyze treatment effects on an intermediate outcome, often called a statistical surrogate, if it satisfies the condition that treatment and outcome are independent conditional on the statistical surrogate. The validity of the surrogacy condition is often controversial. Here we exploit the fact that in modern datasets, researchers often observe a large number, possibly hundreds or thousands, of intermediate outcomes, thought to lie on or close to the causal chain between the treatment and the long-term outcome of interest. Even if none of the individual proxies satisfies the statistical surrogacy criterion by itself, using multiple proxies can be useful in causal inference. We focus primarily on a setting with two samples, an experimental sample containing data about the treatment indicator and the surrogates and an observational sample containing information about the surrogates and the primary outcome. We state assumptions under which the average treatment effect can be identified and estimated with a high-dimensional vector of proxies that collectively satisfy the surrogacy assumption, derive the bias from violations of the surrogacy assumption, and show that even if the primary outcome is also observed in the experimental sample, there is still information to be gained from using surrogates.
Submitted 2 April, 2024; v1 submitted 30 March, 2016;
originally announced March 2016.
-
Propensity Score Matching and Subclassification in Observational Studies with Multi-level Treatments
Authors:
Shu Yang,
Guido W. Imbens,
Zhanglin Cui,
Douglas Faries,
Zbigniew Kadziola
Abstract:
In this paper, we develop new methods for estimating average treatment effects in observational studies, focusing on settings with more than two treatment levels under unconfoundedness given pre-treatment variables. We emphasize subclassification and matching methods which have been found to be effective in the binary treatment literature and which are among the most popular methods in that setting. Whereas the literature has suggested that these particular propensity-based methods do not naturally extend to the multi-level treatment case, we show, using the concept of weak unconfoundedness, that adjusting for or matching on a scalar function of the pre-treatment variables removes all biases associated with observed pre-treatment variables. We apply the proposed methods to an analysis of the effect of treatments for fibromyalgia. We also carry out a simulation study to assess the finite sample performance of the methods relative to previously proposed methods.
Submitted 14 December, 2015; v1 submitted 27 August, 2015;
originally announced August 2015.
-
Exact P-values for Network Interference
Authors:
Susan Athey,
Dean Eckles,
Guido Imbens
Abstract:
We study the calculation of exact p-values for a large class of non-sharp null hypotheses about treatment effects in a setting with data from experiments involving members of a single connected network. The class includes null hypotheses that limit the effect of one unit's treatment status on another according to the distance between units; for example, the hypothesis might specify that the treatment status of immediate neighbors has no effect, or that units more than two edges away have no effect. We also consider hypotheses concerning the validity of sparsification of a network (for example based on the strength of ties) and hypotheses restricting heterogeneity in peer effects (so that, for example, only the number or fraction treated among neighboring units matters). Our general approach is to define an artificial experiment, such that the null hypothesis that was not sharp for the original experiment is sharp for the artificial experiment, and such that the randomization analysis for the artificial experiment is validated by the design of the original experiment.
Submitted 5 June, 2015;
originally announced June 2015.
-
Recursive Partitioning for Heterogeneous Causal Effects
Authors:
Susan Athey,
Guido Imbens
Abstract:
In this paper we study the problems of estimating heterogeneity in causal effects in experimental or observational studies and conducting inference about the magnitude of the differences in treatment effects across subsets of the population. In applications, our method provides a data-driven approach to determine which subpopulations have large or small treatment effects and to test hypotheses about the differences in these effects. For experiments, our method allows researchers to identify heterogeneity in treatment effects that was not specified in a pre-analysis plan, without concern about invalidating inference due to multiple testing. In most of the literature on supervised machine learning (e.g. regression trees, random forests, LASSO, etc.), the goal is to build a model of the relationship between a unit's attributes and an observed outcome. A prominent role in these methods is played by cross-validation which compares predictions to actual outcomes in test samples, in order to select the level of complexity of the model that provides the best predictive power. Our method is closely related, but it differs in that it is tailored for predicting causal effects of a treatment rather than a unit's outcome. The challenge is that the "ground truth" for a causal effect is not observed for any individual unit: we observe the unit with the treatment, or without the treatment, but not both at the same time. Thus, it is not obvious how to use cross-validation to determine whether a causal effect has been accurately predicted. We propose several novel cross-validation criteria for this problem and demonstrate through simulations the conditions under which they perform better than standard methods for the problem of causal effects. We then apply the method to a large-scale field experiment re-ranking results on a search engine.
Submitted 30 December, 2015; v1 submitted 5 April, 2015;
originally announced April 2015.
-
Rejoinder of "Instrumental Variables: An Econometrician's Perspective"
Authors:
Guido Imbens
Abstract:
Rejoinder of "Instrumental Variables: An Econometrician's Perspective" by Guido W. Imbens [arXiv:1410.0163].
Submitted 2 October, 2014;
originally announced October 2014.
-
Instrumental Variables: An Econometrician's Perspective
Authors:
Guido W. Imbens
Abstract:
I review recent work in the statistics literature on instrumental variables methods from an econometrics perspective. I discuss some of the older, economic, applications including supply and demand models and relate them to the recent applications in settings of randomized experiments with noncompliance. I discuss the assumptions underlying instrumental variables methods and in what settings these may be plausible. By providing context to the current applications, a better understanding of the applicability of these methods may arise.
Submitted 1 October, 2014;
originally announced October 2014.
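
For the setting the abstract above mentions, a randomized experiment with noncompliance, the simplest instrumental variables estimator is the Wald ratio: the intention-to-treat effect on the outcome divided by the intention-to-treat effect on treatment receipt. The sketch below is only an illustration under the usual assumptions (random assignment, exclusion restriction, monotonicity); the function name and the data are made up for this example and are not taken from the paper.

```python
def wald_iv_estimate(z, d, y):
    """Wald / IV estimate with a binary instrument.

    z: assignment indicators (1 = assigned to treatment, 0 = assigned to control)
    d: treatment actually received (1 or 0)
    y: observed outcomes
    """
    def mean(values):
        return sum(values) / len(values)

    y1 = [yi for zi, yi in zip(z, y) if zi == 1]
    y0 = [yi for zi, yi in zip(z, y) if zi == 0]
    d1 = [di for zi, di in zip(z, d) if zi == 1]
    d0 = [di for zi, di in zip(z, d) if zi == 0]

    itt_outcome = mean(y1) - mean(y0)      # intention-to-treat effect on the outcome
    itt_treatment = mean(d1) - mean(d0)    # effect of assignment on treatment receipt
    if itt_treatment == 0:
        raise ValueError("assignment does not shift treatment receipt; ratio undefined")
    return itt_outcome / itt_treatment


# Made-up data with one-sided noncompliance: one assigned unit does not take treatment.
z = [1, 1, 1, 1, 0, 0, 0, 0]
d = [1, 1, 1, 0, 0, 0, 0, 0]
y = [10, 12, 11, 5, 6, 5, 7, 6]
late = wald_iv_estimate(z, d, y)
print(late)
```

Under those assumptions the ratio is interpreted as an average effect for compliers, which is why it generally differs from the intention-to-treat effect in the numerator.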