-
An Introduction to Causal Discovery
Authors:
Martin Huber
Abstract:
In social sciences and economics, causal inference traditionally focuses on assessing the impact of predefined treatments (or interventions) on predefined outcomes, such as the effect of education programs on earnings. Causal discovery, in contrast, aims to uncover causal relationships among multiple variables in a data-driven manner, by investigating statistical associations rather than relying on predefined causal structures. This approach, more common in computer science, seeks to understand causality in an entire system of variables, which can be visualized by causal graphs. This survey provides an introduction to key concepts, algorithms, and applications of causal discovery from the perspectives of economics and social sciences. It covers fundamental concepts like d-separation, causal faithfulness, and Markov equivalence, sketches various algorithms for causal discovery, and discusses the back-door and front-door criteria for identifying causal effects. The survey concludes with more specific examples of causal discovery, e.g. for learning all variables that directly affect an outcome of interest and/or testing identification of causal effects in observational data.
Submitted 11 July, 2024;
originally announced July 2024.
-
Learning control variables and instruments for causal analysis in observational data
Authors:
Nicolas Apfel,
Julia Hatamyar,
Martin Huber,
Jannis Kueck
Abstract:
This study introduces a data-driven, machine learning-based method to detect suitable control variables and instruments for assessing the causal effect of a treatment on an outcome in observational data, if they exist. Our approach tests the joint existence of instruments, which are associated with the treatment but not directly with the outcome (at least conditional on observables), and suitable control variables, conditional on which the treatment is exogenous, and learns the partition of instruments and control variables from the observed data. The detection of sets of instruments and control variables relies on the condition that proper instruments are conditionally independent of the outcome given the treatment and suitable control variables. We establish the consistency of our method for detecting control variables and instruments under certain regularity conditions, investigate the finite sample performance through a simulation study, and provide an empirical application to labor market data from the Job Corps study.
Submitted 5 July, 2024;
originally announced July 2024.
-
Testing identification in mediation and dynamic treatment models
Authors:
Martin Huber,
Kevin Kloiber,
Lukas Laffers
Abstract:
We propose a test for the identification of causal effects in mediation and dynamic treatment models that is based on two sets of observed variables, namely covariates to be controlled for and suspected instruments, building on the test by Huber and Kueck (2022) for single treatment models. We consider models with a sequential assignment of a treatment and a mediator to assess the direct treatment effect (net of the mediator), the indirect treatment effect (via the mediator), or the joint effect of both treatment and mediator. We establish testable conditions for identifying such effects in observational data. These conditions jointly imply (1) the exogeneity of the treatment and the mediator conditional on covariates and (2) the validity of distinct instruments for the treatment and the mediator, meaning that the instruments do not directly affect the outcome (other than through the treatment or mediator) and are unconfounded given the covariates. Our framework extends to post-treatment sample selection or attrition problems when replacing the mediator by a selection indicator for observing the outcome, enabling joint testing of the selectivity of treatment and attrition. We propose a machine learning-based test to control for covariates in a data-driven manner and analyze its finite sample performance in a simulation study. Additionally, we apply our method to Slovak labor market data and find that our testable implications are not rejected for a sequence of training programs typically considered in dynamic treatment evaluations.
Submitted 19 June, 2024;
originally announced June 2024.
-
A joint test of unconfoundedness and common trends
Authors:
Martin Huber,
Eva-Maria Oeß
Abstract:
This paper introduces an overidentification test of two alternative assumptions to identify the average treatment effect on the treated in a two-period panel data setting: unconfoundedness and common trends. Under the unconfoundedness assumption, treatment assignment and post-treatment outcomes are independent, conditional on control variables and pre-treatment outcomes, which motivates including pre-treatment outcomes in the set of controls. Conversely, under the common trends assumption, the trend and the treatment assignment are independent, conditional on control variables. This motivates employing a Difference-in-Differences (DiD) approach by comparing the differences between pre- and post-treatment outcomes of the treatment and control group. Given the non-nested nature of these assumptions and their often ambiguous plausibility in empirical settings, we propose a joint test using a doubly robust statistic that can be combined with machine learning to control for observed confounders in a data-driven manner. We discuss various causal models that imply the satisfaction of either common trends, unconfoundedness, or both assumptions jointly, and we investigate the finite sample properties of our test through a simulation study. Additionally, we apply the proposed method to five empirical examples using publicly available datasets and find the test to reject the null hypothesis in two cases.
Submitted 24 June, 2024; v1 submitted 25 April, 2024;
originally announced April 2024.
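As a rough illustration of the Difference-in-Differences comparison described in the abstract above, the following Python sketch computes the canonical two-group, two-period DiD estimate from group means. It is not the authors' joint test or their doubly robust statistic; the function name `did_2x2` and the toy numbers are assumptions made purely for illustration.

```python
import numpy as np

def did_2x2(y_pre, y_post, treated):
    """Two-period DiD: mean outcome change of the treated minus that of the controls.

    Hypothetical helper for illustration; it uses no covariates and no
    machine learning, unlike the method summarized in the abstract.
    """
    y_pre = np.asarray(y_pre, dtype=float)
    y_post = np.asarray(y_post, dtype=float)
    treated = np.asarray(treated, dtype=bool)
    change = y_post - y_pre  # within-unit change between the two periods
    return change[treated].mean() - change[~treated].mean()

# Toy two-period panel (made-up numbers): units 0-1 are treated, units 2-3 are controls.
y_pre = [1.0, 2.0, 0.0, 1.0]
y_post = [4.0, 5.0, 1.0, 2.0]
treated = [True, True, False, False]
estimate = did_2x2(y_pre, y_post, treated)
```

Under common trends, the controls' average change stands in for the change the treated group would have experienced without treatment, so the difference of the two changes can be read as the average treatment effect on the treated.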
-
Machine Learning for Staggered Difference-in-Differences and Dynamic Treatment Effect Heterogeneity
Authors:
Julia Hatamyar,
Noemi Kreif,
Rudi Rocha,
Martin Huber
Abstract:
We combine two recently proposed nonparametric difference-in-differences methods, extending them to enable the examination of treatment effect heterogeneity in the staggered adoption setting using machine learning. The proposed method, machine learning difference-in-differences (MLDID), allows for estimation of time-varying conditional average treatment effects on the treated, which can be used to conduct detailed inference on drivers of treatment effect heterogeneity. We perform simulations to evaluate the performance of MLDID and find that it accurately identifies the true predictors of treatment effect heterogeneity. We then use MLDID to evaluate the heterogeneous impacts of Brazil's Family Health Program on infant mortality, and find those in poverty and urban locations experienced the impact of the policy more quickly than other subgroups.
Submitted 18 October, 2023;
originally announced October 2023.
-
Doubly Robust Estimation of Direct and Indirect Quantile Treatment Effects with Machine Learning
Authors:
Yu-Chin Hsu,
Martin Huber,
Yu-Min Yen
Abstract:
We suggest double/debiased machine learning estimators of direct and indirect quantile treatment effects under a selection-on-observables assumption. This permits disentangling the causal effect of a binary treatment at a specific outcome rank into an indirect component that operates through an intermediate variable called the mediator and an (unmediated) direct impact. The proposed method is based on the efficient score functions of the cumulative distribution functions of potential outcomes, which are robust to certain misspecifications of the nuisance parameters, i.e., the outcome, treatment, and mediator models. We estimate these nuisance parameters by machine learning and use cross-fitting to reduce overfitting bias in the estimation of direct and indirect quantile treatment effects. We establish uniform consistency and asymptotic normality of our effect estimators. We also propose a multiplier bootstrap for statistical inference and show its validity. Finally, we investigate the finite sample performance of our method in a simulation study and apply it to empirical data from the National Job Corps Study to assess the direct and indirect earnings effects of training.
Submitted 3 July, 2023;
originally announced July 2023.
-
Treatment Effect Analysis for Pairs with Endogenous Treatment Takeup
Authors:
Mate Kormos,
Robert P. Lieli,
Martin Huber
Abstract:
We study causal inference in a setting in which units consisting of pairs of individuals (such as married couples) are assigned randomly to one of four categories: a treatment targeted at pair member A, a potentially different treatment targeted at pair member B, joint treatment, or no treatment. The setup includes the important special case in which the pair members are the same individual targeted by two different treatments A and B. Allowing for endogenous non-compliance, including coordinated treatment takeup, as well as interference across treatments, we derive the causal interpretation of various instrumental variable estimands using weaker monotonicity conditions than in the literature. In general, coordinated treatment takeup makes it difficult to separate treatment interaction from treatment effect heterogeneity. We provide auxiliary conditions and various bounding strategies that may help zero in on causally interesting parameters. As an empirical illustration, we apply our results to a program randomly offering two different treatments, namely tutoring and financial incentives, to first year college students, in order to assess the treatments' effects on academic performance.
Submitted 12 January, 2023;
originally announced January 2023.
-
The finite sample performance of instrumental variable-based estimators of the Local Average Treatment Effect when controlling for covariates
Authors:
Hugo Bodory,
Martin Huber,
Michael Lechner
Abstract:
This paper investigates the finite sample performance of a range of parametric, semi-parametric, and non-parametric instrumental variable estimators when controlling for a fixed set of covariates to evaluate the local average treatment effect. Our simulation designs are based on empirical labor market data from the US and vary in several dimensions, including effect heterogeneity, instrument selectivity, instrument strength, outcome distribution, and sample size. Among the estimators and simulations considered, non-parametric estimation based on the random forest (a machine learner controlling for covariates in a data-driven way) performs competitively in terms of the average coverage rates of the (bootstrap-based) 95% confidence intervals, while also being relatively precise. Non-parametric kernel regression as well as certain versions of semi-parametric radius matching on the propensity score, pair matching on the covariates, and inverse probability weighting also have decent coverage, but are less precise than the random forest-based method. In terms of the average root mean squared error of LATE estimation, kernel regression performs best, closely followed by the random forest method, which has the lowest average absolute bias.
Submitted 14 December, 2022;
originally announced December 2022.
-
Detecting Grouped Local Average Treatment Effects and Selecting True Instruments
Authors:
Nicolas Apfel,
Helmut Farbmacher,
Rebecca Groh,
Martin Huber,
Henrika Langen
Abstract:
Under an endogenous binary treatment with heterogeneous effects and multiple instruments, we propose a two-step procedure for identifying complier groups with identical local average treatment effects (LATE) despite relying on distinct instruments, even if several instruments violate the identifying assumptions. We use the fact that the LATE is homogeneous for instruments which (i) satisfy the LATE assumptions (instrument validity and treatment monotonicity in the instrument) and (ii) generate identical complier groups in terms of treatment propensities given the respective instruments. In the first step, we cluster the propensity scores; in the second step, we find groups of IVs with the same reduced form parameters. Under the plurality assumption that within each set of instruments with identical treatment propensities, instruments truly satisfying the LATE assumptions are the largest group, our procedure permits identifying these true instruments in a data-driven way. We show that our procedure is consistent and provides consistent and asymptotically normal estimators of underlying LATEs. We also provide a simulation study investigating the finite sample properties of our approach and an empirical application investigating the effect of incarceration on recidivism in the US with judge assignments serving as instruments.
Submitted 31 October, 2023; v1 submitted 10 July, 2022;
originally announced July 2022.
-
How causal machine learning can leverage marketing strategies: Assessing and improving the performance of a coupon campaign
Authors:
Henrika Langen,
Martin Huber
Abstract:
We apply causal machine learning algorithms to assess the causal effect of a marketing intervention, namely a coupon campaign, on the sales of a retailer. Besides assessing the average impacts of different types of coupons, we also investigate the heterogeneity of causal effects across different subgroups of customers, e.g., between clients with relatively high vs. low prior purchases. Finally, we use optimal policy learning to determine (in a data-driven way) which customer groups should be targeted by the coupon campaign in order to maximize the marketing intervention's effectiveness in terms of sales. We find that only two out of the five coupon categories examined, namely coupons applicable to the product categories of drugstore items and other food, have a statistically significant positive effect on retailer sales. The assessment of group average treatment effects reveals substantial differences in the impact of coupon provision across customer groups, particularly across customer groups as defined by prior purchases at the store, with drugstore coupons being particularly effective among customers with high prior purchases and other food coupons among customers with low prior purchases. Our study provides a use case for the application of causal machine learning in business analytics to evaluate the causal impact of specific firm policies (like marketing campaigns) for decision support.
Submitted 22 June, 2022; v1 submitted 22 April, 2022;
originally announced April 2022.
-
Testing the identification of causal effects in observational data
Authors:
Martin Huber,
Jannis Kueck
Abstract:
This study demonstrates the existence of a testable condition for the identification of the causal effect of a treatment on an outcome in observational data, which relies on two sets of variables: observed covariates to be controlled for and a suspected instrument. Under a causal structure commonly found in empirical applications, the testable conditional independence of the suspected instrument and the outcome given the treatment and the covariates has two implications. First, the instrument is valid, i.e. it does not directly affect the outcome (other than through the treatment) and is unconfounded conditional on the covariates. Second, the treatment is unconfounded conditional on the covariates such that the treatment effect is identified. We suggest tests of this conditional independence based on machine learning methods that account for covariates in a data-driven way and investigate their asymptotic behavior and finite sample performance in a simulation study. We also apply our testing approach to evaluating the impact of fertility on female labor supply when using the sibling sex ratio of the first two children as the supposed instrument, which by and large points to a violation of our testable implication for the moderate set of socio-economic covariates considered.
Submitted 11 June, 2023; v1 submitted 29 March, 2022;
originally announced March 2022.
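The testable condition in the abstract above (the suspected instrument is independent of the outcome given the treatment and the covariates) can be illustrated with a deliberately simplified linear stand-in. The Python sketch below regresses the outcome on the treatment, a covariate, and the instrument, then inspects the t-statistic on the instrument. It is not the machine learning-based test proposed in the paper; the linear data-generating process, the function names `z_tstat` and `simulate`, and all coefficients are assumptions chosen for illustration.

```python
import numpy as np

def z_tstat(y, d, x, z):
    """t-statistic on z in an OLS regression of y on (1, d, x, z).

    Simplified linear check of conditional independence of z and y given (d, x).
    """
    design = np.column_stack([np.ones_like(y), d, x, z])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    dof = len(y) - design.shape[1]
    sigma2 = resid @ resid / dof  # homoskedastic error variance estimate
    cov = sigma2 * np.linalg.inv(design.T @ design)
    return beta[-1] / np.sqrt(cov[-1, -1])

def simulate(n, confounded, seed):
    """Hypothetical linear model; u is an unobserved cause of the outcome."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)  # observed covariate
    z = rng.normal(size=n)  # suspected instrument
    u = rng.normal(size=n)  # unobserved; confounds d and y only if `confounded`
    d = 0.8 * z + 0.5 * x + (1.0 if confounded else 0.0) * u + rng.normal(size=n)
    y = 1.0 * d + 0.5 * x + u + rng.normal(size=n)
    return y, d, x, z

t_valid = z_tstat(*simulate(5000, confounded=False, seed=1))
t_confounded = z_tstat(*simulate(5000, confounded=True, seed=1))
```

In the confounded design the treatment is a collider between the instrument and the unobserved confounder, so conditioning on the treatment induces an association between instrument and outcome. In this simulation that should show up as a large absolute t-statistic, while the statistic should stay small in the unconfounded design.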
-
From homemakers to breadwinners? How mandatory kindergarten affects maternal labour market outcomes
Authors:
Selina Gangl,
Martin Huber
Abstract:
We analyse the effect of mandatory kindergarten attendance for four-year-old children on maternal labour market outcomes in Switzerland. To determine the causal effect of this policy, we combine two different datasets and quasi-experiments in this paper. First, we investigate a large administrative dataset and apply a non-parametric regression discontinuity design (RDD) to evaluate the effect of the reform at the birthday cut-off for entering kindergarten in the same versus the following year. Second, we complement this analysis by exploiting spatial variation and staggered treatment implementation of the reform across cantons (administrative units in Switzerland) in a difference-in-differences (DiD) approach based on a household survey. All in all, the results suggest that, if anything, mandatory kindergarten improves the labour market outcomes of mothers very moderately. The effects are driven by previously non-employed mothers and by older rather than younger mothers.
Submitted 8 March, 2022; v1 submitted 29 November, 2021;
originally announced November 2021.
-
Testing Monotonicity of Mean Potential Outcomes in a Continuous Treatment with High-Dimensional Data
Authors:
Yu-Chin Hsu,
Martin Huber,
Ying-Ying Lee,
Chu-An Liu
Abstract:
While most treatment evaluations focus on binary interventions, a growing literature also considers continuously distributed treatments. We propose a Cramér-von Mises-type test for testing whether the mean potential outcome given a specific treatment has a weakly monotonic relationship with the treatment dose under a weak unconfoundedness assumption. In a nonseparable structural model, applying our method amounts to testing monotonicity of the average structural function in the continuous treatment of interest. To flexibly control for a possibly high-dimensional set of covariates in our testing approach, we propose a double debiased machine learning estimator that accounts for covariates in a data-driven way. We show that the proposed test controls asymptotic size and is consistent against any fixed alternative. These theoretical findings are supported by Monte Carlo simulations. As an empirical illustration, we apply our test to the Job Corps study and reject a weakly negative relationship between the treatment (hours in academic and vocational training) and labor market performance among relatively low treatment values.
Submitted 27 August, 2022; v1 submitted 8 June, 2021;
originally announced June 2021.
-
How residence permits affect the labor market attachment of foreign workers: Evidence from a migration lottery in Liechtenstein
Authors:
Berno Buechel,
Selina Gangl,
Martin Huber
Abstract:
We analyze the impact of obtaining a residence permit on foreign workers' labor market and residential attachment. To overcome the usually severe selection issues, we exploit a unique migration lottery that randomly assigns access to otherwise restricted residence permits in Liechtenstein (situated between Austria and Switzerland). Using an instrumental variable approach, our results show that lottery compliers (whose migration behavior complies with the assignment in their first lottery) raise their employment probability in Liechtenstein by on average 24 percentage points across outcome periods (2008 to 2018) as a result of receiving a permit. Relatedly, their activity level and employment duration in Liechtenstein increase by on average 20 percentage points and 1.15 years, respectively, over the outcome window. These substantial and statistically significant effects are mainly driven by individuals not (yet) working in Liechtenstein prior to the lottery rather than by previous cross-border commuters. Moreover, we find both the labor market and residential effects to be persistent even several years after the lottery with no sign of fading out. These results suggest that granting residence permits to foreign workers can be effective in fostering labor supply even beyond the effect of cross-border commuting from adjacent regions.
Submitted 25 May, 2021;
originally announced May 2021.
-
Business analytics meets artificial intelligence: Assessing the demand effects of discounts on Swiss train tickets
Authors:
Martin Huber,
Jonas Meier,
Hannes Wallimann
Abstract:
We assess the demand effects of discounts on train tickets issued by the Swiss Federal Railways, the so-called 'supersaver tickets', based on machine learning, a subfield of artificial intelligence. Considering a survey-based sample of buyers of supersaver tickets, we investigate which customer- or trip-related characteristics (including the discount rate) predict buying behavior, namely: booking a trip otherwise not realized by train, buying a first- rather than second-class ticket, or rescheduling a trip (e.g. away from rush hours) when being offered a supersaver ticket. Predictive machine learning suggests that customers' age, demand-related information for a specific connection (like departure time and utilization), and the discount level permit forecasting buying behavior to a certain extent. Furthermore, we use causal machine learning to assess the impact of the discount rate on rescheduling a trip, which seems relevant in the light of capacity constraints at rush hours. Assuming that (i) the discount rate is quasi-random conditional on our rich set of characteristics and (ii) the buying decision increases weakly monotonically in the discount rate, we identify the discount rate's effect among 'always buyers', who would have traveled even without a discount, based on our survey that asks about customer behavior in the absence of discounts. We find that on average, increasing the discount rate by one percentage point increases the share of rescheduled trips by 0.16 percentage points among always buyers. Investigating effect heterogeneity across observables suggests that the effects are higher for leisure travelers and during peak hours when controlling for several other characteristics.
Submitted 30 June, 2022; v1 submitted 4 May, 2021;
originally announced May 2021.
-
The fiscal response to revenue shocks
Authors:
Simon Berset,
Martin Huber,
Mark Schelker
Abstract:
We study the impact of fiscal revenue shocks on local fiscal policy. We focus on the very volatile revenues from the immovable property gains tax in the canton of Zurich, Switzerland, and analyze fiscal behavior following large and rare positive and negative revenue shocks. We apply causal machine learning strategies and implement the post-double-selection LASSO estimator to identify the causal effect of revenue shocks on public finances. We show that local policymakers overall predominantly smooth fiscal shocks. However, we also find some patterns consistent with fiscal conservatism, where positive shocks are smoothed, while negative ones are mitigated by spending cuts.
Submitted 19 January, 2021;
originally announced January 2021.
-
Assessing the effects of seasonal tariff-rate quotas on vegetable prices in Switzerland
Authors:
Daria Loginova,
Marco Portmann,
Martin Huber
Abstract:
Causal estimation of the short-term effects of tariff-rate quotas (TRQs) on vegetable producer prices is hampered by the large variety and different growing seasons of vegetables and is therefore rarely performed. We quantify the effects of Swiss seasonal TRQs on domestic producer prices of a variety of vegetables based on a difference-in-differences estimation using a novel dataset of weekly producer prices for Switzerland and neighbouring countries. We find that TRQs increase prices of most vegetables by more than 20% above the prices in neighbouring countries during the main harvest time for most vegetables and even more than 50% for some vegetables. The effects are stronger for more perishable vegetables and for conventionally produced ones compared with organic vegetables. However, we do not find clear-cut effects of TRQs on the week-to-week price volatility of vegetables although the overall lower price volatility in Switzerland compared with neighbouring countries might be a result of the TRQ system in place.
Submitted 5 December, 2020;
originally announced December 2020.
-
Double machine learning for sample selection models
Authors:
Michela Bia,
Martin Huber,
Lukáš Lafférs
Abstract:
This paper considers the evaluation of discretely distributed treatments when outcomes are only observed for a subpopulation due to sample selection or outcome attrition. For identification, we combine a selection-on-observables assumption for treatment assignment with either selection-on-observables or instrumental variable assumptions concerning the outcome attrition/sample selection process. We also consider dynamic confounding, meaning that covariates that jointly affect sample selection and the outcome may (at least partly) be influenced by the treatment. To control in a data-driven way for a potentially high dimensional set of pre- and/or post-treatment covariates, we adapt the double machine learning framework for treatment evaluation to sample selection problems. We make use of (a) Neyman-orthogonal, doubly robust, and efficient score functions, which imply the robustness of treatment effect estimation to moderate regularization biases in the machine learning-based estimation of the outcome, treatment, or sample selection models and (b) sample splitting (or cross-fitting) to prevent overfitting bias. We demonstrate that the proposed estimators are asymptotically normal and root-n consistent under specific regularity conditions concerning the machine learners and investigate their finite sample properties in a simulation study. We also apply our proposed methodology to the Job Corps data for evaluating the effect of training on hourly wages which are only observed conditional on employment. The estimator is available in the causalweight package for the statistical software R.
Submitted 15 July, 2021; v1 submitted 30 November, 2020;
originally announced December 2020.
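The combination of a doubly robust score with cross-fitting described in the abstract above can be sketched in a few lines. This is a simplified illustration, not the paper's estimator: it targets a plain average treatment effect under selection on observables, ignores sample selection entirely, and replaces the machine learners with cell means over a single discrete covariate. All function names and the toy data are hypothetical.

```python
from collections import defaultdict


def fit_cell_models(rows):
    """Estimate nuisance functions by cell means of a discrete covariate x.

    Stand-in for the machine learners: returns P(D=1 | x) and E[Y | x, d].
    """
    d_by_x = defaultdict(list)
    y_by_xd = defaultdict(list)
    for x, d, y in rows:
        d_by_x[x].append(d)
        y_by_xd[(x, d)].append(y)
    propensity = {x: sum(ds) / len(ds) for x, ds in d_by_x.items()}
    outcome = {key: sum(ys) / len(ys) for key, ys in y_by_xd.items()}
    return propensity, outcome


def aipw_score(x, d, y, propensity, outcome):
    """Doubly robust (augmented inverse probability weighting) score."""
    p = propensity[x]
    mu1, mu0 = outcome[(x, 1)], outcome[(x, 0)]
    return mu1 - mu0 + d * (y - mu1) / p - (1 - d) * (y - mu0) / (1 - p)


def cross_fitted_ate(rows, n_folds=3):
    """Fit nuisances on the other folds, evaluate the score on the held-out fold."""
    folds = [rows[k::n_folds] for k in range(n_folds)]
    scores = []
    for k, held_out in enumerate(folds):
        training = [r for j, fold in enumerate(folds) if j != k for r in fold]
        propensity, outcome = fit_cell_models(training)
        scores.extend(aipw_score(x, d, y, propensity, outcome) for x, d, y in held_out)
    return sum(scores) / len(scores)


# Noise-free toy data (invented): y = 2*d + x, so the true effect is 2.
toy_rows = [(x, d, 2.0 * d + x) for x in (0, 1) for d in (0, 1)] * 3
ate = cross_fitted_ate(toy_rows)
print(f"Cross-fitted ATE estimate: {ate:.2f}")
```

The key design point is that each observation's score uses nuisance estimates fitted on data that excludes it, which is what guards against overfitting bias. For the estimator actually proposed in the paper, the abstract points to the causalweight package for R.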
-
Evaluating (weighted) dynamic treatment effects by double machine learning
Authors:
Hugo Bodory,
Martin Huber,
Lukáš Lafférs
Abstract:
We consider evaluating the causal effects of dynamic treatments, i.e. of multiple treatment sequences in various periods, based on double machine learning to control for observed, time-varying covariates in a data-driven way under a selection-on-observables assumption. To this end, we make use of so-called Neyman-orthogonal score functions, which imply the robustness of treatment effect estimation to moderate (local) misspecifications of the dynamic outcome and treatment models. This robustness property permits approximating outcome and treatment models by double machine learning even under high dimensional covariates and is combined with data splitting to prevent overfitting. In addition to effect estimation for the total population, we consider weighted estimation that permits assessing dynamic treatment effects in specific subgroups, e.g. among those treated in the first treatment period. We demonstrate that the estimators are asymptotically normal and $\sqrt{n}$-consistent under specific regularity conditions and investigate their finite sample properties in a simulation study. Finally, we apply the methods to the Job Corps study in order to assess different sequences of training programs under a large set of covariates.
Submitted 19 June, 2021; v1 submitted 1 December, 2020;
originally announced December 2020.
-
On the plausibility of the latent ignorability assumption
Authors:
Martin Huber
Abstract:
The estimation of the causal effect of an endogenous treatment based on an instrumental variable (IV) is often complicated by attrition, sample selection, or non-response in the outcome of interest. To tackle the latter problem, the latent ignorability (LI) assumption imposes that attrition/sample selection is independent of the outcome conditional on the treatment compliance type (i.e. how the treatment behaves as a function of the instrument), the instrument, and possibly further observed covariates. As a word of caution, this note formally discusses the strong behavioral implications of LI in rather standard IV models. We also provide an empirical illustration based on the Job Corps experimental study, in which the sensitivity of the estimated program effect to LI and alternative assumptions about outcome attrition is investigated.
Submitted 3 June, 2020; v1 submitted 2 June, 2020;
originally announced June 2020.
-
A Machine Learning Approach for Flagging Incomplete Bid-rigging Cartels
Authors:
Hannes Wallimann,
David Imhof,
Martin Huber
Abstract:
We propose a new method for flagging bid rigging, which is particularly useful for detecting incomplete bid-rigging cartels. Our approach combines screens, i.e. statistics derived from the distribution of bids in a tender, with machine learning to predict the probability of collusion. As a methodological innovation, we calculate such screens for all possible subgroups of three or four bids within a tender and use summary statistics like the mean, median, maximum, and minimum of each screen as predictors in the machine learning algorithm. This approach tackles the issue that competitive bids in incomplete cartels distort the statistical signals produced by bid rigging. We demonstrate that our algorithm outperforms previously suggested methods in applications to incomplete cartels based on empirical data from Switzerland.
Submitted 12 April, 2020;
originally announced April 2020.
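The subgroup idea in the abstract above (compute a screen for every subgroup of bids within a tender, then summarize) can be sketched as follows. The abstract does not name the screens used, so the coefficient of variation here is an assumed example of a screen, and the function names and bid values are hypothetical.

```python
from itertools import combinations
from statistics import mean, median, stdev


def coefficient_of_variation(bids):
    """Example screen (an assumption): sample standard deviation over mean."""
    return stdev(bids) / mean(bids)


def subgroup_screen_summary(bids, size=3, screen=coefficient_of_variation):
    """Evaluate a screen on every subgroup of `size` bids and summarize it."""
    values = [screen(group) for group in combinations(bids, size)]
    return {
        "mean": mean(values),
        "median": median(values),
        "max": max(values),
        "min": min(values),
        "n_subgroups": len(values),
    }


# Invented tender: three tightly clustered bids plus one outlying bid.
tender = [100.0, 101.0, 102.0, 130.0]
summary = subgroup_screen_summary(tender)
for name, value in summary.items():
    print(name, round(value, 4))
```

In this toy tender the minimum comes from the one subgroup that excludes the outlying bid, which illustrates why subgroup-level statistics can retain a signal that a tender-level screen would blur. The resulting summary statistics would then serve as predictors in a classifier, a step this sketch leaves out.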
-
Gender Differences in Wage Expectations
Authors:
Ana Fernandes,
Martin Huber,
Giannina Vaccaro
Abstract:
Using a survey on wage expectations among students at two Swiss institutions of higher education, we examine the wage expectations of our respondents along two main lines. First, we investigate the rationality of wage expectations by comparing average expected wages from our sample with those of similar graduates; we further examine how our respondents revise their expectations when provided information about actual wages. Second, using causal mediation analysis, we test whether the consideration of a rich set of personal and professional controls, namely concerning family formation and children in addition to professional preferences, accounts for the difference in wage expectations across genders. We find that males and females overestimate their wages compared to actual ones, and that males respond in an overconfident manner to information about outside wages. Despite the attenuation of the gender difference in wage expectations brought about by the comprehensive set of controls, gender generally retains a significant direct, unexplained effect on wage expectations.
Submitted 25 March, 2020;
originally announced March 2020.
-
Causal mediation analysis with double machine learning
Authors:
Helmut Farbmacher,
Martin Huber,
Lukáš Lafférs,
Henrika Langen,
Martin Spindler
Abstract:
This paper combines causal mediation analysis with double machine learning to control for observed confounders in a data-driven way under a selection-on-observables assumption in a high-dimensional setting. We consider the average indirect effect of a binary treatment operating through an intermediate variable (or mediator) on the causal path between the treatment and the outcome, as well as the unmediated direct effect. Estimation is based on efficient score functions, which possess a multiple robustness property w.r.t. misspecifications of the outcome, mediator, and treatment models. This property is key for selecting these models by double machine learning, which is combined with data splitting to prevent overfitting in the estimation of the effects of interest. We demonstrate that the direct and indirect effect estimators are asymptotically normal and root-n consistent under specific regularity conditions and investigate the finite sample properties of the suggested methods in a simulation study when considering lasso as the machine learner. We also provide an empirical application to the U.S. National Longitudinal Survey of Youth, assessing the indirect effect of health insurance coverage on general health operating via routine checkups as the mediator, as well as the direct effect. We find a moderate short-term effect of health insurance coverage on general health which is, however, not mediated by routine checkups.
Submitted 16 February, 2021; v1 submitted 28 February, 2020;
originally announced February 2020.
-
Bounds on direct and indirect effects under treatment/mediator endogeneity and outcome attrition
Authors:
Martin Huber,
Lukáš Lafférs
Abstract:
Causal mediation analysis aims at disentangling a treatment effect into an indirect mechanism operating through an intermediate outcome or mediator, as well as the direct effect of the treatment on the outcome of interest. However, the evaluation of direct and indirect effects is frequently complicated by non-ignorable selection into the treatment and/or mediator, even after controlling for observables, as well as sample selection/outcome attrition. We propose a method for bounding direct and indirect effects in the presence of such complications that is based on a sequence of linear programming problems. Considering inverse probability weighting by propensity scores, we compute the weights that would yield identification in the absence of complications and perturb them by an entropy parameter reflecting a specific amount of propensity score misspecification to set-identify the effects of interest. We apply our method to data from the National Longitudinal Survey of Youth 1979 to derive bounds on the explained and unexplained components of a gender wage gap decomposition that is likely prone to non-ignorable mediator selection and outcome attrition.
Submitted 3 May, 2020; v1 submitted 12 February, 2020;
originally announced February 2020.
-
An introduction to flexible methods for policy evaluation
Authors:
Martin Huber
Abstract:
This chapter covers different approaches to policy evaluation for assessing the causal effect of a treatment or intervention on an outcome of interest. As an introduction to causal inference, the discussion starts with the experimental evaluation of a randomized treatment. It then reviews evaluation methods based on selection on observables (assuming a quasi-random treatment given observed covariates), instrumental variables (inducing a quasi-random shift in the treatment), difference-in-differences and changes-in-changes (exploiting changes in outcomes over time), as well as regression discontinuities and kinks (using changes in the treatment assignment at some threshold of a running variable). The chapter discusses methods particularly suited for data with many observations for a flexible (i.e. semi- or nonparametric) modeling of treatment effects, and/or many (i.e. high dimensional) observed covariates by applying machine learning to select and control for covariates in a data-driven way. This is not only useful for tackling confounding by controlling for instance for factors jointly affecting the treatment and the outcome, but also for learning effect heterogeneities across subgroups defined upon observable covariates and optimally targeting those groups for which the treatment is most effective.
Submitted 1 October, 2019;
originally announced October 2019.
-
Direct and Indirect Effects based on Changes-in-Changes
Authors:
Martin Huber,
Mark Schelker,
Anthony Strittmatter
Abstract:
We propose a novel approach for causal mediation analysis based on changes-in-changes assumptions restricting unobserved heterogeneity over time. This allows disentangling the causal effect of a binary treatment on a continuous outcome into an indirect effect operating through a binary intermediate variable (called mediator) and a direct effect running via other causal mechanisms. We identify average and quantile direct and indirect effects for various subgroups under the condition that the outcome is monotonic in the unobserved heterogeneity and that the distribution of the latter does not change over time conditional on the treatment and the mediator. We also provide a simulation study and an empirical application to the Jobs II programme.
Submitted 22 October, 2019; v1 submitted 11 September, 2019;
originally announced September 2019.