Statistics > Methodology
[Submitted on 18 Jan 2018 (v1), revised 15 Jun 2018 (this version, v2), latest version 8 May 2020 (v4)]
Title: Anchor regression: heterogeneous data meets causality
Abstract: Many traditional statistical prediction methods mainly deal with the problem of overfitting to the given data set. On the other hand, there is a vast literature on the estimation of causal parameters for prediction under interventions. However, both types of estimators can perform poorly when used for prediction on heterogeneous data. We discuss the delicate trade-off between predictive performance on the training distribution and perturbed distributions. In particular, under a linear structural equation model with exogenous variables, we show that the change in loss under certain perturbations (interventions) can be written as a convex penalty. This motivates anchor regression, a regularization scheme that encourages the estimator to generalize well to perturbed data. The procedure naturally provides an interpolation between the solution to ordinary least squares and two-stage least squares, but also has predictive guarantees if the instrumental variables assumptions are violated. An additional characterization of the procedure is given in terms of quantiles: If the data follow a Gaussian distribution, the method minimizes quantiles of the conditional mean squared error. We derive guarantees of the proposed procedure for predictive performance under perturbations for the population case and for high-dimensional data and test its performance on real-world data.
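The abstract does not spell out the estimator, so the following is only a minimal sketch under stated assumptions. It assumes the anchor regression objective takes the form ||(I - P_A)(Y - Xb)||^2 + gamma * ||P_A(Y - Xb)||^2, where P_A projects onto the column space of the exogenous (anchor) variables A, and that all variables are centered. Under that assumption the minimizer can be obtained by ordinary least squares on transformed data. The function name anchor_regression, the parameter gamma, and the toy data are illustrative choices, not taken from the listing.

```python
import numpy as np


def anchor_regression(X, Y, A, gamma):
    """Sketch: minimize ||(I - P_A)(Y - X b)||^2 + gamma * ||P_A (Y - X b)||^2.

    Assumes X, Y, A are centered. P_A is the projection onto the columns of A.
    """
    def project(V):
        # P_A V, computed via least squares rather than an explicit n x n matrix
        coef = np.linalg.lstsq(A, V, rcond=None)[0]
        return A @ coef

    s = np.sqrt(gamma)
    # W = I - (1 - sqrt(gamma)) P_A, so ||W r||^2 equals the objective above
    X_t = X - (1.0 - s) * project(X)
    Y_t = Y - (1.0 - s) * project(Y)
    return np.linalg.lstsq(X_t, Y_t, rcond=None)[0]


# Hypothetical toy data: anchor A shifts X, hidden H confounds X and Y.
rng = np.random.default_rng(0)
n = 2000
A = rng.normal(size=(n, 1))
H = rng.normal(size=n)
X = (A[:, 0] + H + rng.normal(size=n)).reshape(-1, 1)
Y = 2.0 * X[:, 0] + H + rng.normal(size=n)

# Center everything, as the sketch assumes
A = A - A.mean(axis=0)
X = X - X.mean(axis=0)
Y = Y - Y.mean()

for gamma in (0.0, 1.0, 10.0, 1e6):
    print(gamma, anchor_regression(X, Y, A, gamma))
```

With this form of the objective, gamma = 1 reproduces ordinary least squares, and letting gamma grow moves the estimate toward two-stage least squares with A as the instrument, which matches the interpolation the abstract describes.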
Submission history
From: Dominik Rothenhäusler
[v1] Thu, 18 Jan 2018 20:32:09 UTC (251 KB)
[v2] Fri, 15 Jun 2018 15:05:09 UTC (407 KB)
[v3] Thu, 13 Jun 2019 22:14:23 UTC (783 KB)
[v4] Fri, 8 May 2020 18:50:04 UTC (1,465 KB)