Showing 1–17 of 17 results for author: Rothenhäusler, D

  1. arXiv:2404.18370  [pdf, other]

    stat.ME

    Out-of-distribution generalization under random, dense distributional shifts

    Authors: Yujin Jeong, Dominik Rothenhäusler

    Abstract: Many existing approaches for estimating parameters in settings with distributional shifts operate under an invariance assumption. For example, under covariate shift, it is assumed that p(y|x) remains invariant. We refer to such distribution shifts as sparse, since they may be substantial but affect only a part of the data generating system. In contrast, in various real-world settings, shifts might…

    Submitted 28 April, 2024; originally announced April 2024.

  2. arXiv:2309.01056  [pdf, other]

    stat.AP stat.ME

    Diagnosing the role of observable distribution shift in scientific replications

    Authors: Ying Jin, Kevin Guo, Dominik Rothenhäusler

    Abstract: Many researchers have identified distribution shift as a likely contributor to the reproducibility crisis in behavioral and biomedical sciences. The idea is that if treatment effects vary across individual characteristics and experimental contexts, then studies conducted in different populations will estimate different average effects. This paper uses "generalizability" methods to quantify how mu…

    Submitted 2 September, 2023; originally announced September 2023.

  3. arXiv:2306.02948  [pdf, other]

    stat.ML cs.LG stat.AP stat.ME

    Learning under random distributional shifts

    Authors: Kirk Bansak, Elisabeth Paulson, Dominik Rothenhäusler

    Abstract: Many existing approaches for generating predictions in settings with distribution shift model distribution shifts as adversarial or low-rank in suitable representations. In various real-world settings, however, we might expect shifts to arise through the superposition of many small and random changes in the population and environment. Thus, we consider a class of random distribution shift models t…

    Submitted 30 October, 2023; v1 submitted 5 June, 2023; originally announced June 2023.

  4. arXiv:2211.10032  [pdf, other]

    stat.ME

    Modular Regression: Improving Linear Models by Incorporating Auxiliary Data

    Authors: Ying Jin, Dominik Rothenhäusler

    Abstract: This paper develops a new framework, called modular regression, to utilize auxiliary information -- such as variables other than the original features or additional data sets -- in the training process of linear models. At a high level, our method follows the routine: (i) decomposing the regression task into several sub-tasks, (ii) fitting the sub-task models, and (iii) using the sub-task models t…

    Submitted 23 November, 2023; v1 submitted 18 November, 2022; originally announced November 2022.

    Comments: Journal of Machine Learning Research

  5. arXiv:2209.09352  [pdf, other]

    stat.ME

    Distributionally robust and generalizable inference

    Authors: Dominik Rothenhäusler, Peter Bühlmann

    Abstract: We discuss recently developed methods that quantify the stability and generalizability of statistical findings under distributional changes. In many practical problems, the data is not drawn i.i.d. from the target population. For example, unobserved sampling bias, batch effects, or unknown associations might inflate the variance compared to i.i.d. sampling. For reliable statistical inference, it i…

    Submitted 3 October, 2023; v1 submitted 19 September, 2022; originally announced September 2022.

  6. arXiv:2204.13193  [pdf, other]

    stat.ME

    On the statistical role of inexact matching in observational studies

    Authors: Kevin Guo, Dominik Rothenhäusler

    Abstract: In observational causal inference, exact covariate matching plays two statistical roles: (i) it effectively controls for bias due to measured confounding; (ii) it justifies assumption-free inference based on randomization tests. This paper shows that inexact covariate matching does not always play these same roles. We find that inexact matching often leaves behind statistically meaningful bias and…

    Submitted 30 November, 2022; v1 submitted 27 April, 2022; originally announced April 2022.

    Comments: To appear in Biometrika

  7. arXiv:2202.11886  [pdf, other]

    stat.ME

    Calibrated inference: statistical inference that accounts for both sampling uncertainty and distributional uncertainty

    Authors: Yujin Jeong, Dominik Rothenhäusler

    Abstract: How can we draw trustworthy scientific conclusions? One criterion is that a study can be replicated by independent teams. While replication is critically important, it is arguably insufficient. If a study is biased for some reason and other studies recapitulate the approach then findings might be consistently incorrect. It has been argued that trustworthy scientific conclusions require disparate s…

    Submitted 10 January, 2023; v1 submitted 23 February, 2022; originally announced February 2022.

  8. arXiv:2106.03024  [pdf, other]

    stat.ME

    Causal aggregation: estimation and inference of causal effects by constraint-based data fusion

    Authors: Jaime Roquero Gimenez, Dominik Rothenhäusler

    Abstract: In causal inference, it is common to estimate the causal effect of a single treatment variable on an outcome. However, practitioners may also be interested in the effect of simultaneous interventions on multiple covariates of a fixed target variable. We propose a novel method that allows to estimate the effect of joint interventions using data from different experiments in which only very few vari…

    Submitted 22 November, 2022; v1 submitted 6 June, 2021; originally announced June 2021.

  9. arXiv:2105.03067  [pdf, other]

    stat.ME math.ST stat.ML

    The $s$-value: evaluating stability with respect to distributional shifts

    Authors: Suyash Gupta, Dominik Rothenhäusler

    Abstract: Common statistical measures of uncertainty such as $p$-values and confidence intervals quantify the uncertainty due to sampling, that is, the uncertainty due to not observing the full population. However, sampling is not the only source of uncertainty. In practice, distributions change between locations and across time. This makes it difficult to gather knowledge that transfers across data sets. W…

    Submitted 13 March, 2022; v1 submitted 7 May, 2021; originally announced May 2021.

    Comments: 43 pages, 9 figures

  10. arXiv:2104.04565  [pdf, other]

    stat.ME

    Tailored inference for finite populations: conditional validity and transfer across distributions

    Authors: Ying Jin, Dominik Rothenhäusler

    Abstract: Parameters of sub-populations can be more relevant than super-population ones. For example, a healthcare provider may be interested in the effect of a treatment plan for a specific subset of their patients; policymakers may be concerned with the impact of a policy in a particular state within a given population. In these cases, the focus is on a specific finite population, as opposed to an infinit…

    Submitted 20 March, 2023; v1 submitted 9 April, 2021; originally announced April 2021.

    Comments: To appear at Biometrika

  11. arXiv:2008.12892  [pdf, other]

    stat.ME

    Model selection for estimation of causal parameters

    Authors: Dominik Rothenhäusler

    Abstract: A popular technique for selecting and tuning machine learning estimators is cross-validation. Cross-validation evaluates overall model fit, usually in terms of predictive accuracy. In causal inference, the optimal choice of estimator depends not only on the fitted models but also on assumptions the statistician is willing to make. In this case, the performance of different (potentially biased) est…

    Submitted 6 July, 2021; v1 submitted 28 August, 2020; originally announced August 2020.

  12. arXiv:1907.13258  [pdf, other]

    stat.ME

    Incremental causal effects

    Authors: Dominik Rothenhäusler, Bin Yu

    Abstract: Causal evidence is needed to act and it is often enough for the evidence to point towards a direction of the effect of an action. For example, policymakers might be interested in estimating the effect of slightly increasing taxes on private spending across the whole population. We study identifiability and estimation of causal effects, where a continuous treatment is slightly shifted across the wh…

    Submitted 7 August, 2020; v1 submitted 30 July, 2019; originally announced July 2019.

  13. arXiv:1801.06229  [pdf, other]

    stat.ME

    Anchor regression: heterogeneous data meets causality

    Authors: Dominik Rothenhäusler, Nicolai Meinshausen, Peter Bühlmann, Jonas Peters

    Abstract: We consider the problem of predicting a response variable from a set of covariates on a data set that differs in distribution from the training data. Causal parameters are optimal in terms of predictive accuracy if in the new distribution either many variables are affected by interventions or only some variables are affected, but the perturbations are strong. If the training and test distributions…

    Submitted 8 May, 2020; v1 submitted 18 January, 2018; originally announced January 2018.

  14. arXiv:1706.06159  [pdf, other]

    stat.ME

    Causal Dantzig: fast inference in linear structural equation models with hidden variables under additive interventions

    Authors: Dominik Rothenhäusler, Peter Bühlmann, Nicolai Meinshausen

    Abstract: Causal inference is known to be very challenging when only observational data are available. Randomized experiments are often costly and impractical and in instrumental variable regression the number of instruments has to exceed the number of causal predictors. It was recently shown in Peters et al. [2016] that causal inference for the full model is possible when data from distinct observational e…

    Submitted 15 June, 2018; v1 submitted 19 June, 2017; originally announced June 2017.

  15. arXiv:1607.05980  [pdf, other]

    math.ST

    Causal inference in partially linear structural equation models

    Authors: Dominik Rothenhäusler, Jan Ernest, Peter Bühlmann

    Abstract: We consider identifiability of partially linear additive structural equation models with Gaussian noise (PLSEMs) and estimation of distributionally equivalent models to a given PLSEM. Thereby, we also include robustness results for errors in the neighborhood of Gaussian distributions. Existing identifiability results in the framework of additive SEMs with Gaussian noise are limited to linear and n…

    Submitted 14 December, 2017; v1 submitted 20 July, 2016; originally announced July 2016.

    Comments: D.R. and J.E. contributed equally to this work

    MSC Class: 62G99; 62H99; 68T99

  16. arXiv:1506.02494  [pdf, other]

    stat.ME stat.ML

    backShift: Learning causal cyclic graphs from unknown shift interventions

    Authors: Dominik Rothenhäusler, Christina Heinze, Jonas Peters, Nicolai Meinshausen

    Abstract: We propose a simple method to learn linear causal cyclic models in the presence of latent variables. The method relies on equilibrium data of the model recorded under a specific kind of interventions ("shift interventions"). The location and strength of these interventions do not have to be known and can be estimated from the data. Our method, called backShift, only uses second moments of the data…

    Submitted 18 November, 2015; v1 submitted 8 June, 2015; originally announced June 2015.

    Journal ref: Advances in Neural Information Processing Systems 28 (2015) 1513-1521

  17. arXiv:1502.07963  [pdf, other]

    stat.ME

    Confidence Intervals for Maximin Effects in Inhomogeneous Large-Scale Data

    Authors: Dominik Rothenhäusler, Nicolai Meinshausen, Peter Bühlmann

    Abstract: One challenge of large-scale data analysis is that the assumption of an identical distribution for all samples is often not realistic. An optimal linear regression might, for example, be markedly different for distinct groups of the data. Maximin effects have been proposed as a computationally attractive way to estimate effects that are common across all data without fitting a mixture distribution…

    Submitted 27 February, 2015; originally announced February 2015.