Search | arXiv e-print repository

Counterfactual-based Root Cause Analysis for Dynamical Systems

Authors: Juliane Weilbach, Sebastian Gerwinn, Karim Barsim, Martin Fränzle

Abstract: Identifying the underlying reason for a failing dynamic process or otherwise anomalous observation is a fundamental challenge, yet has numerous industrial applications. To identify the failure-causing sub-system using causal inference, one can ask the question: "Would the observed failure also occur, if we had replaced the behaviour of a sub-system at a certain point in time with its normal behaviour?" To this end, a formal description of behaviour of the full system is needed in which such counterfactual questions can be answered. However, existing causal methods for root cause identification are typically limited to static settings and focus on additive external influences causing failures rather than structural influences. In this paper, we address these problems by modelling the dynamic causal system using a Residual Neural Network and deriving corresponding counterfactual distributions over trajectories. We show quantitatively that more root causes are identified when an intervention is performed on the structural equation and the external influence, compared to an intervention on the external influence only. By employing an efficient approximation to a corresponding Shapley value, we also obtain a ranking between the different subsystems at different points in time being responsible for an observed failure, which is applicable in settings with a large number of variables. We illustrate the effectiveness of the proposed method on a benchmark dynamic system as well as on a real world river dataset.

Submitted 12 June, 2024; originally announced June 2024.

arXiv:2309.08332

Estimation of Counterfactual Interventions under Uncertainties

Authors: Juliane Weilbach, Sebastian Gerwinn, Melih Kandemir, Martin Fraenzle

Abstract: Counterfactual analysis is intuitively performed by humans on a daily basis, e.g., "What should I have done differently to get the loan approved?" Such counterfactual questions also steer the formulation of scientific hypotheses. More formally, it provides insights about potential improvements of a system by inferring the effects of hypothetical interventions into a past observation of the system's behaviour, which plays a prominent role in a variety of industrial applications. Due to the hypothetical nature of such analysis, counterfactual distributions are inherently ambiguous. This ambiguity is particularly challenging in continuous settings in which a continuum of explanations exist for the same observation. In this paper, we address this problem by following a hierarchical Bayesian approach which explicitly models such uncertainty. In particular, we derive counterfactual distributions for a Bayesian Warped Gaussian Process thereby allowing for non-Gaussian distributions and non-additive noise. We illustrate the properties of our approach on a synthetic and on a semi-synthetic example and show its performance when used within an algorithmic recourse downstream task.

Submitted 15 September, 2023; originally announced September 2023.

arXiv:2210.12061

Validation of Composite Systems by Discrepancy Propagation

Authors: David Reeb, Kanil Patel, Karim Barsim, Martin Schiegg, Sebastian Gerwinn

Abstract: Assessing the validity of a real-world system with respect to given quality criteria is a common yet costly task in industrial applications due to the vast number of required real-world tests. Validating such systems by means of simulation offers a promising and less expensive alternative, but requires an assessment of the simulation accuracy and therefore end-to-end measurements. Additionally, covariate shifts between simulations and actual usage can cause difficulties for estimating the reliability of such systems. In this work, we present a validation method that propagates bounds on distributional discrepancy measures through a composite system, thereby allowing us to derive an upper bound on the failure probability of the real system from potentially inaccurate simulations. Each propagation step entails an optimization problem, where -- for measures such as maximum mean discrepancy (MMD) -- we develop tight convex relaxations based on semidefinite programs. We demonstrate that our propagation method yields valid and useful bounds for composite systems exhibiting a variety of realistic effects. In particular, we show that the proposed method can successfully account for data shifts within the experimental design as well as model inaccuracies within the simulation.

Submitted 3 January, 2024; v1 submitted 21 October, 2022; originally announced October 2022.

Comments: 21 pages incl. 11 pages appendix; camera-ready version at UAI 2023

Journal ref: Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence (UAI 2023), PMLR 216:1730-1740, 2023

arXiv:2107.07345

Inferring the Structure of Ordinary Differential Equations

Authors: Juliane Weilbach, Sebastian Gerwinn, Christian Weilbach, Melih Kandemir

Abstract: Understanding physical phenomena oftentimes means understanding the underlying dynamical system that governs observational measurements. While accurate prediction can be achieved with black box systems, they often lack interpretability and are less amenable for further expert investigation. Alternatively, the dynamics can be analysed via symbolic regression. In this paper, we extend the approach by Udrescu et al. (2020), called AIFeynman, to the dynamic setting to perform symbolic regression on ODE systems based on observations from the resulting trajectories. We compare this extension to state-of-the-art approaches for symbolic regression empirically on several dynamical systems for which the ground truth equations of increasing complexity are available. Although the proposed approach performs best on this benchmark, we observed difficulties of all the compared symbolic regression approaches on more complex systems, such as Cart-Pole.

Submitted 5 July, 2021; originally announced July 2021.

arXiv:2006.09914

Learning Partially Known Stochastic Dynamics with Empirical PAC Bayes

Authors: Manuel Haussmann, Sebastian Gerwinn, Andreas Look, Barbara Rakitsch, Melih Kandemir

Abstract: Neural Stochastic Differential Equations model a dynamical environment with neural nets assigned to their drift and diffusion terms. The high expressive power of their nonlinearity comes at the expense of instability in the identification of the large set of free parameters. This paper presents a recipe to improve the prediction accuracy of such models in three steps: i) accounting for epistemic uncertainty by assuming probabilistic weights, ii) incorporation of partial knowledge on the state dynamics, and iii) training the resultant hybrid model by an objective derived from a PAC-Bayesian generalization bound. We observe in our experiments that this recipe effectively translates partial and noisy prior knowledge into an improved model fit.

Submitted 26 February, 2021; v1 submitted 17 June, 2020; originally announced June 2020.

Comments: Accepted at AISTATS 2021

arXiv:1906.00816

Bayesian Evidential Deep Learning with PAC Regularization

Authors: Manuel Haussmann, Sebastian Gerwinn, Melih Kandemir

Abstract: We propose a novel method for closed-form predictive distribution modeling with neural nets. In quantifying prediction uncertainty, we build on Evidential Deep Learning, which has been impactful as being both simple to implement and giving closed-form access to predictive uncertainty. We employ it to model aleatoric uncertainty and extend it to account also for epistemic uncertainty by converting it to a Bayesian Neural Net. While extending its uncertainty quantification capabilities, we maintain its analytically accessible predictive distribution model by performing progressive moment matching for the first time for approximate weight marginalization. The eventual model introduces a prohibitively large number of hyperparameters for stable training. We overcome this drawback by deriving a vacuous PAC bound that comprises the marginal likelihood of the predictor and a complexity penalty. We observe on regression, classification, and out-of-domain detection benchmarks that our method improves model fit and uncertainty quantification.

Submitted 21 January, 2021; v1 submitted 3 June, 2019; originally announced June 2019.

Comments: Presented at AABI 2020

arXiv:1810.12263

Learning Gaussian Processes by Minimizing PAC-Bayesian Generalization Bounds

Authors: David Reeb, Andreas Doerr, Sebastian Gerwinn, Barbara Rakitsch

Abstract: Gaussian Processes (GPs) are a generic modelling tool for supervised learning. While they have been successfully applied on large datasets, their use in safety-critical applications is hindered by the lack of good performance guarantees. To this end, we propose a method to learn GPs and their sparse approximations by directly optimizing a PAC-Bayesian bound on their generalization performance, instead of maximizing the marginal likelihood. Besides its theoretical appeal, we find in our evaluation that our learning method is robust and yields significantly better generalization guarantees than other common GP approaches on several regression benchmark datasets.

Submitted 28 December, 2018; v1 submitted 29 October, 2018; originally announced October 2018.

Comments: 11 pages main text, 12 pages appendix. v2: minor changes, new NeurIPS style file. Final camera-ready version submitted to NeurIPS 2018

Journal ref: Advances in Neural Information Processing Systems 31 (Proceedings of the NeurIPS Conference 2018), https://papers.nips.cc/paper/7594-learning-gaussian-processes-by-minimizing-pac-bayesian-generalization-bounds

arXiv:1011.6086

In All Likelihood, Deep Belief Is Not Enough

Authors: Lucas Theis, Sebastian Gerwinn, Fabian Sinz, Matthias Bethge

Abstract: Statistical models of natural stimuli provide an important tool for researchers in the fields of machine learning and computational neuroscience. A canonical way to quantitatively assess and compare the performance of statistical models is given by the likelihood. One class of statistical models which has recently gained increasing popularity and has been applied to a variety of complex data are deep belief networks. Analyses of these models, however, have been typically limited to qualitative analyses based on samples due to the computationally intractable nature of the model likelihood. Motivated by these circumstances, the present article provides a consistent estimator for the likelihood that is both computationally tractable and simple to apply in practice. Using this estimator, a deep belief network which has been suggested for the modeling of natural image patches is quantitatively investigated and compared to other models of natural image patches. Contrary to earlier claims based on qualitative results, the results presented in this article provide evidence that the model under investigation is not a particularly good model for natural images.

Submitted 28 November, 2010; originally announced November 2010.

Journal ref: Journal of Machine Learning Research 12, 3071-3096, 2011

Showing 1–8 of 8 results for author: Gerwinn, S