Search | arXiv e-print repository

Signature Kernel Conditional Independence Tests in Causal Discovery for Stochastic Processes

Authors: Georg Manten, Cecilia Casolo, Emilio Ferrucci, Søren Wengel Mogensen, Cristopher Salvi, Niki Kilbertus

Abstract: Inferring the causal structure underlying stochastic dynamical systems from observational data holds great promise in domains ranging from science and health to finance. Such processes can often be accurately modeled via stochastic differential equations (SDEs), which naturally imply causal relationships via "which variables enter the differential of which other variables". In this paper, we develop a kernel-based test of conditional independence (CI) on "path-space" -- e.g., solutions to SDEs, but applicable beyond that -- by leveraging recent advances in signature kernels. We demonstrate strictly superior performance of our proposed CI test compared to existing approaches on path-space and provide theoretical consistency results. Then, we develop constraint-based causal discovery algorithms for acyclic stochastic dynamical systems (allowing for self-loops) that leverage temporal information to recover the entire directed acyclic graph. Assuming faithfulness and a CI oracle, we show that our algorithms are sound and complete. We empirically verify that our developed CI test in conjunction with the causal discovery algorithms outperforms baselines across a range of settings.

Submitted 11 June, 2024; v1 submitted 28 February, 2024; originally announced February 2024.

arXiv:2310.18654

Causal discovery in a complex industrial system: A time series benchmark

Authors: Søren Wengel Mogensen, Karin Rathsman, Per Nilsson

Abstract: Causal discovery outputs a causal structure, represented by a graph, from observed data. For time series data, there is a variety of methods; however, it is difficult to evaluate these on real data, as realistic use cases very rarely come with a known causal graph to which output can be compared. In this paper, we present a dataset from an industrial subsystem at the European Spallation Source along with its causal graph, which has been constructed from expert knowledge. This provides a testbed for causal discovery from time series observations of complex systems, and we believe this can help inform the development of causal discovery methodology.

Submitted 28 October, 2023; originally announced October 2023.

Comments: 18 pages, 9 figures, 1 table

arXiv:2310.13796

Faithful graphical representations of local independence

Authors: Søren Wengel Mogensen

Abstract: Graphical models use graphs to represent conditional independence structure in the distribution of a random vector. In stochastic processes, graphs may represent so-called local independence or conditional Granger causality. Under some regularity conditions, a local independence graph implies a set of independences using a graphical criterion known as $δ$-separation, or using its generalization, $μ$-separation. This is a stochastic process analogue of $d$-separation in DAGs. However, there may be more independences than implied by this graph and this is a violation of so-called faithfulness. We characterize faithfulness in local independence graphs and give a method to construct a faithful graph from any local independence model such that the output equals the true graph when Markov and faithfulness assumptions hold. We discuss various assumptions that are weaker than faithfulness, and we explore different structure learning algorithms and their properties under varying assumptions.

Submitted 20 October, 2023; originally announced October 2023.

Comments: 17 pages, 3 figures

arXiv:2310.04709

Time-dependent mediators in survival analysis: Graphical representation of causal assumptions

Authors: Søren Wengel Mogensen, Odd O. Aalen, Susanne Strohmaier

Abstract: We study time-dependent mediators in survival analysis using a treatment separation approach due to Didelez [2019] and based on earlier work by Robins and Richardson [2011]. This approach avoids nested counterfactuals and cross-world assumptions which are otherwise common in mediation analysis. The causal model of treatment, mediators, covariates, confounders and outcome is represented by causal directed acyclic graphs (DAGs). However, the DAGs tend to be very complex when we have measurements at a large number of time points. We therefore suggest using so-called rolled graphs in which a node represents an entire coordinate process instead of a single random variable, leading us to far simpler graphical representations. The rolled graphs are not necessarily acyclic; they can be analyzed by $δ$-separation, which is the appropriate graphical separation criterion in this class of graphs and analogous to $d$-separation. In particular, $δ$-separation is a graphical tool for evaluating if the conditions of the mediation analysis are met or if unmeasured confounders influence the estimated effects. We also state a mediational g-formula. This is similar to the approach in Vansteelandt et al. [2019], although that paper has a different conceptual basis. Finally, we apply this framework to a statistical model based on a Cox model with an added treatment effect. Keywords: survival analysis; mediation; causal inference; graphical models; local independence graphs

Submitted 7 October, 2023; originally announced October 2023.

Comments: 40 pages, 9 figures

MSC Class: 62D20; 62N02; 62H22

arXiv:2308.10606

Analyzing Complex Systems with Cascades Using Continuous-Time Bayesian Networks

Authors: Alessandro Bregoli, Karin Rathsman, Marco Scutari, Fabio Stella, Søren Wengel Mogensen

Abstract: Interacting systems of events may exhibit cascading behavior where events tend to be temporally clustered. While the cascades themselves may be obvious from the data, it is important to understand which states of the system trigger them. For this purpose, we propose a modeling framework based on continuous-time Bayesian networks (CTBNs) to analyze cascading behavior in complex systems. This framework allows us to describe how events propagate through the system and to identify likely sentry states, that is, system states that may lead to imminent cascading behavior. Moreover, CTBNs have a simple graphical representation and provide interpretable outputs, both of which are important when communicating with domain experts. We also develop new methods for knowledge extraction from CTBNs and we apply the proposed methodology to a data set of alarms in a large industrial system.

Submitted 21 August, 2023; originally announced August 2023.

Comments: 21 pages, 11 figures

Journal ref: Proceedings of the 30th International Symposium on Temporal Representation and Reasoning (TIME 2023), Leibniz International Proceedings in Informatics (LIPIcs), 8:1-8:21

arXiv:2202.06052

Learning by Doing: Controlling a Dynamical System using Causality, Control, and Reinforcement Learning

Authors: Sebastian Weichwald, Søren Wengel Mogensen, Tabitha Edith Lee, Dominik Baumann, Oliver Kroemer, Isabelle Guyon, Sebastian Trimpe, Jonas Peters, Niklas Pfister

Abstract: Questions in causality, control, and reinforcement learning go beyond the classical machine learning task of prediction under i.i.d. observations. Instead, these fields consider the problem of learning how to actively perturb a system to achieve a certain effect on a response variable. Arguably, they have complementary views on the problem: In control, one usually aims to first identify the system by excitation strategies to then apply model-based design techniques to control the system. In (non-model-based) reinforcement learning, one directly optimizes a reward. In causality, one focus is on identifiability of causal structure. We believe that combining the different views might create synergies and this competition is meant as a first step toward such synergies. The participants had access to observational and (offline) interventional data generated by dynamical systems. Track CHEM considers an open-loop problem in which a single impulse at the beginning of the dynamics can be set, while Track ROBO considers a closed-loop problem in which control variables can be set at each time step. The goal in both tracks is to infer controls that drive the system to a desired state. Code is open-sourced (https://github.com/LearningByDoingCompetition/learningbydoing-comp) to reproduce the winning solutions of the competition and to facilitate trying out new methods on the competition tasks.

Submitted 12 February, 2022; originally announced February 2022.

Comments: https://learningbydoingcompetition.github.io/

arXiv:1909.13186

Causal screening for dynamical systems

Authors: Søren Wengel Mogensen

Abstract: Many classical algorithms output graphical representations of causal structures by testing conditional independence among a set of random variables. In dynamical systems, local independence can be used analogously as a testable implication of the underlying data-generating process. We suggest some inexpensive methods for causal screening which provide output with a sound causal interpretation under the assumption of ancestral faithfulness. The popular model class of linear Hawkes processes is used to provide an example of a dynamical causal model. We argue that for sparse causal graphs the output will often be close to complete. We give examples of this framework and apply it to a challenging biological system.

Submitted 11 September, 2020; v1 submitted 28 September, 2019; originally announced September 2019.

Comments: 13 pages, 3 figures

MSC Class: 62A99; 62M99

Journal ref: Proceedings of the 36th Conference on Uncertainty in Artificial Intelligence (UAI), PMLR volume 124, 2020

arXiv:1805.02407

Soft Maximin Estimation for Heterogeneous Data

Authors: Adam Lund, Søren Wengel Mogensen, Niels Richard Hansen

Abstract: Extracting a common robust signal from data divided into heterogeneous groups can be difficult when each group -- in addition to the signal -- can contain large, unique variation components. Previously, maximin estimation has been proposed as a robust estimation method in the presence of heterogeneous noise. We propose soft maximin estimation as a computationally attractive alternative aimed at striking a balance between pooled estimation and (hard) maximin estimation. The soft maximin method provides a range of estimators, controlled by a parameter $ζ>0$, that interpolates pooled least squares estimation and maximin estimation. By establishing relevant theoretical properties we argue that the soft maximin method is both statistically sensible and computationally attractive. We also demonstrate, on real and simulated data, that the soft maximin estimator can offer improvements over both pooled OLS and hard maximin in terms of predictive performance and computational complexity. A time- and memory-efficient implementation is provided in the R package SMME available on CRAN.

Submitted 7 April, 2022; v1 submitted 7 May, 2018; originally announced May 2018.

arXiv:1802.10163

doi 10.1214/19-AOS1821

Markov equivalence of marginalized local independence graphs

Authors: Søren Wengel Mogensen, Niels Richard Hansen

Abstract: Symmetric independence relations are often studied using graphical representations. Ancestral graphs or acyclic directed mixed graphs with $m$-separation provide classes of symmetric graphical independence models that are closed under marginalization. Asymmetric independence relations appear naturally for multivariate stochastic processes, for instance in terms of local independence. However, no class of graphs representing such asymmetric independence relations, which is also closed under marginalization, has been developed. We develop the theory of directed mixed graphs with $μ$-separation and show that this provides a graphical independence model class which is closed under marginalization and which generalizes previously considered graphical representations of local independence. For statistical applications, it is pivotal to characterize graphs that induce the same independence relations, as such a Markov equivalence class of graphs is the object that is ultimately identifiable from observational data. Our main result is that for directed mixed graphs with $μ$-separation each Markov equivalence class contains a maximal element which can be constructed from the independence relations alone. Moreover, we introduce the directed mixed equivalence graph as the maximal graph with edge markings. This graph encodes all the information about the edges that is identifiable from the independence relations, and furthermore it can be computed efficiently from the maximal graph.

Submitted 11 February, 2019; v1 submitted 27 February, 2018; originally announced February 2018.

Comments: 49 pages (including supplementary material), updated to add examples and fix typos

MSC Class: 62M99; 62A99

Journal ref: The Annals of Statistics 48(1), 2020, 539-559

Showing 1–9 of 9 results for author: Mogensen, S W