Search | arXiv e-print repository

Counterfactual Reasoning with Probabilistic Graphical Models for Analyzing Socioecological Systems

Authors: Rafael Cabañas, Ana D. Maldonado, María Morales, Pedro A. Aguilera, Antonio Salmerón

Abstract: Causal and counterfactual reasoning are emerging directions in data science that allow us to reason about hypothetical scenarios. This is particularly useful in domains where experimental data are usually not available. In the context of environmental and ecological sciences, causality enables us, for example, to predict how an ecosystem would respond to hypothetical interventions. A structural causal model is a class of probabilistic graphical models for causality, which, due to its intuitive nature, can be easily understood by experts in multiple fields. However, certain queries, called unidentifiable, cannot be calculated in an exact and precise manner. This paper proposes applying a novel and recent technique for bounding unidentifiable queries within the domain of socioecological systems. Our findings indicate that traditional statistical analysis, including probabilistic graphical models, can identify the influence between variables. However, such methods do not offer insights into the nature of the relationship, specifically whether it involves necessity or sufficiency. This is where counterfactual reasoning becomes valuable.

Submitted 18 January, 2024; originally announced January 2024.

Comments: 34 pages

arXiv:2307.16577 [pdf, ps, other]

Approximating Counterfactual Bounds while Fusing Observational, Biased and Randomised Data Sources

Authors: Marco Zaffalon, Alessandro Antonucci, Rafael Cabañas, David Huber

Abstract: We address the problem of integrating data from multiple, possibly biased, observational and interventional studies, to eventually compute counterfactuals in structural causal models. We start from the case of a single observational dataset affected by a selection bias. We show that the likelihood of the available data has no local maxima. This enables us to use the causal expectation-maximisation scheme to approximate the bounds for partially identifiable counterfactual queries, which are the focus of this paper. We then show how the same approach can address the general case of multiple datasets, no matter whether interventional or observational, biased or unbiased, by remapping it into the former one via graphical transformations. Systematic numerical experiments and a case study on palliative care show the effectiveness of our approach, while hinting at the benefits of fusing heterogeneous data sources to get informative outcomes in case of partial identifiability.

Submitted 31 July, 2023; originally announced July 2023.

arXiv:2307.08304 [pdf, ps, other]

Efficient Computation of Counterfactual Bounds

Authors: Marco Zaffalon, Alessandro Antonucci, Rafael Cabañas, David Huber, Dario Azzimonti

Abstract: We assume that we are given structural equations over discrete variables inducing a directed acyclic graph, namely, a structural causal model, together with data about its internal nodes. The question we want to answer is how we can compute bounds for partially identifiable counterfactual queries from such an input. We start by giving a map from structural causal models to credal networks. This allows us to compute exact counterfactual bounds via algorithms for credal nets on a subclass of structural causal models. Exact computation is going to be inefficient in general given that, as we show, causal inference is NP-hard even on polytrees. We then target approximate bounds via a causal EM scheme. We evaluate their accuracy by providing credible intervals on the quality of the approximation; we show through a synthetic benchmark that the EM scheme delivers accurate results in a fair number of runs. In the course of the discussion, we also point out what seems to be a neglected limitation to the trending idea that counterfactual bounds can be computed without knowledge of the structural equations. We also present a real case study on palliative care to show how our algorithms can readily be used for practical purposes.

Submitted 4 December, 2023; v1 submitted 17 July, 2023; originally announced July 2023.

arXiv:2212.02932 [pdf, ps, other]

Learning to Bound Counterfactual Inference from Observational, Biased and Randomised Data

Authors: Marco Zaffalon, Alessandro Antonucci, David Huber, Rafael Cabañas

Abstract: We address the problem of integrating data from multiple, possibly biased, observational and interventional studies, to eventually compute counterfactuals in structural causal models. We start from the case of a single observational dataset affected by a selection bias. We show that the likelihood of the available data has no local maxima. This enables us to use the causal expectation-maximisation scheme to compute approximate bounds for partially identifiable counterfactual queries, which are the focus of this paper. We then show how the same approach can solve the general case of multiple datasets, no matter whether interventional or observational, biased or unbiased, by remapping it into the former one via graphical transformations. Systematic numerical experiments and a case study on palliative care show the effectiveness and accuracy of our approach, while hinting at the benefits of integrating heterogeneous data to get informative bounds in case of partial identifiability.

Submitted 16 March, 2023; v1 submitted 6 December, 2022; originally announced December 2022.

arXiv:2208.01417 [pdf, ps, other]

Bounding Counterfactuals under Selection Bias

Authors: Marco Zaffalon, Alessandro Antonucci, Rafael Cabañas, David Huber, Dario Azzimonti

Abstract: Causal analysis may be affected by selection bias, which is defined as the systematic exclusion of data from a certain subpopulation. Previous work in this area focused on the derivation of identifiability conditions. We propose instead a first algorithm to address both identifiable and unidentifiable queries. We prove that, in spite of the missingness induced by the selection bias, the likelihood of the available data is unimodal. This enables us to use the causal expectation-maximisation scheme to obtain the values of causal queries in the identifiable case, and to compute bounds otherwise. Experiments demonstrate the approach to be practically viable. Theoretical convergence characterisations are provided.

Submitted 26 July, 2022; originally announced August 2022.

Comments: Eleventh International Conference on Probabilistic Graphical Models (PGM 2022)

arXiv:2110.13786 [pdf, other]

Diversity and Generalization in Neural Network Ensembles

Authors: Luis A. Ortega, Rafael Cabañas, Andrés R. Masegosa

Abstract: Ensembles are widely used in machine learning and usually provide state-of-the-art performance in many prediction tasks. From the very beginning, the diversity of an ensemble has been identified as a key factor for the superior performance of these models. But the exact role that diversity plays in ensemble models is poorly understood, especially in the context of neural networks. In this work, we combine and expand previously published results in a theoretically sound framework that describes the relationship between diversity and ensemble performance for a wide range of ensemble methods. More precisely, we provide sound answers to the following questions: how to measure diversity, how diversity relates to the generalization error of an ensemble, and how diversity is promoted by neural network ensemble algorithms. This analysis covers three widely used loss functions, namely, the squared loss, the cross-entropy loss, and the 0-1 loss; and two widely used model combination strategies, namely, model averaging and weighted majority vote. We empirically validate this theoretical analysis with neural network ensembles.

Submitted 16 February, 2022; v1 submitted 26 October, 2021; originally announced October 2021.

arXiv:2105.04158 [pdf, ps, other]

CREPO: An Open Repository to Benchmark Credal Network Algorithms

Authors: Rafael Cabañas, Alessandro Antonucci

Abstract: Credal networks are a popular class of imprecise probabilistic graphical models obtained as a Bayesian network generalization based on so-called credal sets of probability mass functions. A Java library called CREMA has been recently released to model, process and query credal networks. Despite the NP-hardness of the (exact) task, a number of algorithms are available to approximate credal network inferences. In this paper we present CREPO, an open repository of synthetic credal networks, provided together with the exact results of inference tasks on these models. A Python tool is also delivered to load these data and interact with CREMA, thus making it extremely easy to evaluate and compare existing and novel inference algorithms. To demonstrate such a benchmarking scheme, we propose an approximate heuristic to be used inside variable elimination schemes to keep a bound on the maximum number of vertices generated during the combination step. A CREPO-based validation against approximate procedures based on linearization and exact techniques performed in CREMA is finally discussed.

Submitted 10 May, 2021; originally announced May 2021.

Comments: Isipta 2021 (Version with Supplementary Material)

arXiv:2011.02912 [pdf, ps, other]

Causal Expectation-Maximisation

Authors: Marco Zaffalon, Alessandro Antonucci, Rafael Cabañas

Abstract: Structural causal models are the basic modelling unit in Pearl's causal theory; in principle they allow us to solve counterfactuals, which are at the top rung of the ladder of causation. But they often contain latent variables that limit their application to special settings. This appears to be a consequence of the fact, proven in this paper, that causal inference is NP-hard even in models characterised by polytree-shaped graphs. To deal with such a hardness, we introduce the causal EM algorithm. Its primary aim is to reconstruct the uncertainty about the latent variables from data about categorical manifest variables. Counterfactual inference is then addressed via standard algorithms for Bayesian networks. The result is a general method to approximately compute counterfactuals, be they identifiable or not (in which case we deliver bounds). We show empirically, as well as by deriving credible intervals, that the approximation we provide becomes accurate in a fair number of EM runs. These results lead us finally to argue that there appears to be an unnoticed limitation to the trending idea that counterfactual bounds can often be computed without knowledge of the structural equations.

Submitted 22 November, 2021; v1 submitted 4 November, 2020; originally announced November 2020.

Comments: WHY-21 workshop (NeurIPS 2021)

arXiv:2008.00463 [pdf, ps, other]

Structural Causal Models Are (Solvable by) Credal Networks

Authors: Marco Zaffalon, Alessandro Antonucci, Rafael Cabañas

Abstract: A structural causal model is made of endogenous (manifest) and exogenous (latent) variables. We show that endogenous observations induce linear constraints on the probabilities of the exogenous variables. This allows us to exactly map a causal model into a credal network. Causal inferences, such as interventions and counterfactuals, can consequently be obtained by standard algorithms for the updating of credal nets. These natively return sharp values in the identifiable case, while intervals corresponding to the exact bounds are produced for unidentifiable queries. A characterization of the causal models that allow the map above to be compactly derived is given, along with a discussion about the scalability for general models. This contribution should be regarded as a systematic approach to represent structural causal models by credal networks and hence to systematically compute causal inferences. A number of demonstrative examples are presented to clarify our methodology. Extensive experiments show that approximate algorithms for credal networks can immediately be used to do causal inference in real-size problems.

Submitted 2 August, 2020; originally announced August 2020.

Comments: To appear in the proceedings of the 10th International Conference on Probabilistic Graphical Models (PGM 2020)

arXiv:1908.11161 [pdf, other]

InferPy: Probabilistic Modeling with Deep Neural Networks Made Easy

Authors: Javier Cózar, Rafael Cabañas, Antonio Salmerón, Andrés R. Masegosa

Abstract: InferPy is a Python package for probabilistic modeling with deep neural networks. It defines a user-friendly API that trades off model complexity with ease of use, unlike other libraries whose focus is on dealing with very general probabilistic models at the cost of having a more complex API. In particular, this package allows users to define, learn and evaluate general hierarchical probabilistic models containing deep neural networks in a compact and simple way. InferPy is built on top of TensorFlow Probability and Keras.

Submitted 12 February, 2020; v1 submitted 29 August, 2019; originally announced August 2019.

Comments: 5-page limit (paper submitted to an original software publication track). This paper briefly describes a piece of scientific software

arXiv:1908.03442 [pdf, other]

Probabilistic Models with Deep Neural Networks

Authors: Andrés R. Masegosa, Rafael Cabañas, Helge Langseth, Thomas D. Nielsen, Antonio Salmerón

Abstract: Recent advances in statistical inference have significantly expanded the toolbox of probabilistic modeling. Historically, probabilistic modeling has been constrained to (i) very restricted model classes where exact or approximate probabilistic inference was feasible, and (ii) small or medium-sized data sets which fit within the main memory of the computer. However, developments in variational inference, a general form of approximate probabilistic inference that originated in statistical physics, are allowing probabilistic modeling to overcome these restrictions: (i) approximate probabilistic inference is now possible over a broad class of probabilistic models containing a large number of parameters, and (ii) scalable inference methods based on stochastic gradient descent and distributed computation engines allow probabilistic modeling to be applied to massive data sets. One important practical consequence of these advances is the possibility to include deep neural networks within a probabilistic model to capture complex non-linear stochastic relationships between random variables. These advances, in conjunction with the release of novel probabilistic modeling toolboxes, have greatly expanded the scope of application of probabilistic models, and allow these models to take advantage of the recent strides made by the deep learning community. In this paper we review the main concepts, methods and tools needed to use deep neural networks within a probabilistic modeling framework.

Submitted 2 October, 2019; v1 submitted 9 August, 2019; originally announced August 2019.

arXiv:1905.13227 [pdf]

Equation of State of Colloidal Membranes

Authors: Andrew J. Balchunas, Rafael A. Cabanas, Mark J. Zakhary, Thomas Gibaud, Seth Fraden, Prerna Sharma, Michael F. Hagan, Zvonimir Dogic

Abstract: In the presence of a non-adsorbing polymer, monodisperse rod-like colloids assemble into one-rod-length thick liquid-like monolayers, called colloidal membranes. The density of the rods within a colloidal membrane is determined by a balance between the osmotic pressure exerted by the enveloping polymer suspension and the repulsion between the colloidal rods. We developed a microfluidic device for continuously observing an isolated membrane while dynamically controlling the osmotic pressure of the polymer suspension. Using this technology we measured the membrane rod density over a range of osmotic pressures that is wider than what is accessible in equilibrium samples. With increasing density we observed a first-order phase transition, in which the in-plane membrane order transforms from a 2D fluid into a 2D solid. In the limit of low osmotic pressures, we measured the rate at which individual rods evaporate from the membrane. The developed microfluidic technique could have wide applicability for in situ investigation of various soft materials and how their properties depend on the solvent composition.

Submitted 29 May, 2019; originally announced May 2019.

Comments: 15 pages, 13 figures

arXiv:1704.01427 [pdf, ps, other]

doi 10.1016/j.knosys.2018.09.019

AMIDST: a Java Toolbox for Scalable Probabilistic Machine Learning

Authors: Andrés R. Masegosa, Ana M. Martínez, Darío Ramos-López, Rafael Cabañas, Antonio Salmerón, Thomas D. Nielsen, Helge Langseth, Anders L. Madsen

Abstract: The AMIDST Toolbox is a software package for scalable probabilistic machine learning with a special focus on (massive) streaming data. The toolbox supports a flexible modeling language based on probabilistic graphical models with latent variables and temporal dependencies. The specified models can be learnt from large data sets using parallel or distributed implementations of Bayesian learning algorithms for either streaming or batch data. These algorithms are based on a flexible variational message passing scheme, which supports discrete and continuous variables from a wide range of probability distributions. AMIDST also leverages existing functionality and algorithms by interfacing to software tools such as Flink, Spark, MOA, Weka, R and HUGIN. AMIDST is an open source toolbox written in Java and available at http://www.amidsttoolbox.com under the Apache Software License version 2.0.

Submitted 4 April, 2017; originally announced April 2017.

ACM Class: I.2.6

arXiv:1411.4743 [pdf]

doi 10.1039/C4CP05405A

DNA driven self-assembly of micron-sized rods using DNA-grafted bacteriophage fd virions

Authors: R. R. Unwin, R. A. Cabanas, T. Yanagishima, T. R. Blower, H. Takahashi, G. P. C. Salmond, J. M. Edwardson, S. Fraden, E. Eiser

Abstract: We have functionalized the sides of fd bacteriophage virions with oligonucleotides to induce DNA hybridization driven self-assembly of high aspect ratio filamentous particles. Potential impacts of this new structure range from an entirely new building block in DNA origami structures, inclusion of virions in DNA nanostructures and nanomachines, to a new means of adding thermotropic control to lyotropic liquid crystal systems. A protocol for producing the virions in bulk is reviewed. Thiolated oligonucleotides are attached to the viral capsid using a heterobifunctional chemical linker. A commonly used system is utilized, where a sticky, single-stranded DNA strand is connected to an inert double-stranded spacer to increase inter-particle connectivity. Solutions of fd virions carrying complementary strands are mixed, annealed, and their aggregation is studied using dynamic light scattering (DLS), fluorescence microscopy, and atomic force microscopy (AFM). Aggregation is clearly observed on cooling, with some degree of local order, and is reversible when temperature is cycled through the DNA hybridization transition.

Submitted 18 November, 2014; originally announced November 2014.

Comments: 10 pages, 1 Table, 6 Figures

Showing 1–14 of 14 results for author: Cabañas, R