Search | arXiv e-print repository

Classifying the evolution of COVID-19 severity on patients with combined dynamic Bayesian networks and neural networks

Authors: David Quesada, Pedro Larrañaga, Concha Bielza

Abstract: When patients arrive at a hospital suffering from the effects of some illness, one of the main problems is evaluating whether or not they are going to require intensive care in the near future. Intensive care requires allotting valuable and scarce resources, and knowing beforehand the severity of a patient's illness can improve both its treatment and the organization of resources. We illustrate this issue in a dataset consisting of Spanish COVID-19 patients from the sixth epidemic wave, where we label patients as critical when they either had to enter the intensive care unit or passed away. We then combine the use of dynamic Bayesian networks (DBNs), to forecast the vital signs and the blood analysis results of patients over the next 40 hours, and neural networks, to evaluate the severity of a patient's disease in that interval of time. Our empirical results show that projecting the current state of a patient to future values with the DBN, and then classifying those values, obtains better accuracy and g-mean score than applying a classifier directly.

Submitted 10 March, 2023; originally announced March 2023.
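
Note on the g-mean score mentioned in the abstract above: the abstract does not define it, so the sketch below assumes the usual binary definition, the geometric mean of sensitivity and specificity. The function name `g_mean` and the toy labels are hypothetical illustrations, not data or code from the paper.

```python
import math

def g_mean(y_true, y_pred):
    """Geometric mean of sensitivity and specificity for binary labels (1 = critical)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    # Guard against a class being absent from y_true.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)

# Toy, made-up labels: 3 critical patients, 5 non-critical.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"accuracy={accuracy:.3f}, g-mean={g_mean(y_true, y_pred):.3f}")
```

Unlike accuracy, this score drops to zero when one class is never predicted correctly, which is one reason it is often reported for imbalanced data.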

arXiv:2301.09870 [pdf, other]

Context-specific kernel-based hidden Markov model for time series analysis

Authors: Carlos Puerto-Santana, Concha Bielza, Pedro Larrañaga, Gustav Eje Henter

Abstract: Traditional hidden Markov models have been a useful tool to understand and model stochastic dynamic data; in the case of non-Gaussian data, models such as mixture of Gaussian hidden Markov models can be used. However, these suffer from the computation of precision matrices and have a lot of unnecessary parameters. As a consequence, such models often perform better when it is assumed that all variables are independent, a hypothesis that may be unrealistic. Hidden Markov models based on kernel density estimation are also capable of modeling non-Gaussian data, but they assume independence between variables. In this article, we introduce a new hidden Markov model based on kernel density estimation, which is capable of capturing kernel dependencies using context-specific Bayesian networks. The proposed model is described, together with a learning algorithm based on the expectation-maximization algorithm. Additionally, the model is compared to related HMMs on synthetic and real data. From the results, the benefits in likelihood and classification accuracy from the proposed model are quantified and analyzed.

Submitted 15 May, 2023; v1 submitted 24 January, 2023; originally announced January 2023.

Comments: Keywords: Hidden Markov models, Kernel density estimation, Bayesian networks, Adaptive models, Time series

arXiv:2203.02400 [pdf, other]

Quantum Approximate Optimization Algorithm for Bayesian network structure learning

Authors: Vicente P. Soloviev, Concha Bielza, Pedro Larrañaga

Abstract: Bayesian network structure learning is an NP-hard problem that has been faced by a number of traditional approaches in recent decades. Currently, quantum technologies offer a wide range of advantages that can be exploited to solve optimization tasks that cannot be addressed in an efficient way when utilizing classic computing approaches. In this work, a specific type of variational quantum algorithm, the quantum approximate optimization algorithm, was used to solve the Bayesian network structure learning problem, by employing $3n(n-1)/2$ qubits, where $n$ is the number of nodes in the Bayesian network to be learned. Our results showed that the quantum approximate optimization algorithm approach offers competitive results with state-of-the-art methods and quantitative resilience to quantum noise. The approach was applied to a cancer benchmark problem, and the results justified the use of variational quantum algorithms for solving the Bayesian network structure learning problem.

Submitted 4 March, 2022; originally announced March 2022.
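
Note on the qubit count quoted in the abstract above: $3n(n-1)/2$ grows quadratically with the number of nodes $n$. The sketch below only evaluates that formula for a few sizes; the function name `qaoa_bnsl_qubits` is hypothetical, and nothing here reflects how the paper allocates those qubits.

```python
def qaoa_bnsl_qubits(n: int) -> int:
    """Evaluate 3n(n-1)/2, the qubit count stated in the abstract, for n nodes."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    # n(n-1) is always even, so the integer division is exact.
    return 3 * n * (n - 1) // 2

for n in (2, 5, 10):
    print(f"n={n}: {qaoa_bnsl_qubits(n)} qubits")
```

For example, a 5-node network needs 30 qubits under this formula and a 10-node network needs 135.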

arXiv:2109.03008 [pdf, other]

Semiparametric Bayesian Networks

Authors: David Atienza, Concha Bielza, Pedro Larrañaga

Abstract: We introduce semiparametric Bayesian networks that combine parametric and nonparametric conditional probability distributions. Their aim is to incorporate the advantages of both components: the bounded complexity of parametric models and the flexibility of nonparametric ones. We demonstrate that semiparametric Bayesian networks generalize two well-known types of Bayesian networks: Gaussian Bayesian networks and kernel density estimation Bayesian networks. For this purpose, we consider two different conditional probability distributions required in a semiparametric Bayesian network. In addition, we present modifications of two well-known algorithms (greedy hill-climbing and PC) to learn the structure of a semiparametric Bayesian network from data. To realize this, we employ a score function based on cross-validation. In addition, using a validation dataset, we apply an early-stopping criterion to avoid overfitting. To evaluate the applicability of the proposed algorithm, we conduct an exhaustive experiment on synthetic data sampled by mixing linear and nonlinear functions, multivariate normal data sampled from Gaussian Bayesian networks, real data from the UCI repository, and bearings degradation data. As a result of this experiment, we conclude that the proposed algorithm accurately learns the combination of parametric and nonparametric components, while achieving a performance comparable with those provided by state-of-the-art methods.

Submitted 7 September, 2021; originally announced September 2021.

Comments: 44 pages, 13 figures, 4 tables, submitted to Information Sciences

MSC Class: 68T05 68T10 ACM Class: I.5.1; I.2.6

arXiv:2010.15604 [pdf, other]

Autoregressive Asymmetric Linear Gaussian Hidden Markov Models

Authors: Carlos Puerto-Santana, Pedro Larrañaga, Concha Bielza

Abstract: In a real life process evolving over time, the relationship between its relevant variables may change. Therefore, it is advantageous to have different inference models for each state of the process. Asymmetric hidden Markov models fulfil this dynamical requirement and provide a framework where the trend of the process can be expressed as a latent variable. In this paper, we modify these recent asymmetric hidden Markov models to have an asymmetric autoregressive component, allowing the model to choose the order of autoregression that maximizes its penalized likelihood for a given training set. Additionally, we show how inference, hidden states decoding and parameter learning must be adapted to fit the proposed model. Finally, we run experiments with synthetic and real data to show the capabilities of this new model.

Submitted 27 October, 2020; originally announced October 2020.

Comments: 34 pages, 16 figures, intended to be published in IEEE Transactions on Pattern Analysis and Machine Intelligence

arXiv:2006.01448 [pdf, other]

doi 10.1109/ACCESS.2020.3018593

Sparse Cholesky covariance parametrization for recovering latent structure in ordered data

Authors: Irene Córdoba, Concha Bielza, Pedro Larrañaga, Gherardo Varando

Abstract: The sparse Cholesky parametrization of the inverse covariance matrix can be interpreted as a Gaussian Bayesian network; however its counterpart, the covariance Cholesky factor, has received, with few notable exceptions, little attention so far, despite having a natural interpretation as a hidden variable model for ordered signal data. To fill this gap, in this paper we focus on arbitrary zero patterns in the Cholesky factor of a covariance matrix. We discuss how these models can also be extended, in analogy with Gaussian Bayesian networks, to data where no apparent order is available. For the ordered scenario, we propose a novel estimation method that is based on matrix loss penalization, as opposed to the existing regression-based approaches. The performance of this sparse model for the Cholesky factor, together with our novel estimator, is assessed in a simulation setting, as well as over spatial and temporal real data where a natural ordering arises among the variables. We give guidelines, based on the empirical results, about which of the methods analysed is more appropriate for each setting.

Submitted 19 August, 2020; v1 submitted 2 June, 2020; originally announced June 2020.

Comments: 24 pages, 12 figures

Journal ref: IEEE Access, 8: 154614-154624, 2020

arXiv:1812.00262 [pdf, other]

doi 10.1007/978-3-319-20807-7_47

Towards Gaussian Bayesian Network Fusion

Authors: Irene Córdoba, Concha Bielza, Pedro Larrañaga

Abstract: Data sets are growing in complexity thanks to the increasing facilities we have nowadays to both generate and store data. This poses many challenges to machine learning that are leading to the proposal of new methods and paradigms, in order to be able to deal with what is nowadays referred to as Big Data. In this paper we propose a method for the aggregation of different Bayesian network structures that have been learned from separate data sets, as a first step towards mining data sets that need to be partitioned in a horizontal way, i.e. with respect to the instances, in order to be processed. Considerations that should be taken into account when dealing with this situation are discussed. Scalable learning of Bayesian networks is slowly emerging, and our method constitutes one of the first insights into Gaussian Bayesian network aggregation from different sources. Tested on synthetic data, it obtains good results that surpass those from individual learning. Future research will be focused on expanding the method and testing more diverse data sets.

Submitted 1 December, 2018; originally announced December 2018.

Comments: 10 pages, 3 figures, 2015 conference

Journal ref: Springer Lecture Notes in Artificial Intelligence (ECSQARU 2015), vol 9161, pages 519-528

arXiv:1811.04759 [pdf, ps, other]

Markov Property in Generative Classifiers

Authors: Gherardo Varando, Concha Bielza, Pedro Larrañaga, Eva Riccomagno

Abstract: We show that, for generative classifiers, conditional independence corresponds to linear constraints for the induced discrimination functions. Discrimination functions of undirected Markov network classifiers can thus be characterized by sets of linear constraints. These constraints are represented by a second order finite difference operator over functions of categorical variables. As an application we study the expressive power of generative classifiers under the undirected Markov property and we present a general method to combine discriminative and generative classifiers.

Submitted 12 November, 2018; originally announced November 2018.

arXiv:1806.11015 [pdf, other]

doi 10.1007/978-3-030-00374-6_5

Bayesian optimization of the PC algorithm for learning Gaussian Bayesian networks

Authors: Irene Córdoba, Eduardo C. Garrido-Merchán, Daniel Hernández-Lobato, Concha Bielza, Pedro Larrañaga

Abstract: The PC algorithm is a popular method for learning the structure of Gaussian Bayesian networks. It carries out statistical tests to determine absent edges in the network. It is hence governed by two parameters: (i) the type of test, and (ii) its significance level. These parameters are usually set to values recommended by an expert. Nevertheless, such an approach can suffer from human bias, leading to suboptimal reconstruction results. In this paper we consider a more principled approach for choosing these parameters in an automatic way. For this we optimize a reconstruction score evaluated on a set of different Gaussian Bayesian networks. This objective is expensive to evaluate and lacks a closed-form expression, which means that Bayesian optimization (BO) is a natural choice. BO methods use a model to guide the search and are hence able to exploit smoothness properties of the objective surface. We show that the parameters found by a BO method outperform those found by a random search strategy and the expert recommendation. Importantly, we have found that an often overlooked statistical test provides the best overall reconstruction results.

Submitted 28 June, 2018; originally announced June 2018.

Journal ref: Lecture Notes in Artificial Intelligence (CAEPIA 2018), 11160:44-54, 2018

arXiv:1606.07282 [pdf, other]

doi 10.1016/j.jspi.2019.09.008

A review of Gaussian Markov models for conditional independence

Authors: Irene Córdoba, Concha Bielza, Pedro Larrañaga

Abstract: Markov models lie at the interface between statistical independence in a probability distribution and graph separation properties. We review model selection and estimation in directed and undirected Markov models with Gaussian parametrization, emphasizing the main similarities and differences. These two model classes are similar but not equivalent, although they share a common intersection. We present the existing results from a historical perspective, taking into account the amount of literature existing from both the artificial intelligence and statistics research communities, where these models originated. We cover classical topics such as maximum likelihood estimation and model selection via hypothesis testing, but also more modern approaches like regularization and Bayesian methods. We also discuss how the Markov models reviewed fit in the rich hierarchy of other, higher level Markov model classes. Finally, we close the paper by overviewing relaxations of the Gaussian assumption and pointing out the main areas of application where these Markov models are nowadays used.

Submitted 2 October, 2019; v1 submitted 23 June, 2016; originally announced June 2016.

Comments: Fix author signature

Journal ref: Journal of Statistical Planning and Inference, 206:127-144, 2020

arXiv:1301.3871 [pdf]

Combinatorial Optimization by Learning and Simulation of Bayesian Networks

Authors: Pedro Larrañaga, Ramon Etxeberria, Jose A. Lozano, Jose M. Pena

Abstract: This paper shows how the Bayesian network paradigm can be used to solve combinatorial optimization problems. To do so, some methods of structure learning from data and simulation of Bayesian networks are inserted inside Estimation of Distribution Algorithms (EDA). EDA are a new tool for evolutionary computation in which populations of individuals are created by estimation and simulation of the joint probability distribution of the selected individuals. We propose new approaches to EDA for combinatorial optimization based on the theory of probabilistic graphical models. Experimental results are also presented.

Submitted 16 January, 2013; originally announced January 2013.

Comments: Appears in Proceedings of the Sixteenth Conference on Uncertainty in Artificial Intelligence (UAI2000)

Report number: UAI-P-2000-PG-343-352
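
Note on the estimate-then-simulate loop described in the abstract above: the sketch below shows the general shape of an EDA on the OneMax toy problem (maximize the number of 1 bits). It deliberately swaps in the simplest possible model, independent per-bit marginals (a univariate EDA), in place of the learned Bayesian network the paper proposes. The function name, parameters, and clamping bounds are all hypothetical choices for illustration.

```python
import random

def univariate_eda_onemax(n_bits=20, pop_size=60, n_selected=20,
                          generations=30, seed=0):
    """Univariate EDA on OneMax: select, estimate marginals, sample, repeat."""
    rng = random.Random(seed)
    probs = [0.5] * n_bits  # initial model: every bit is a fair coin
    best = None
    for _ in range(generations):
        # Simulation step: sample a new population from the current model.
        population = [[1 if rng.random() < p else 0 for p in probs]
                      for _ in range(pop_size)]
        # Selection step: keep the fittest individuals (fitness = number of ones).
        population.sort(key=sum, reverse=True)
        selected = population[:n_selected]
        if best is None or sum(selected[0]) > sum(best):
            best = selected[0]
        # Estimation step: re-fit the per-bit marginals on the selected set,
        # clamped away from 0 and 1 so no bit value becomes unreachable.
        probs = [min(0.95, max(0.05, sum(ind[i] for ind in selected) / n_selected))
                 for i in range(n_bits)]
    return best

best = univariate_eda_onemax()
print("best fitness:", sum(best), "of", len(best))
```

Replacing the independent marginals with a model that captures dependencies between variables changes only the estimation and simulation steps; the surrounding loop stays the same.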

Showing 1–11 of 11 results for author: Larrañaga, P