-
Development of Bayesian Component Failure Models in E1 HEMP Grid Analysis
Authors:
Niladri Das,
Ross Guttromson,
Tommie A. Catanach
Abstract:
Combined electric power system and High-Altitude Electromagnetic Pulse (HEMP) models are being developed to determine the effect of a HEMP on the US power grid. The work relies primarily on deterministic methods; however, it is computationally untenable to evaluate the E1 HEMP response of large numbers of grid components distributed across a large interconnection. Further, the deterministic assessment of these components' failures is largely unachievable. E1 HEMP laboratory testing of the components is performed but is expensive, leaving few data points to construct failure models of grid components exposed to E1 HEMP. The use of Bayesian priors, developed using subject matter expertise, combined with the minimal test data in a Bayesian inference process, provides the basis for the development of more robust and cost-effective statistical component failure models. These models can be used in a simulation environment with minimal computational burden, for example by sampling Cumulative Distribution Functions (CDFs).
Submitted 3 June, 2024;
originally announced June 2024.
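The combination of an expert prior with a handful of test results, as described in the abstract above, can be illustrated with a conjugate Beta-Binomial update. This is only a minimal sketch under assumptions that are not taken from the paper: a single fixed stress level, a Beta prior, and made-up counts. The class name `BetaBelief` and all numbers are hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class BetaBelief:
    """Beta(alpha, beta) belief over a component's failure probability.

    Assumption: one fixed stress level; the paper's actual model is not given here.
    """
    alpha: float
    beta: float

    def update(self, failures: int, survivals: int) -> "BetaBelief":
        # Conjugate update: observed counts are added to the prior pseudo-counts.
        return BetaBelief(self.alpha + failures, self.beta + survivals)

    @property
    def mean(self) -> float:
        return self.alpha / (self.alpha + self.beta)

    def sample_failure(self, rng: random.Random) -> bool:
        # Draw a plausible failure probability, then a Bernoulli outcome from it.
        p_fail = rng.betavariate(self.alpha, self.beta)
        return rng.random() < p_fail


# Hypothetical expert prior worth about ten pseudo-observations (mean 0.1).
prior = BetaBelief(alpha=1.0, beta=9.0)
# Hypothetical test campaign: 5 units tested, 2 failed.
posterior = prior.update(failures=2, survivals=3)
```

In a grid simulation, `sample_failure` would be called once per exposed component, which is cheap compared with a deterministic response calculation.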
-
Metrics for Bayesian Optimal Experiment Design under Model Misspecification
Authors:
Tommie A. Catanach,
Niladri Das
Abstract:
The conventional approach to Bayesian decision-theoretic experiment design involves searching over possible experiments to select a design that maximizes the expected value of a specified utility function. The expectation is over the joint distribution of all unknown variables implied by the statistical model that will be used to analyze the collected data. The utility function defines the objective of the experiment; a common choice is the information gain. This article introduces an expanded framework for this process, where we go beyond the traditional Expected Information Gain criterion and introduce the Expected General Information Gain, which measures robustness to model discrepancy, and the Expected Discriminatory Information, a criterion that quantifies how well an experiment can detect model discrepancy. The functionality of the framework is showcased through its application to a linearized spring-mass-damper system and an F-16 model, where the model discrepancy is taken into account during Bayesian optimal experiment design.
Submitted 16 April, 2023;
originally announced April 2023.
-
Design of experiments for the calibration of history-dependent models via deep reinforcement learning and an enhanced Kalman filter
Authors:
Ruben Villarreal,
Nikolaos N. Vlassis,
Nhon N. Phan,
Tommie A. Catanach,
Reese E. Jones,
Nathaniel A. Trask,
Sharlotte L. B. Kramer,
WaiChing Sun
Abstract:
Experimental data is costly to obtain, which makes it difficult to calibrate complex models. For many models an experimental design that produces the best calibration given a limited experimental budget is not obvious. This paper introduces a deep reinforcement learning (RL) algorithm for design of experiments that maximizes the information gain measured by Kullback-Leibler (KL) divergence obtained via the Kalman filter (KF). This combination enables experimental design for rapid online experiments where traditional methods are too costly. We formulate possible configurations of experiments as a decision tree and a Markov decision process (MDP), where a finite choice of actions is available at each incremental step. Once an action is taken, a variety of measurements are used to update the state of the experiment. This new data leads to a Bayesian update of the parameters by the KF, which is used to enhance the state representation. In contrast to the Nash-Sutcliffe efficiency (NSE) index, which requires additional sampling to test hypotheses for forward predictions, the KF can lower the cost of experiments by directly estimating the values of new data acquired through additional actions. In this work our applications focus on mechanical testing of materials. Numerical experiments with complex, history-dependent models are used to verify the implementation and benchmark the performance of the RL-designed experiments.
Submitted 26 September, 2022;
originally announced September 2022.
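The idea of scoring an experimental action by the Kullback-Leibler divergence produced by a Kalman filter update can be sketched in one dimension. The following is an illustrative sketch only, assuming a scalar linear-Gaussian measurement model; the function names and parameters are hypothetical and are not the authors' implementation.

```python
import math


def kalman_update(mean, var, h, noise_var, y):
    """Scalar Kalman measurement update for the model y = h * x + noise."""
    gain = var * h / (h * h * var + noise_var)
    post_mean = mean + gain * (y - h * mean)
    post_var = (1.0 - gain * h) * var
    return post_mean, post_var


def gaussian_kl(mean_q, var_q, mean_p, var_p):
    """KL(q || p) in nats between two univariate Gaussians."""
    return 0.5 * (
        var_q / var_p
        + (mean_q - mean_p) ** 2 / var_p
        - 1.0
        + math.log(var_p / var_q)
    )


def information_gain(mean, var, h, noise_var, y):
    """KL divergence from the prior to the posterior after observing y."""
    post_mean, post_var = kalman_update(mean, var, h, noise_var, y)
    return gaussian_kl(post_mean, post_var, mean, var)


# Two hypothetical actions differ only in measurement sensitivity h.
gain_weak = information_gain(mean=0.0, var=1.0, h=1.0, noise_var=1.0, y=0.0)
gain_strong = information_gain(mean=0.0, var=1.0, h=2.0, noise_var=1.0, y=0.0)
```

Under these assumptions the more sensitive measurement yields the larger gain, which is the kind of signal a reinforcement learning agent could use as a reward.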
-
Variational Kalman Filtering with Hinf-Based Correction for Robust Bayesian Learning in High Dimensions
Authors:
Niladri Das,
Jed A. Duersch,
Thomas A. Catanach
Abstract:
In this paper, we address the problem of convergence of the sequential variational inference filter (VIF) through the application of a robust variational objective and an Hinf-norm based correction for a linear Gaussian system. As the dimension of the state or parameter space grows, performing the full Kalman update with the dense covariance matrix for a large-scale system requires increased storage and computational complexity, making it impractical. The VIF approach, based on mean-field Gaussian variational inference, reduces this burden through a variational approximation to the covariance, usually in the form of a diagonal covariance approximation. The challenge is to retain convergence and correct for biases introduced by the sequential VIF steps. We desire a framework that improves feasibility while still maintaining reasonable proximity to the optimal Kalman filter as data is assimilated. To accomplish this goal, an Hinf-norm based optimization perturbs the VIF covariance matrix to improve robustness. This yields a novel VIF-Hinf recursion that employs consecutive variational inference and Hinf-based optimization steps. We explore the development of this method and investigate a numerical example to illustrate the effectiveness of the proposed filter.
Submitted 27 April, 2022;
originally announced April 2022.
-
Adaptive n-ary Activation Functions for Probabilistic Boolean Logic
Authors:
Jed A. Duersch,
Thomas A. Catanach,
Niladri Das
Abstract:
Balancing model complexity against the information contained in observed data is the central challenge to learning. In order for complexity-efficient models to exist and be discoverable in high dimensions, we require a computational framework that relates a credible notion of complexity to simple parameter representations. Further, this framework must allow excess complexity to be gradually removed via gradient-based optimization. Our n-ary, or n-argument, activation functions fill this gap by approximating belief functions (probabilistic Boolean logic) using logit representations of probability. Just as Boolean logic determines the truth of a consequent claim from relationships among a set of antecedent propositions, probabilistic formulations generalize predictions when antecedents, truth tables, and consequents all retain uncertainty. Our activation functions demonstrate the ability to learn arbitrary logic, such as the binary exclusive disjunction (p xor q) and ternary conditioned disjunction ( c ? p : q ), in a single layer using an activation function of matching or greater arity. Further, we represent belief tables using a basis that directly associates the number of nonzero parameters to the effective arity of the belief function, thus capturing a concrete relationship between logical complexity and efficient parameter representations. This opens optimization approaches to reduce logical complexity by inducing parameter sparsity.
Submitted 16 March, 2022;
originally announced March 2022.
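To make the notion of a belief function over logits concrete, the sketch below marginalizes a belief table over uncertain antecedents. It is a simplified illustration, not the authors' parameterization: it assumes the antecedents are independent, and the names `belief_activation` and `xor_table` are hypothetical.

```python
import math
from itertools import product


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def logit(p):
    return math.log(p / (1.0 - p))


def belief_activation(input_logits, table_logits):
    """Logit of the consequent given antecedent logits and a belief table.

    table_logits[k] is the logit of P(consequent | k-th truth assignment),
    with assignments ordered as itertools.product([False, True], repeat=n).
    Assumption: antecedents are treated as independent.
    """
    probs = [sigmoid(z) for z in input_logits]
    total = 0.0
    assignments = product([False, True], repeat=len(probs))
    for row, assignment in enumerate(assignments):
        weight = 1.0
        for p, truth in zip(probs, assignment):
            weight *= p if truth else 1.0 - p
        total += weight * sigmoid(table_logits[row])
    return logit(total)


# Nearly crisp XOR table for rows (F,F), (F,T), (T,F), (T,T).
xor_table = [-10.0, 10.0, 10.0, -10.0]
out = belief_activation([logit(0.9), logit(0.2)], xor_table)
p_xor = sigmoid(out)  # close to 0.9 * 0.8 + 0.1 * 0.2
```

Because the table entries are themselves logits, they could in principle be learned by gradient descent; that step is outside this sketch.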
-
Parsimonious Inference
Authors:
Jed A. Duersch,
Thomas A. Catanach
Abstract:
Bayesian inference provides a uniquely rigorous approach to obtain principled justification for uncertainty in predictions, yet it is difficult to articulate suitably general prior belief in the machine learning context, where computational architectures are pure abstractions subject to frequent modifications by practitioners attempting to improve results. Parsimonious inference is an information-theoretic formulation of inference over arbitrary architectures that formalizes Occam's Razor; we prefer simple and sufficient explanations. Our universal hyperprior assigns plausibility to prior descriptions, encoded as sequences of symbols, by expanding on the core relationships between program length, Kolmogorov complexity, and Solomonoff's algorithmic probability. We then cast learning as information minimization over our composite change in belief when an architecture is specified, training data are observed, and model parameters are inferred. By distinguishing model complexity from prediction information, our framework also quantifies the phenomenon of memorization.
Although our theory is general, it is most critical when datasets are limited, e.g. small or skewed. We develop novel algorithms for polynomial regression and random forests that are suitable for such data, as demonstrated by our experiments. Our approaches combine efficient encodings with prudent sampling strategies to construct predictive ensembles without cross-validation, thus addressing a fundamental challenge in how to efficiently obtain predictions from data.
Submitted 2 March, 2021;
originally announced March 2021.
-
Bayesian inference of Stochastic reaction networks using Multifidelity Sequential Tempered Markov Chain Monte Carlo
Authors:
Thomas A. Catanach,
Huy D. Vo,
Brian Munsky
Abstract:
Stochastic reaction network models are often used to explain and predict the dynamics of gene regulation in single cells. These models usually involve several parameters, such as the kinetic rates of chemical reactions, that are not directly measurable and must be inferred from experimental data. Bayesian inference provides a rigorous probabilistic framework for identifying these parameters by finding a posterior parameter distribution that captures their uncertainty. Traditional computational methods for solving inference problems, such as Markov Chain Monte Carlo methods based on the classical Metropolis-Hastings algorithm, involve numerous serial evaluations of the likelihood function, which in turn requires expensive forward solutions of the chemical master equation (CME). We propose an alternative approach based on a multifidelity extension of the Sequential Tempered Markov Chain Monte Carlo (ST-MCMC) sampler. This algorithm is built upon Sequential Monte Carlo and solves the Bayesian inference problem by decomposing it into a sequence of efficiently solved subproblems that gradually increase model fidelity and the influence of the observed data. We reformulate the finite state projection (FSP) algorithm, a well-known method for solving the CME, to produce a hierarchy of surrogate master equations to be used in this multifidelity scheme. To determine the appropriate fidelity, we introduce a novel information-theoretic criterion that seeks to extract the most information about the ultimate Bayesian posterior from each model in the hierarchy without inducing significant bias. This novel sampling scheme is tested with high performance computing resources using biologically relevant problems.
Submitted 5 January, 2020;
originally announced January 2020.
-
Generalizing Information to the Evolution of Rational Belief
Authors:
Jed A. Duersch,
Thomas A. Catanach
Abstract:
Information theory provides a mathematical foundation to measure uncertainty in belief. Belief is represented by a probability distribution that captures our understanding of an outcome's plausibility. Information measures based on Shannon's concept of entropy include realization information, Kullback-Leibler divergence, Lindley's information in experiment, cross entropy, and mutual information.
We derive a general theory of information from first principles that accounts for evolving belief and recovers all of these measures. Rather than simply gauging uncertainty, information is understood in this theory to measure change in belief. We may then regard entropy as the information we expect to gain upon realization of a discrete latent random variable.
This theory of information is compatible with the Bayesian paradigm in which rational belief is updated as evidence becomes available. Furthermore, this theory admits novel measures of information with well-defined properties, which we explore in both analysis and experiment. This view of information illuminates the study of machine learning by allowing us to quantify information captured by a predictive model and distinguish it from residual information contained in training data. We gain related insights regarding feature selection, anomaly detection, and novel Bayesian approaches.
Submitted 12 January, 2020; v1 submitted 21 November, 2019;
originally announced November 2019.
-
Bayesian Updating and Uncertainty Quantification using Sequential Tempered MCMC with the Rank-One Modified Metropolis Algorithm
Authors:
Thomas A. Catanach,
James L. Beck
Abstract:
Bayesian methods are critical for quantifying the behaviors of systems. They capture our uncertainty about a system's behavior using probability distributions and update this understanding as new information becomes available. Probabilistic predictions that incorporate this uncertainty can then be made to evaluate system performance and make decisions. While Bayesian methods are very useful, they are often computationally intensive. This necessitates the development of more efficient algorithms. Here, we discuss a group of population Markov Chain Monte Carlo (MCMC) methods for Bayesian updating and system reliability assessment that we call Sequential Tempered MCMC (ST-MCMC) algorithms. These algorithms combine 1) a notion of tempering to gradually transform a population of samples from the prior to the posterior through a series of intermediate distributions, 2) importance resampling, and 3) MCMC. They are a form of Sequential Monte Carlo and include algorithms like Transitional Markov Chain Monte Carlo and Subset Simulation. We also introduce a new sampling algorithm called the Rank-One Modified Metropolis Algorithm (ROMMA), which builds upon the Modified Metropolis Algorithm used within Subset Simulation to improve performance in high dimensions. Finally, we formulate a single algorithm to solve combined Bayesian updating and reliability assessment problems to make posterior assessments of system reliability. The algorithms are then illustrated by performing prior and posterior reliability assessment of a water distribution system with unknown leaks and demands.
Submitted 23 April, 2018;
originally announced April 2018.
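The tempering step in the abstract above moves a sample population from prior to posterior through intermediate distributions. One common way to choose each intermediate level in Sequential Monte Carlo methods is to bisect on the tempering exponent until the importance weights retain a target effective sample size. The sketch below shows that generic rule; it is an assumption for illustration, not necessarily the rule used in the paper, and the function names are hypothetical.

```python
import math


def effective_sample_size(log_weights):
    """ESS = (sum w)^2 / sum w^2, computed stably from log weights."""
    m = max(log_weights)
    w = [math.exp(lw - m) for lw in log_weights]
    return sum(w) ** 2 / sum(x * x for x in w)


def next_temperature(log_likes, beta, target_fraction=0.5, tol=1e-6):
    """Pick the next tempering exponent in (beta, 1].

    Importance weights for a move from beta to b are exp((b - beta) * log_like).
    Assumption: ESS shrinks as the step grows, so bisection applies.
    """
    target = target_fraction * len(log_likes)

    def ess_at(b):
        return effective_sample_size([(b - beta) * ll for ll in log_likes])

    if ess_at(1.0) >= target:
        return 1.0  # the posterior can be reached in a single step
    lo, hi = beta, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if ess_at(mid) >= target:
            lo = mid
        else:
            hi = mid
    return lo


# Hypothetical log-likelihoods for 20 prior samples.
log_likes = [-float(i * i) for i in range(20)]
beta_next = next_temperature(log_likes, beta=0.0)
```

After each level is chosen, the population would be resampled by these weights and moved with MCMC; those steps are omitted here.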
-
Heterogeneity, correlations and financial contagion
Authors:
Fabio Caccioli,
Thomas A. Catanach,
J. Doyne Farmer
Abstract:
We consider a model of contagion in financial networks recently introduced in the literature, and we characterize the effect of a few features empirically observed in real networks on the stability of the system. Notably, we consider the effect of heterogeneous degree distributions, heterogeneous balance sheet size and degree correlations between banks. We study the probability of contagion conditional on the failure of a random bank, the most connected bank and the biggest bank, and we consider the effect of targeted policies aimed at increasing the capital requirements of a few banks with high connectivity or big balance sheets. Networks with heterogeneous degree distributions are shown to be more resilient to contagion triggered by the failure of a random bank, but more fragile with respect to contagion triggered by the failure of highly connected nodes. A power law distribution of balance sheet size is shown to induce an inefficient diversification that makes the system more prone to contagion events. A targeted policy aimed at reinforcing the stability of the biggest banks is shown to improve the stability of the system in the regime of high average degree. Finally, disassortative mixing, such as that observed in real banking networks, is shown to enhance the stability of the system.
Submitted 6 September, 2011;
originally announced September 2011.
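A default cascade of the kind studied in the abstract above can be sketched with a simple threshold rule: a bank fails when losses on loans to failed counterparties exceed its capital. This is a hedged toy version, assuming interbank assets are spread evenly across borrowers and nothing is recovered from a failed borrower; the network, balance sheet values, and the function name `cascade` are hypothetical.

```python
def cascade(lends_to, interbank_assets, capital, seed):
    """Return the set of failed banks after an initial default of `seed`.

    lends_to[i] lists the banks that bank i has lent to.
    Assumptions: exposures are split evenly across borrowers, and a failed
    borrower repays nothing.
    """
    failed = {seed}
    changed = True
    while changed:
        changed = False
        for bank, borrowers in lends_to.items():
            if bank in failed or not borrowers:
                continue
            share = interbank_assets[bank] / len(borrowers)
            loss = share * sum(1 for b in borrowers if b in failed)
            if loss > capital[bank]:
                failed.add(bank)
                changed = True
    return failed


# Hypothetical five-bank network. Banks 1 and 2 each have a single borrower;
# bank 3 splits its lending between banks 0 and 4.
lends_to = {0: [], 1: [0], 2: [1], 3: [0, 4], 4: []}
interbank_assets = {b: 0.20 for b in lends_to}
capital = {b: 0.15 for b in lends_to}
failed = cascade(lends_to, interbank_assets, capital, seed=0)
```

In this toy network the concentrated lenders fail in sequence while the diversified bank 3 absorbs its smaller loss.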