-
Bayesian Calibration and Uncertainty Quantification of a Rate-dependent Cohesive Zone Model for Polymer Interfaces
Authors:
Ponkrshnan Thiagarajan,
Trisha Sain,
Susanta Ghosh
Abstract:
In the present work, a rate-dependent cohesive zone model for the fracture of polymeric interfaces is presented. Inverse calibration of parameters for such complex models through trial and error is computationally tedious due to the large number of parameters and the associated high computational cost. The obtained parameter values are often non-unique, and the calibration inherits higher uncertainty when the available experimental data is limited. To alleviate these difficulties, a Bayesian calibration approach is used for the proposed rate-dependent cohesive zone model in this work. The proposed cohesive zone model accounts for both reversible elastic and irreversible rate-dependent separation sliding deformation at the interface. The viscous dissipation due to the irreversible opening at the interface is modeled using elastic-viscoplastic kinematics that incorporates the effects of strain rate. To quantify the uncertainty associated with the inverse parameter estimation, a modular Bayesian approach is employed to calibrate the unknown model parameters, accounting for the parameter uncertainty of the cohesive zone model. Further, to quantify the model uncertainties, such as incorrect assumptions or missing physics, a discrepancy function is introduced and approximated as a Gaussian process. The improvement in the model predictions following the introduction of a discrepancy function is demonstrated, justifying the need for a discrepancy term. Finally, the overall uncertainty of the model is quantified in a predictive setting and the results are provided as confidence intervals. A sensitivity analysis is also performed to understand the effect of the variability of the inputs on the nature of the output.
Submitted 13 November, 2023;
originally announced November 2023.
-
Open Source Infrastructure for Differentiable Density Functional Theory
Authors:
Advika Vidhyadhiraja,
Arun Pa Thiagarajan,
Shang Zhu,
Venkat Viswanathan,
Bharath Ramsundar
Abstract:
Learning exchange correlation functionals, used in quantum chemistry calculations, from data has become increasingly important in recent years, but training such a functional requires sophisticated software infrastructure. For this reason, we build open source infrastructure to train neural exchange correlation functionals. We aim to standardize the processing pipeline by adapting state-of-the-art techniques from work done by multiple groups. We have open sourced the model in the DeepChem library to provide a platform for additional research on differentiable quantum chemistry methods.
Submitted 27 September, 2023;
originally announced September 2023.
-
Electronic Structure Prediction of Multi-million Atom Systems Through Uncertainty Quantification Enabled Transfer Learning
Authors:
Shashank Pathrudkar,
Ponkrshnan Thiagarajan,
Shivang Agarwal,
Amartya S. Banerjee,
Susanta Ghosh
Abstract:
The ground state electron density -- obtainable using Kohn-Sham Density Functional Theory (KS-DFT) simulations -- contains a wealth of material information, making its prediction via machine learning (ML) models attractive. However, the computational expense of KS-DFT scales cubically with system size, which tends to stymie training data generation, making it difficult to develop quantifiably accurate ML models that are applicable across many scales and system configurations. Here, we address this fundamental challenge by employing transfer learning to leverage the multi-scale nature of the training data, while comprehensively sampling system configurations using thermalization. Our ML models are less reliant on heuristics and, being based on Bayesian neural networks, enable uncertainty quantification. We show that our models incur significantly lower data generation costs while allowing confident -- and when verifiable, accurate -- predictions for a wide variety of bulk systems well beyond training, including systems with defects, different alloy compositions, and at unprecedented, multi-million-atom scales. Moreover, such predictions can be carried out using only modest computational resources.
Submitted 1 May, 2024; v1 submitted 24 August, 2023;
originally announced August 2023.
-
Jensen-Shannon Divergence Based Novel Loss Functions for Bayesian Neural Networks
Authors:
Ponkrshnan Thiagarajan,
Susanta Ghosh
Abstract:
The Kullback-Leibler (KL) divergence is widely used in state-of-the-art Bayesian Neural Networks (BNNs) to approximate the posterior distribution of weights. However, the KL divergence is unbounded and asymmetric, which may lead to instabilities during optimization or may yield poor generalization. To overcome these limitations, we examine the Jensen-Shannon (JS) divergence that is bounded, symmetric, and more general. To this end, we propose two novel loss functions for BNNs. The first loss function uses the geometric JS divergence (JS-G) that is symmetric, unbounded, and offers an analytical expression for Gaussian priors. The second loss function uses the generalized JS divergence (JS-A) that is symmetric and bounded. We show that the conventional KL divergence-based loss function is a special case of the two loss functions presented in this work. To evaluate the divergence part of the loss, we use analytical expressions for JS-G and Monte Carlo methods for JS-A. We provide algorithms to optimize the loss function using both these methods. The proposed loss functions offer additional parameters that can be tuned to control the degree of regularization. The regularization performance of the JS divergences is analyzed to demonstrate their superiority over the state-of-the-art. Further, we derive the conditions under which the proposed JS-G divergence-based loss function regularizes better than the KL divergence-based loss function. Bayesian convolutional neural networks (BCNN) based on the proposed JS divergences perform better than the state-of-the-art BCNN, which is shown for the classification of the CIFAR data set having various degrees of noise and a histopathology data set having a high bias.
Submitted 8 February, 2023; v1 submitted 22 September, 2022;
originally announced September 2022.
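The contrast between the KL and JS divergences drawn in the abstract above can be illustrated with a minimal sketch. It assumes discrete distributions given as lists of probabilities and uses the standard (arithmetic-mean) JS divergence; it is not the JS-G or JS-A loss from the paper, and the function names `kl_divergence` and `js_divergence` are illustrative only.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in nats for two discrete distributions of equal length."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue  # terms with zero mass in p contribute nothing
        if qi == 0.0:
            return math.inf  # p has mass where q has none: unbounded
        total += pi * math.log(pi / qi)
    return total


def js_divergence(p, q):
    """Standard JS divergence: average KL of p and q to their mixture m."""
    m = [(pi + qi) / 2.0 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


if __name__ == "__main__":
    # Illustrative distributions with partially disjoint support.
    p = [0.5, 0.5, 0.0]
    q = [0.0, 0.5, 0.5]
    print("KL(p||q) =", kl_divergence(p, q))  # infinite for this pair
    print("JS(p,q)  =", js_divergence(p, q))  # finite, at most log(2)
    print("JS(q,p)  =", js_divergence(q, p))  # equal to JS(p,q)
```

With natural logarithms, the standard JS divergence never exceeds log(2), whereas KL diverges as soon as one distribution assigns mass where the other has none.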
-
Explanation and Use of Uncertainty Quantified by Bayesian Neural Network Classifiers for Breast Histopathology Images
Authors:
Ponkrshnan Thiagarajan,
Pushkar Khairnar,
Susanta Ghosh
Abstract:
Despite the promise of convolutional neural network (CNN) based classification models for histopathological images, it is infeasible to quantify their uncertainties. Moreover, CNNs may suffer from overfitting when the data is biased. We show that Bayesian-CNN can overcome these limitations by regularizing automatically and by quantifying the uncertainty. We have developed a novel technique to utilize the uncertainties provided by the Bayesian-CNN that significantly improves the performance on a large fraction of the test data (about 6% improvement in accuracy on 77% of test data). Further, we provide a novel explanation for the uncertainty by projecting the data into a low dimensional space through a nonlinear dimensionality reduction technique. This dimensionality reduction enables interpretation of the test data through visualization and reveals the structure of the data in a low dimensional feature space. We show that the Bayesian-CNN can perform much better than the state-of-the-art transfer learning CNN (TL-CNN) by reducing the false negatives and false positives by 11% and 7.7%, respectively, for the present data set. It achieves this performance with only 1.86 million parameters as compared to 134.33 million for TL-CNN. In addition, we modify the Bayesian-CNN by introducing a stochastic adaptive activation function. The modified Bayesian-CNN performs slightly better than Bayesian-CNN on all performance metrics and significantly reduces the number of false negatives and false positives (3% reduction for both). We also show that these results are statistically significant by performing McNemar's statistical significance test. This work shows the advantages of Bayesian-CNN over the state-of-the-art, and explains and utilizes the uncertainties for histopathological images. It should find applications in various medical image classifications.
Submitted 5 November, 2021; v1 submitted 7 October, 2020;
originally announced October 2020.
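The abstract above reports McNemar's test for comparing two classifiers on the same test set. A minimal sketch of that test follows, assuming the chi-square form without continuity correction; the abstract does not say which variant the authors used, and the counts in the usage example are hypothetical, not results from the paper.

```python
import math


def mcnemar_test(b, c):
    """McNemar's test on the discordant counts of two paired classifiers.

    b: samples classifier A got right and classifier B got wrong
    c: samples classifier A got wrong and classifier B got right
    Returns (statistic, p_value) using the chi-square approximation
    with one degree of freedom and no continuity correction.
    """
    if b + c == 0:
        return 0.0, 1.0  # no discordant pairs: no evidence of a difference
    statistic = (b - c) ** 2 / (b + c)
    # Survival function of chi-square with 1 dof: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


if __name__ == "__main__":
    # Hypothetical discordant counts, for illustration only.
    stat, p = mcnemar_test(b=15, c=5)
    print(f"statistic = {stat:.3f}, p-value = {p:.4f}")
```

Only the discordant pairs enter the statistic, which is why the test suits two models evaluated on the same samples. For small discordant counts, an exact binomial version is usually preferred over the chi-square approximation.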
-
Statistical Model Checking based Analysis of Biological Networks
Authors:
Bing Liu,
Benjamin M. Gyori,
P. S. Thiagarajan
Abstract:
We introduce a framework for analyzing ordinary differential equation (ODE) models of biological networks using statistical model checking (SMC). A key aspect of our work is the modeling of single-cell variability by assigning a probability distribution to intervals of initial concentration values and kinetic rate constants. We propagate this distribution through the system dynamics to obtain a distribution over the set of trajectories of the ODEs. This in turn opens the door for performing statistical analysis of the ODE system's behavior. To illustrate this, we first encode quantitative data and qualitative trends as bounded linear time temporal logic (BLTL) formulas. Based on this, we construct a parameter estimation method using an SMC-driven evaluation procedure applied to the stochastic version of the behavior of the ODE system. We then describe how this SMC framework can be generalized to hybrid automata by exploiting the given distribution over the initial states and the--much more sophisticated--system dynamics to associate a Markov chain with the hybrid automaton. We then establish a strong relationship between the behaviors of the hybrid automaton and its associated Markov chain. Consequently, we sample trajectories from the hybrid automaton in a way that mimics the sampling of the trajectories of the Markov chain. This enables us to verify approximately that the Markov chain meets a BLTL specification with high probability. We have applied these methods to ODE-based models of Toll-like receptor signaling and the crosstalk between autophagy and apoptosis, as well as to systems exhibiting hybrid dynamics including the circadian clock pathway and cardiac cell physiology. We present an overview of these applications and summarize the main empirical results. These case studies demonstrate that our methods can be applied in a variety of practical settings.
Submitted 3 December, 2018;
originally announced December 2018.
-
Approximate probabilistic verification of hybrid systems
Authors:
Benjamin M. Gyori,
Bing Liu,
Soumya Paul,
R. Ramanathan,
P. S. Thiagarajan
Abstract:
Hybrid systems whose mode dynamics are governed by non-linear ordinary differential equations (ODEs) are often a natural model for biological processes. However, such models are difficult to analyze. To address this, we develop a probabilistic analysis method by approximating the mode transitions as stochastic events. We assume that the probability of making a mode transition is proportional to the measure of the set of pairs of time points and value states at which the mode transition is enabled. To ensure a sound mathematical basis, we impose a natural continuity property on the non-linear ODEs. We also assume that the states of the system are observed at discrete time points but that the mode transitions may take place at any time between two successive discrete time points. This leads to a discrete time Markov chain as a probabilistic approximation of the hybrid system. We then show that for BLTL (bounded linear time temporal logic) specifications the hybrid system meets a specification iff its Markov chain approximation meets the same specification with probability $1$. Based on this, we formulate a sequential hypothesis testing procedure for verifying, approximately, that the Markov chain meets a BLTL specification with high probability. Our case studies on cardiac cell dynamics and the circadian rhythm indicate that our scheme can be applied in a number of realistic settings.
Submitted 19 June, 2015; v1 submitted 22 December, 2014;
originally announced December 2014.
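The sequential hypothesis testing mentioned in the abstract above can be sketched with Wald's sequential probability ratio test (SPRT), a common choice in statistical model checking. The abstract does not spell out the authors' exact procedure, so this is an assumption. The callable `sample` is a hypothetical stand-in for simulating one trajectory and checking it against a BLTL formula, and `theta`, `delta`, `alpha`, and `beta` are illustrative parameter names.

```python
import math
import random


def sprt(sample, theta, delta, alpha, beta, max_samples=100_000):
    """Wald's SPRT for a Bernoulli success probability p.

    H0: p >= theta + delta   versus   H1: p <= theta - delta
    alpha bounds the chance of accepting H1 when H0 holds,
    beta bounds the chance of accepting H0 when H1 holds.
    Returns (decision, number_of_samples_used).
    """
    p0 = theta + delta
    p1 = theta - delta
    accept_h1 = math.log((1.0 - beta) / alpha)
    accept_h0 = math.log(beta / (1.0 - alpha))
    llr = 0.0  # log of P(observations | p1) / P(observations | p0)
    for n in range(1, max_samples + 1):
        if sample():
            llr += math.log(p1 / p0)
        else:
            llr += math.log((1.0 - p1) / (1.0 - p0))
        if llr >= accept_h1:
            return "H1", n
        if llr <= accept_h0:
            return "H0", n
    return "undecided", max_samples


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical system whose trajectories satisfy the property 97% of the time.
    decision, used = sprt(lambda: rng.random() < 0.97,
                          theta=0.9, delta=0.05, alpha=0.01, beta=0.01)
    print(decision, used)
```

The test stops as soon as the accumulated evidence crosses either threshold, so the number of simulated trajectories adapts to how far the true probability lies from `theta`. Requirements such as `0 < theta - delta` and `theta + delta < 1` are assumed rather than checked here.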
-
Distributed Markov Chains
Authors:
Sumit Kumar Jha,
Madhavan Mukund,
Ratul Saha,
P S Thiagarajan
Abstract:
The formal verification of large probabilistic models is important and challenging. Exploiting the concurrency that is often present is one way to address this problem. Here we study a restricted class of asynchronous distributed probabilistic systems in which the synchronizations determine the probability distribution for the next moves of the participating agents. The key restriction we impose is that the synchronizations are deterministic, in the sense that any two simultaneously enabled synchronizations must involve disjoint sets of agents. As a result, this network of agents can be viewed as a succinct and distributed presentation of a large global Markov chain. A rich class of Markov chains can be represented this way.
We define an interleaved semantics for our model in terms of the local synchronization actions. The network structure induces an independence relation on these actions, which, in turn, induces an equivalence relation over the interleaved runs in the usual way. We construct a natural probability measure over these equivalence classes of runs by exploiting Mazurkiewicz trace theory and the probability measure space of the associated global Markov chain.
It turns out that verification of our model, called DMCs (distributed Markov chains), can often be efficiently carried out by exploiting the partial order nature of the interleaved semantics. To demonstrate this, we develop a statistical model checking (SMC) procedure and use it to verify two large distributed probabilistic networks.
Submitted 5 August, 2014;
originally announced August 2014.