-
Covariance Expressions for Multi-Fidelity Sampling with Multi-Output, Multi-Statistic Estimators: Application to Approximate Control Variates
Authors:
Thomas O. Dixon,
James E. Warner,
Geoffrey F. Bomarito,
Alex A. Gorodetsky
Abstract:
We provide a collection of results on covariance expressions between Monte Carlo based multi-output mean, variance, and Sobol main effect variance estimators from an ensemble of models. These covariances can be used within multi-fidelity uncertainty quantification strategies that seek to reduce the estimator variance of high-fidelity Monte Carlo estimators with an ensemble of low-fidelity models. Such covariance expressions are required within approaches like the approximate control variate and multi-level best linear unbiased estimator. While the literature provides these expressions for some single-output cases such as mean and variance, our results are relevant to both multiple function outputs and multiple statistics across any sampling strategy. Following the description of these results, we use them within an approximate control variate scheme to show that leveraging multiple outputs can dramatically reduce estimator variance compared to single-output approaches. Synthetic examples are used to highlight the effects of optimal sample allocation and pilot sample estimation. A flight-trajectory simulation of entry, descent, and landing is used to demonstrate multi-output estimation in practical applications.
Submitted 27 June, 2024; v1 submitted 29 September, 2023;
originally announced October 2023.
-
Automated Learning of Interpretable Models with Quantified Uncertainty
Authors:
G. F. Bomarito,
P. E. Leser,
N. C. M Strauss,
K. M. Garbrecht,
J. D. Hochhalter
Abstract:
Interpretability and uncertainty quantification in machine learning can provide justification for decisions, promote scientific discovery and lead to a better understanding of model behavior. Symbolic regression provides inherently interpretable machine learning, but relatively little work has focused on the use of symbolic regression on noisy data and the accompanying necessity to quantify uncertainty. A new Bayesian framework for genetic-programming-based symbolic regression (GPSR) is introduced that uses model evidence (i.e., marginal likelihood) to formulate replacement probability during the selection phase of evolution. Model parameter uncertainty is automatically quantified, enabling probabilistic predictions with each equation produced by the GPSR algorithm. Model evidence is also quantified in this process, and its use is shown to increase interpretability, improve robustness to noise, and reduce overfitting when compared to a conventional GPSR implementation on both numerical and physical experiments.
Submitted 12 April, 2022;
originally announced May 2022.
-
Entropy-based adaptive design for contour finding and estimating reliability
Authors:
D. Austin Cole,
Robert B. Gramacy,
James E. Warner,
Geoffrey F. Bomarito,
Patrick E. Leser,
William P. Leser
Abstract:
In reliability analysis, methods used to estimate failure probability are often limited by the costs associated with model evaluations. Many of these methods, such as multifidelity importance sampling (MFIS), rely upon a computationally efficient surrogate model, such as a Gaussian process (GP), to quickly generate predictions. The quality of the GP fit, particularly in the vicinity of the failure region(s), is instrumental in supplying accurately predicted failures for such strategies. We introduce an entropy-based GP adaptive design that, when paired with MFIS, provides more accurate failure probability estimates with higher confidence. We show that our greedy data acquisition strategy better identifies multiple failure regions compared to existing contour-finding schemes. We then extend the method to batch selection without sacrificing accuracy. Illustrative examples are provided on benchmark data as well as an application to an impact damage simulator for National Aeronautics and Space Administration (NASA) spacesuits.
Submitted 24 October, 2021; v1 submitted 24 May, 2021;
originally announced May 2021.
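The abstract above builds on importance sampling for small failure probabilities. As a rough illustration only, the sketch below estimates a Gaussian tail probability with plain importance sampling. It is not the MFIS method or the entropy-based design from the paper: the shifted-normal proposal, the function names, and the threshold are all assumptions chosen so that the exact answer is available for comparison.

```python
import math
import numpy as np


def failure_probability_is(threshold, n_samples, rng):
    """Estimate P(X > threshold) for X ~ N(0, 1) by importance sampling.

    Assumption: the biasing density is N(threshold, 1), a hand-picked
    proposal centred on the failure boundary (no surrogate model involved).
    """
    x = rng.normal(loc=threshold, scale=1.0, size=n_samples)
    # Log of the likelihood ratio phi(x) / phi(x - threshold).
    log_weight = -threshold * x + 0.5 * threshold**2
    indicator = x > threshold
    return float(np.mean(indicator * np.exp(log_weight)))


def exact_tail(threshold):
    """Exact P(X > threshold) for a standard normal, used as a reference."""
    return 0.5 * math.erfc(threshold / math.sqrt(2.0))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    print(failure_probability_is(4.0, 100_000, rng), exact_tail(4.0))
```

Sampling directly from N(0, 1) would need millions of draws to see even a few exceedances of 4; centring the proposal on the boundary is what makes the estimate usable at this sample size.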
-
On the Optimization of Approximate Control Variates with Parametrically Defined Estimators
Authors:
Geoffrey F. Bomarito,
Patrick E. Leser,
James E. Warner,
William P. Leser
Abstract:
Multi-model Monte Carlo methods, such as multi-level Monte Carlo (MLMC) and multifidelity Monte Carlo (MFMC), allow for efficient estimation of the expectation of a quantity of interest given a set of models of varying fidelities. Recently, it was shown that the MLMC and MFMC estimators are both instances of the approximate control variates (ACV) framework [Gorodetsky et al. 2020]. In that same work, it was also shown that hand-tailored ACV estimators could outperform MLMC and MFMC for a variety of model scenarios. Because there is no reason to believe that these hand-tailored estimators are the best among a myriad of possible ACV estimators, a more general approach to estimator construction is pursued in this work. First, a general form of the ACV estimator variance is formulated. Then, the formulation is utilized to generate parametrically-defined estimators. These parametrically-defined estimators allow for an optimization to be pursued over a larger domain of possible ACV estimators. The parametrically-defined estimators are tested on a large set of model scenarios, and it is found that the broader search domain enabled by parametrically-defined estimators leads to greater variance reduction.
Submitted 4 December, 2020;
originally announced December 2020.
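The approximate control variate idea referenced in the abstract above can be sketched for the simplest case of one high-fidelity and one low-fidelity model. This is a minimal illustration under stated assumptions, not the parametrically defined estimators of the paper: the function name, the uniform input distribution, the sample allocation, and the use of the classic weight cov/var estimated from the shared samples are all hypothetical choices.

```python
import numpy as np


def control_variate_mean(f_hi, f_lo, n_hi, n_lo, rng):
    """Two-model approximate control variate estimate of E[f_hi(X)], X ~ U(0, 1).

    Assumptions: both models are evaluated on n_hi shared inputs, and the
    low-fidelity model is also evaluated on n_lo extra inputs to approximate
    its unknown mean.
    """
    x_shared = rng.uniform(size=n_hi)
    x_extra = rng.uniform(size=n_lo)

    y_hi = f_hi(x_shared)
    y_lo = f_lo(x_shared)

    # Classic control variate weight, estimated from the shared samples.
    # Estimating it from the same data introduces a small bias.
    cov = np.cov(y_hi, y_lo)
    alpha = cov[0, 1] / cov[1, 1]

    # Approximate low-fidelity mean from all available low-fidelity samples.
    mu_lo = np.concatenate([y_lo, f_lo(x_extra)]).mean()

    # Correct the high-fidelity sample mean with the low-fidelity discrepancy.
    return float(y_hi.mean() - alpha * (y_lo.mean() - mu_lo))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy models: exp(x) as "high fidelity", its linearisation as "low fidelity".
    print(control_variate_mean(np.exp, lambda x: 1.0 + x, 50, 5000, rng))
```

Because the low-fidelity mean is itself estimated rather than known, the achievable variance reduction depends on how many extra low-fidelity samples are available, which is the sample-allocation question the abstract refers to.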
-
Inverse Estimation of Elastic Modulus Using Physics-Informed Generative Adversarial Networks
Authors:
James E. Warner,
Julian Cuevas,
Geoffrey F. Bomarito,
Patrick E. Leser,
William P. Leser
Abstract:
While standard generative adversarial networks (GANs) rely solely on training data to learn unknown probability distributions, physics-informed GANs (PI-GANs) encode physical laws in the form of stochastic partial differential equations (PDEs) using automatic differentiation. By relating observed data to unobserved quantities of interest through PDEs, PI-GANs allow for the estimation of underlying probability distributions without their direct measurement (i.e., inverse problems). The scalable nature of GANs allows high-dimensional, spatially-dependent probability distributions (i.e., random fields) to be inferred, while incorporating prior information through PDEs allows the training datasets to be relatively small.
In this work, PI-GANs are demonstrated for the application of elastic modulus estimation in mechanical testing. Given measured deformation data, the underlying probability distribution of spatially-varying elastic modulus (stiffness) is learned. Two feed-forward deep neural network generators are used to model the deformation and material stiffness across a two-dimensional domain. Wasserstein GANs with gradient penalty are employed for enhanced stability. In the absence of explicit training data, it is demonstrated that the PI-GAN learns to generate realistic, physically-admissible realizations of material stiffness by incorporating the PDE that relates it to the measured deformation. It is shown that the statistics (mean, standard deviation, point-wise distributions, correlation length) of these generated stiffness samples agree well with the true distribution.
Submitted 20 May, 2020;
originally announced June 2020.