Search | arXiv e-print repository

arXiv:2407.00761

Improving the performance of Stein variational inference through extreme sparsification of physically-constrained neural network models

Authors: Govinda Anantha Padmanabha, Jan Niklas Fuhg, Cosmin Safta, Reese E. Jones, Nikolaos Bouklas

Abstract: Most scientific machine learning (SciML) applications of neural networks involve hundreds to thousands of parameters, and hence, uncertainty quantification for such models is plagued by the curse of dimensionality. Using physical applications, we show that $L_0$ sparsification prior to Stein variational gradient descent ($L_0$+SVGD) is a more robust and efficient means of uncertainty quantification, in terms of computational cost and performance, than the direct application of SVGD or projected SVGD methods. Specifically, $L_0$+SVGD demonstrates superior resilience to noise, the ability to perform well in extrapolated regions, and a faster convergence rate to an optimal solution.

Submitted 30 June, 2024; originally announced July 2024.

Comments: 30 pages, 11 figures

arXiv:2406.19524

Bayesian calibration of stochastic agent based model via random forest

Authors: Connor Robertson, Cosmin Safta, Nicholson Collier, Jonathan Ozik, Jaideep Ray

Abstract: Agent-based models (ABM) provide an excellent framework for modeling outbreaks and interventions in epidemiology by explicitly accounting for diverse individual interactions and environments. However, these models are usually stochastic and highly parametrized, requiring precise calibration for predictive performance. When considering realistic numbers of agents and properly accounting for stochasticity, this high dimensional calibration can be computationally prohibitive. This paper presents a random forest based surrogate modeling technique to accelerate the evaluation of ABMs and demonstrates its use to calibrate an epidemiological ABM named CityCOVID via Markov chain Monte Carlo (MCMC). The technique is first outlined in the context of CityCOVID's quantities of interest, namely hospitalizations and deaths, by exploring dimensionality reduction via temporal decomposition with principal component analysis (PCA) and via sensitivity analysis. The calibration problem is then presented and samples are generated to best match COVID-19 hospitalization and death numbers in Chicago from March to June in 2020. These results are compared with previous approximate Bayesian calibration (IMABC) results and their predictive performance is analyzed showing improved performance with a reduction in computation.

Submitted 27 June, 2024; originally announced June 2024.

arXiv:2406.17119

Accelerating Phase Field Simulations Through a Hybrid Adaptive Fourier Neural Operator with U-Net Backbone

Authors: Christophe Bonneville, Nathan Bieberdorf, Arun Hegde, Mark Asta, Habib N. Najm, Laurent Capolungo, Cosmin Safta

Abstract: Prolonged contact between a corrosive liquid and metal alloys can cause progressive dealloying. For such liquid-metal dealloying (LMD) processes, phase field models have been developed. However, the governing equations often involve coupled non-linear partial differential equations (PDE), which are challenging to solve numerically. In particular, stiffness in the PDEs requires extremely small time steps (e.g. $10^{-12}$ or smaller). This computational bottleneck is especially problematic when running LMD simulation until a late time horizon is required. This motivates the development of surrogate models capable of leaping forward in time, by skipping several consecutive time steps at once. In this paper, we propose U-Shaped Adaptive Fourier Neural Operators (U-AFNO), a machine learning (ML) model inspired by recent advances in neural operator learning. U-AFNO employs U-Nets for extracting and reconstructing local features within the physical fields, and passes the latent space through a vision transformer (ViT) implemented in the Fourier space (AFNO). We use U-AFNOs to learn the dynamics mapping the field at a current time step into a later time step. We also identify global quantities of interest (QoI) describing the corrosion process (e.g. the deformation of the liquid-metal interface) and show that our proposed U-AFNO model is able to accurately predict the field dynamics, in spite of the chaotic nature of LMD. Our model reproduces the key micro-structure statistics and QoIs with a level of accuracy on par with the high-fidelity numerical solver.
We also investigate the opportunity of using hybrid simulations, in which we alternate forward leaps in time using the U-AFNO with high-fidelity time stepping. We demonstrate that while advantageous for some surrogate model design choices, our proposed U-AFNO model in fully auto-regressive settings consistently outperforms hybrid schemes.

Submitted 24 June, 2024; originally announced June 2024.

arXiv:2404.17584

Equivariant graph convolutional neural networks for the representation of homogenized anisotropic microstructural mechanical response

Authors: Ravi Patel, Cosmin Safta, Reese E. Jones

Abstract: Composite materials with different microstructural material symmetries are common in engineering applications where grain structure, alloying and particle/fiber packing are optimized via controlled manufacturing. In fact, these microstructural tunings can be done throughout a part to achieve functional gradation and optimization at a structural level. To predict the performance of a particular microstructural configuration and thereby overall performance, constitutive models of materials with microstructure are needed. In this work we provide neural network architectures that provide effective homogenization models of materials with anisotropic components. These models satisfy equivariance and material symmetry principles inherently through a combination of equivariant and tensor basis operations. We demonstrate them on datasets of stochastic volume elements with different textures and phases where the material undergoes elastic and plastic deformation, and show that these network architectures provide significant performance improvements.

Submitted 5 April, 2024; originally announced April 2024.

Comments: 23 pages, 10 figures

arXiv:2402.11179

Uncertainty Quantification of Graph Convolution Neural Network Models of Evolving Processes

Authors: Jeremiah Hauth, Cosmin Safta, Xun Huan, Ravi G. Patel, Reese E. Jones

Abstract: The application of neural network models to scientific machine learning tasks has proliferated in recent years. In particular, neural network models have proved to be adept at modeling processes with spatial-temporal complexity. Nevertheless, these highly parameterized models have garnered skepticism in their ability to produce outputs with quantified error bounds over the regimes of interest. Hence there is a need to find uncertainty quantification methods that are suitable for neural networks. In this work we present comparisons of the parametric uncertainty quantification of neural networks modeling complex spatial-temporal processes with Hamiltonian Monte Carlo and Stein variational gradient descent and its projected variant. Specifically we apply these methods to graph convolutional neural network models of evolving systems modeled with recurrent neural network and neural ordinary differential equations architectures. We show that Stein variational inference is a viable alternative to Monte Carlo methods with some clear advantages for complex neural network models. For our exemplars, Stein variational inference gave similar uncertainty profiles through time compared to Hamiltonian Monte Carlo, albeit with generally more generous variance. Projected Stein variational gradient descent also produced similar uncertainty profiles to the non-projected counterpart, but large reductions in the active weight space were confounded by the stability of the neural network predictions and the convoluted likelihood landscape.

Submitted 16 February, 2024; originally announced February 2024.

Comments: 27 pages, 20 figures

arXiv:2312.04648

Enhancing Polynomial Chaos Expansion Based Surrogate Modeling using a Novel Probabilistic Transfer Learning Strategy

Authors: Wyatt Bridgman, Uma Balakrishnan, Reese Jones, Jiefu Chen, Xuqing Wu, Cosmin Safta, Yueqin Huang, Mohammad Khalil

Abstract: In the field of surrogate modeling, polynomial chaos expansion (PCE) allows practitioners to construct inexpensive yet accurate surrogates to be used in place of the expensive forward model simulations. For black-box simulations, non-intrusive PCE allows the construction of these surrogates using a set of simulation response evaluations. In this context, the PCE coefficients can be obtained using linear regression, which is also known as point collocation or stochastic response surfaces. Regression exhibits better scalability and can handle noisy function evaluations in contrast to other non-intrusive approaches, such as projection. However, since over-sampling is generally advisable for the linear regression approach, the simulation requirements become prohibitive for expensive forward models. We propose to leverage transfer learning whereby knowledge gained through similar PCE surrogate construction tasks (source domains) is transferred to a new surrogate-construction task (target domain) which has a limited number of forward model simulations (training data). The proposed transfer learning strategy determines how much, if any, information to transfer using new techniques inspired by Bayesian modeling and data assimilation. The strategy is scrutinized using numerical investigations and applied to an engineering problem from the oil and gas industry.

Submitted 7 December, 2023; originally announced December 2023.

arXiv:2210.00854

Deep learning and multi-level featurization of graph representations of microstructural data

Authors: Reese Jones, Cosmin Safta, Ari Frankel

Abstract: Many material response functions depend strongly on microstructure, such as inhomogeneities in phase or orientation. Homogenization presents the task of predicting the mean response of a sample of the microstructure to external loading for use in subgrid models and structure-property explorations. Although many microstructural fields have obvious segmentations, learning directly from the graph induced by the segmentation can be difficult because this representation does not encode all the information of the full field. We develop a means of deep learning of hidden features on the reduced graph given the native discretization and a segmentation of the initial input field. The features are associated with regions represented as nodes on the reduced graph. This reduced representation is then the basis for the subsequent multi-level/scale graph convolutional network model. There are a number of advantages of reducing the graph before fully processing it with convolutional layers, such as interpretable features and efficiency on large meshes. We demonstrate the performance of the proposed network relative to convolutional neural networks operating directly on the native discretization of the data using three physical exemplars.

Submitted 29 September, 2022; originally announced October 2022.

Comments: 27 pages, 17 figures

arXiv:2107.00090

Mesh-based graph convolutional neural networks for modeling materials with microstructure

Authors: Ari Frankel, Cosmin Safta, Coleman Alleman, Reese Jones

Abstract: Predicting the evolution of a representative sample of a material with microstructure is a fundamental problem in homogenization. In this work we propose a graph convolutional neural network that utilizes the discretized representation of the initial microstructure directly, without segmentation or clustering. Compared to feature-based and pixel-based convolutional neural network models, the proposed method has a number of advantages: (a) it is deep in that it does not require featurization but can benefit from it, (b) it has a simple implementation with standard convolutional filters and layers, (c) it works natively on unstructured and structured grid data without interpolation (unlike pixel-based convolutional neural networks), and (d) it preserves rotational invariance like other graph-based convolutional neural networks. We demonstrate the performance of the proposed network and compare it to traditional pixel-based convolutional neural network models and feature-based graph convolutional neural networks on multiple large datasets.

Submitted 29 November, 2021; v1 submitted 3 June, 2021; originally announced July 2021.

Comments: 45 pages, 19 figures

arXiv:2006.09319

doi 10.1615/JMachLearnModelComput.2020035155

A Survey of Constrained Gaussian Process Regression: Approaches and Implementation Challenges

Authors: Laura Swiler, Mamikon Gulian, Ari Frankel, Cosmin Safta, John Jakeman

Abstract: Gaussian process regression is a popular Bayesian framework for surrogate modeling of expensive data sources. As part of a broader effort in scientific machine learning, many recent works have incorporated physical constraints or other a priori information within Gaussian process regression to supplement limited data and regularize the behavior of the model. We provide an overview and survey of several classes of Gaussian process constraints, including positivity or bound constraints, monotonicity and convexity constraints, differential equation constraints provided by linear PDEs, and boundary condition constraints. We compare the strategies behind each approach as well as the differences in implementation, concluding with a discussion of the computational challenges introduced by constraints.

Submitted 6 January, 2021; v1 submitted 16 June, 2020; originally announced June 2020.

Comments: 42 pages, 3 figures. Version 3: DOI & Reference added; appeared in Journal of Machine Learning for Modeling and Computing. Version 2 includes minor additions, clarifications and improvements to notation

Journal ref: Journal of Machine Learning for Modeling and Computing, 1(2):119-156 (2020)

arXiv:1707.09334

doi 10.1137/17M1141096

Compressive Sensing with Cross-Validation and Stop-Sampling for Sparse Polynomial Chaos Expansions

Authors: Xun Huan, Cosmin Safta, Khachik Sargsyan, Zachary P. Vane, Guilhem Lacaze, Joseph C. Oefelein, Habib N. Najm

Abstract: Compressive sensing is a powerful technique for recovering sparse solutions of underdetermined linear systems, which are often encountered in uncertainty quantification analysis of expensive and high-dimensional physical models. We perform numerical investigations employing several compressive sensing solvers that target the unconstrained LASSO formulation, with a focus on linear systems that arise in the construction of polynomial chaos expansions. With core solvers of l1_ls, SpaRSA, CGIST, FPC_AS, and ADMM, we develop techniques to mitigate overfitting through an automated selection of regularization constant based on cross-validation, and a heuristic strategy to guide the stop-sampling decision. Practical recommendations on parameter settings for these techniques are provided and discussed. The overall method is applied to a series of numerical examples of increasing complexity, including large eddy simulations of supersonic turbulent jet-in-crossflow involving a 24-dimensional input. Through empirical phase-transition diagrams and convergence plots, we illustrate sparse recovery performance under structures induced by polynomial chaos, accuracy and computational tradeoffs between polynomial bases of different degrees, and practicability of conducting compressive sensing for a realistic, high-dimensional physical application. Across test cases studied in this paper, we find ADMM to have demonstrated empirical advantages through consistently lower errors and faster computational times.

Submitted 26 June, 2018; v1 submitted 28 July, 2017; originally announced July 2017.

Comments: Preprint 29 pages, 16 figures (56 small figures); v1 submitted to the SIAM/ASA Journal on Uncertainty Quantification on July 28, 2017; v2 submitted on March 12, 2018. v2 changes: minor edits involving some content reorganization and clarification; v3 submitted on May 5, 2018. v3 changes: minor edits

MSC Class: 62J05; 94A12; 65Z05; 62P35

Journal ref: SIAM/ASA Journal on Uncertainty Quantification 6 (2018) 907-936

arXiv:1508.05176

Efficient Representation of Uncertainty for Stochastic Economic Dispatch

Authors: Cosmin Safta, Richard L. -Y. Chen, Habib N. Najm, Ali Pinar, Jean-Paul Watson

Abstract: Stochastic economic dispatch models address uncertainties in forecasts of renewable generation output by considering a finite number of realizations drawn from a stochastic process model, typically via Monte Carlo sampling. Accurate evaluations of expectations or higher-order moments for quantities of interest, e.g., generating cost, can require a prohibitively large number of samples. We propose an alternative to Monte Carlo sampling based on Polynomial Chaos expansions. These representations are based on sparse quadrature methods, and enable accurate propagation of uncertainties in model parameters. We also investigate a method based on Karhunen-Loeve expansions that enables us to efficiently represent uncertainties in renewable energy generation. Considering expected production cost, we demonstrate that the proposed approach can yield several orders of magnitude reduction in computational cost for solving stochastic economic dispatch relative to Monte Carlo sampling, for a given target error threshold.

Submitted 21 August, 2015; originally announced August 2015.

Comments: arXiv admin note: text overlap with arXiv:1407.2232

Showing 1–11 of 11 results for author: Safta, C