Search | arXiv e-print repository

arXiv:2310.04690

A dimension-reduced variational approach for solving physics-based inverse problems using generative adversarial network priors and normalizing flows

Authors: Agnimitra Dasgupta, Dhruv V Patel, Deep Ray, Erik A Johnson, Assad A Oberai

Abstract: We propose a novel modular inference approach combining two different generative models -- generative adversarial networks (GAN) and normalizing flows -- to approximate the posterior distribution of physics-based Bayesian inverse problems framed in high-dimensional ambient spaces. We dub the proposed framework GAN-Flow. The proposed method leverages the intrinsic dimension reduction and superior sample generation capabilities of GANs to define a low-dimensional data-driven prior distribution. Once a trained GAN-prior is available, the inverse problem is solved entirely in the latent space of the GAN using variational Bayesian inference with a normalizing flow-based variational distribution, which approximates the low-dimensional posterior distribution by transforming realizations from the low-dimensional latent prior (Gaussian) to corresponding realizations of a low-dimensional variational posterior distribution. The trained GAN generator then maps realizations from this approximate posterior distribution in the latent space back to the high-dimensional ambient space. We also propose a two-stage training strategy for GAN-Flow wherein we train the two generative models sequentially. Thereafter, GAN-Flow can estimate the statistics of posterior-predictive quantities of interest at virtually no additional computational cost.
The synergy between the two types of generative models allows us to overcome many challenges associated with the application of Bayesian inference to large-scale inverse problems, chief among which are describing an informative prior and sampling from the high-dimensional posterior. We demonstrate the efficacy and flexibility of GAN-Flow on various physics-based inverse problems of varying ambient dimensionality and prior knowledge using different types of GANs and normalizing flows.

Submitted 7 October, 2023; originally announced October 2023.

arXiv:2306.04895

Solution of physics-based inverse problems using conditional generative adversarial networks with full gradient penalty

Authors: Deep Ray, Javier Murgoitio-Esandi, Agnimitra Dasgupta, Assad A. Oberai

Abstract: The solution of probabilistic inverse problems for which the corresponding forward problem is constrained by physical principles is challenging. This is especially true if the dimension of the inferred vector is large and the prior information about it is in the form of a collection of samples. In this work, a novel deep learning based approach is developed and applied to solving these types of problems. The approach utilizes samples of the inferred vector drawn from the prior distribution and a physics-based forward model to generate training data for a conditional Wasserstein generative adversarial network (cWGAN). The cWGAN learns the probability distribution for the inferred vector conditioned on the measurement and produces samples from this distribution. The cWGAN developed in this work differs from earlier versions in that its critic is required to be 1-Lipschitz with respect to both the inferred and the measurement vectors and not just the former. This leads to a loss term with the full (and not partial) gradient penalty. It is shown that this rather simple change leads to a stronger notion of convergence for the conditional density learned by the cWGAN and a more robust and accurate sampling strategy. Through numerical examples it is shown that this change also translates to better accuracy when solving inverse problems. The numerical examples considered include illustrative problems where the true distribution and/or statistics are known, and a more complex inverse problem motivated by applications in biomechanics.

Submitted 7 June, 2023; originally announced June 2023.

Comments: 34 pages, 9 figures, 3 tables, 1 appendix

arXiv:2304.04862

A few-shot graph Laplacian-based approach for improving the accuracy of low-fidelity data

Authors: Orazio Pinti, Assad A. Oberai

Abstract: Low-fidelity data is typically inexpensive to generate but inaccurate. On the other hand, high-fidelity data is accurate but expensive to obtain. Multi-fidelity methods use a small set of high-fidelity data to enhance the accuracy of a large set of low-fidelity data. In the approach described in this paper, this is accomplished by constructing a graph Laplacian using the low-fidelity data and computing its low-lying spectrum. This spectrum is then used to cluster the data and identify points that are closest to the centroids of the clusters. High-fidelity data is then acquired for these key points. Thereafter, a transformation that maps every low-fidelity data point to its bi-fidelity counterpart is determined by minimizing the discrepancy between the bi- and high-fidelity data at the key points while preserving the underlying structure of the low-fidelity data distribution. The latter objective is achieved by relying, once again, on the spectral properties of the graph Laplacian. This method is applied to a problem in solid mechanics and another in aerodynamics. In both cases, this method uses a small fraction of high-fidelity data to significantly improve the accuracy of a large set of low-fidelity data.

Submitted 28 March, 2023; originally announced April 2023.

arXiv:2301.05427

Building a Fuel Moisture Model for the Coupled Fire-Atmosphere Model WRF-SFIRE from Data: From Kalman Filters to Recurrent Neural Networks

Authors: J. Mandel, J. Hirschi, A. K. Kochanski, A. Farguell, J. Haley, D. V. Mallia, B. Shaddy, A. A. Oberai, K. A. Hilburn

Abstract: The current fuel moisture content (FMC) subsystems in WRF-SFIRE and its workflow system WRFx use a time-lag differential equation model with assimilation of data from FMC sensors on Remote Automated Weather Stations (RAWS) by the extended augmented Kalman filter. But the quality of the result is constrained by the limitations of the model and of the Kalman filter. We observe that the data flow in a system consisting of a model and the Kalman filter can be interpreted to be the same as the data flow in a recurrent neural network (RNN). Thus, instead of building more sophisticated models and data assimilation methods, we want to train an RNN to approximate the dynamics of the response of the FMC sensor to a time series of environmental data. Because standard AI approaches did not converge to reasonable solutions, we pre-train the RNN with special initial weights devised to turn it into a numerical solver of the differential equation. We then allow the AI training machinery to optimize the RNN weights to fit the data better. We illustrate the method on an example of a time series of 10h-FMC from RAWS and weather data from the Real-Time Mesoscale Analysis (RTMA).

Submitted 13 January, 2023; originally announced January 2023.

Comments: 4 pages, 4 figures. Seminar on Numerical Analysis SNA'23, Ostrava, Czech Republic, January 23-27, 2023

MSC Class: 68T07; 86-10

Journal ref: https://www.ugn.cas.cz/event/2023/sna pp. 52-55

arXiv:2301.00942

Deep Learning and Computational Physics (Lecture Notes)

Authors: Deep Ray, Orazio Pinti, Assad A. Oberai

Abstract: These notes were compiled as lecture notes for a course developed and taught at the University of Southern California. They should be accessible to a typical engineering graduate student with a strong background in Applied Mathematics. The main objective of these notes is to introduce a student who is familiar with concepts in linear algebra and partial differential equations to select topics in deep learning. These lecture notes exploit the strong connections between deep learning algorithms and the more conventional techniques of computational physics to achieve two goals. First, they use concepts from computational physics to develop an understanding of deep learning algorithms. Not surprisingly, many concepts in deep learning can be connected to similar concepts in computational physics, and one can utilize this connection to better understand these algorithms. Second, several novel deep learning algorithms can be used to solve challenging problems in computational physics. Thus, they offer someone who is interested in modeling a physical phenomenon a complementary set of tools.

Submitted 2 January, 2023; originally announced January 2023.

Comments: 7 chapters

MSC Class: 68T07

arXiv:2209.12871

Variationally Mimetic Operator Networks

Authors: Dhruv Patel, Deep Ray, Michael R. A. Abdelmalik, Thomas J. R. Hughes, Assad A. Oberai

Abstract: In recent years operator networks have emerged as promising deep learning tools for approximating the solution to partial differential equations (PDEs). These networks map input functions that describe material properties, forcing functions and boundary data to the solution of a PDE. This work describes a new architecture for operator networks that mimics the form of the numerical solution obtained from an approximate variational or weak formulation of the problem. The application of these ideas to a generic elliptic PDE leads to a variationally mimetic operator network (VarMiON). Like the conventional Deep Operator Network (DeepONet) the VarMiON is also composed of a sub-network that constructs the basis functions for the output and another that constructs the coefficients for these basis functions. However, in contrast to the DeepONet, the architecture of these sub-networks in the VarMiON is precisely determined. An analysis of the error in the VarMiON solution reveals that it contains contributions from the error in the training data, the training error, the quadrature error in sampling input and output functions, and a "covering error" that measures the distance between the test input functions and the nearest functions in the training dataset. It also depends on the stability constants for the exact solution operator and its VarMiON approximation.
The application of the VarMiON to a canonical elliptic PDE and a nonlinear PDE reveals that for approximately the same number of network parameters, on average the VarMiON incurs smaller errors than a standard DeepONet and a recently proposed multiple-input operator network (MIONet). Further, its performance is more robust to variations in input functions, the techniques used to sample the input and output functions, the techniques used to construct the basis functions, and the number of input functions.

Submitted 29 August, 2023; v1 submitted 26 September, 2022; originally announced September 2022.

Comments: 49 pages, 18 figures, 1 Appendix

MSC Class: 65N99; 35J20

arXiv:2202.07773

The efficacy and generalizability of conditional GANs for posterior inference in physics-based inverse problems

Authors: Deep Ray, Harisankar Ramaswamy, Dhruv V. Patel, Assad A. Oberai

Abstract: In this work, we train conditional Wasserstein generative adversarial networks to effectively sample from the posterior of physics-based Bayesian inference problems. The generator is constructed using a U-Net architecture, with the latent information injected using conditional instance normalization. The former facilitates a multiscale inverse map, while the latter enables the decoupling of the latent space dimension from the dimension of the measurement, and introduces stochasticity at all scales of the U-Net. We solve PDE-based inverse problems to demonstrate the performance of our approach in quantifying the uncertainty in the inferred field. Further, we show the generator can learn inverse maps which are local in nature, which in turn promotes generalizability when testing with out-of-distribution samples.

Submitted 17 November, 2022; v1 submitted 15 February, 2022; originally announced February 2022.

MSC Class: 62F15; 68T07; 65M32

arXiv:2107.02926

doi 10.1016/j.cma.2022.115428

Solution of Physics-based Bayesian Inverse Problems with Deep Generative Priors

Authors: Dhruv V Patel, Deep Ray, Assad A Oberai

Abstract: Inverse problems are ubiquitous in nature, arising in almost all areas of science and engineering ranging from geophysics and climate science to astrophysics and biomechanics. One of the central challenges in solving inverse problems is tackling their ill-posed nature. Bayesian inference provides a principled approach for overcoming this by formulating the inverse problem into a statistical framework. However, it is challenging to apply when inferring fields that have discrete representations of large dimensions (the so-called "curse of dimensionality") and/or when prior information is available only in the form of previously acquired solutions. In this work, we present a novel method for efficient and accurate Bayesian inversion using deep generative models. Specifically, we demonstrate how using the approximate distribution learned by a Generative Adversarial Network (GAN) as a prior in a Bayesian update and reformulating the resulting inference problem in the low-dimensional latent space of the GAN enables the efficient solution of large-scale Bayesian inverse problems. Our statistical framework preserves the underlying physics and is demonstrated to yield accurate results with reliable uncertainty estimates, even in the absence of information about the underlying noise model, which is a significant challenge with many existing methods. We demonstrate the effectiveness of the proposed method on a variety of inverse problems which include both synthetic as well as experimentally observed data.

Submitted 25 July, 2022; v1 submitted 6 July, 2021; originally announced July 2021.

Comments: Paper: 38 pages, 12 figures, 3 Tables

arXiv:2103.02648

The Effect of Super-spreader Events in Epidemics

Authors: Harisankar Ramaswamy, Assad A Oberai, Mitul Luhar, Yannis C Yortsos

Abstract: The spread of infectious epidemics is often accelerated by super-spreader events. Understanding their effect is important, particularly in the context of standard epidemiological models, which require estimates for parameters such as $R_0$. In this letter, we show that the effective value of $R_0$ in super-spreader situations is significantly large, of the order of hundreds, suggesting a delta-function-like behavior during the event. Use of a well-mixed room model supports these findings. They elucidate infection kinetic modeling in enclosed environments, which differ from the standard SIR model, and provide expressions for $R_0$ in terms of physical and operational parameters. The overall impact of super-spreader events can be significant, depending on the state of the epidemic and how the infections generated by the event subsequently spread in the community.

Submitted 26 March, 2021; v1 submitted 3 March, 2021; originally announced March 2021.

MSC Class: 92-02

ACM Class: I.6.3

arXiv:2008.12766

A comprehensive spatial-temporal infection model

Authors: Harisankar Ramaswamy, Assad A Oberai, Yannis C Yortsos

Abstract: Motivated by analogies between the spreading of human-to-human infections and of chemical processes, we develop a comprehensive model that accounts both for infection and for transport. In this analogy, the three different populations of infection models correspond to three chemical species. Areal densities emerge as the key variables, thus capturing the effect of spatial density. We derive expressions for the kinetics of the infection rates and for the important parameter R0, that include areal density and its spatial distribution. Coupled with mobility the model allows the study of various effects. We first present results for a batch reactor, the chemical process equivalent of the SIR model. Because density makes R0 a decreasing function of the process extent, the infection curves are different and smaller than for the standard SIR model. We show that the effect of the initial conditions is limited to the onset of the epidemic. We derive effective infection curves for a number of cases, including a back-and-forth commute between regions of low and high R0 environments. We then consider spatially distributed systems. We show that diffusion leads to traveling waves, which in 1-D geometries propagate at a constant speed and with a constant shape, both of which are sole functions of R0. The infection curves are slightly different than for the batch problem, as diffusion mitigates the infection intensity, thus leading to an effective lower R0.
The dimensional wave speed is found to be proportional to the product of the square root of the diffusivity and of an increasing function of R0, confirming the importance of restricting mobility in arresting the propagation of infection. We examine the interaction of infection waves under various conditions and scenarios, and extend the wave propagation analysis to 2-D heterogeneous systems.

Submitted 4 December, 2020; v1 submitted 28 August, 2020; originally announced August 2020.

MSC Class: 35Q92

arXiv:2003.12597

doi 10.13140/RG.2.2.28806.32322

GAN-based Priors for Quantifying Uncertainty

Authors: Dhruv V. Patel, Assad A. Oberai

Abstract: Bayesian inference is used extensively to quantify the uncertainty in an inferred field given the measurement of a related field when the two are linked by a mathematical model. Despite its many applications, Bayesian inference faces challenges when inferring fields that have discrete representations of large dimension, and/or have prior distributions that are difficult to characterize mathematically. In this work we demonstrate how the approximate distribution learned by a deep generative adversarial network (GAN) may be used as a prior in a Bayesian update to address both these challenges. We demonstrate the efficacy of this approach on two distinct, and remarkably broad, classes of problems. The first class leads to supervised learning algorithms for image classification with superior out of distribution detection and accuracy, and for image inpainting with built-in variance estimation. The second class leads to unsupervised learning algorithms for image denoising and for solving physics-driven inverse problems.

Submitted 27 March, 2020; originally announced March 2020.

arXiv:1909.06389

Spectral Analysis Of Weighted Laplacians Arising In Data Clustering

Authors: Franca Hoffmann, Bamdad Hosseini, Assad A. Oberai, Andrew M. Stuart

Abstract: Graph Laplacians computed from weighted adjacency matrices are widely used to identify geometric structure in data, and clusters in particular; their spectral properties play a central role in a number of unsupervised and semi-supervised learning algorithms. When suitably scaled, graph Laplacians approach limiting continuum operators in the large data limit. Studying these limiting operators, therefore, sheds light on learning algorithms. This paper is devoted to the study of a parameterized family of divergence form elliptic operators that arise as the large data limit of graph Laplacians. The link between a three-parameter family of graph Laplacians and a three-parameter family of differential operators is explained. The spectral properties of these differential operators are analyzed in the situation where the data comprises two nearly separated clusters, in a sense which is made precise. In particular, we investigate how the spectral gap depends on the three parameters entering the graph Laplacian, and on a parameter measuring the size of the perturbation from the perfectly clustered case. Numerical results are presented which exemplify and extend the analysis: the computations study situations in which there are two nearly separated clusters, but which violate the assumptions used in our theory; situations in which more than two clusters are present, also going beyond our theory; and situations which demonstrate the relevance of our studies of differential operators for the understanding of finite data problems via the graph Laplacian.
The findings provide insight into parameter choices made in learning algorithms which are based on weighted adjacency matrices; they also provide the basis for analysis of the consistency of various unsupervised and semi-supervised learning algorithms, in the large data limit.

Submitted 13 July, 2020; v1 submitted 13 September, 2019; originally announced September 2019.

MSC Class: 47A75; 62H30; 68T10; 35B20; 05C50

arXiv:1907.09987

Bayesian Inference with Generative Adversarial Network Priors

Authors: Dhruv Patel, Assad A Oberai

Abstract: Bayesian inference is used extensively to infer and to quantify the uncertainty in a field of interest from a measurement of a related field when the two are linked by a physical model. Despite its many applications, Bayesian inference faces challenges when inferring fields that have discrete representations of large dimension, and/or have prior distributions that are difficult to represent mathematically. In this manuscript we consider the use of Generative Adversarial Networks (GANs) in addressing these challenges. A GAN is a type of deep neural network equipped with the ability to learn the distribution implied by multiple samples of a given field. Once trained on these samples, the generator component of a GAN maps the iid components of a low-dimensional latent vector to an approximation of the distribution of the field of interest. In this work we demonstrate how this approximate distribution may be used as a prior in a Bayesian update, and how it addresses the challenges associated with characterizing complex prior distributions and the large dimension of the inferred field. We demonstrate the efficacy of this approach by applying it to the problem of inferring and quantifying uncertainty in the initial temperature field in a heat conduction problem from a noisy measurement of the temperature at a later time.

Submitted 22 July, 2019; originally announced July 2019.

arXiv:1506.04765

Recovering vector displacement estimates in quasistatic elastography using sparse relaxation of the momentum equation

Authors: Olalekan A. Babaniyi, Assad A. Oberai, Paul E. Barbone

Abstract: We consider the problem of estimating the $2D$ vector displacement field in a heterogeneous elastic solid deforming under plane stress conditions. The problem is motivated by applications in quasistatic elastography. From precise and accurate measurements of one component of the $2D$ vector displacement field and very limited information of the second component, the method reconstructs the second component quite accurately. No a priori knowledge of the heterogeneous distribution of material properties is required. This method relies on using a special form of the momentum equations to filter ultrasound displacement measurements to produce more precise estimates. We verify the method with applications to simulated displacement data. We validate the method with applications to displacement data measured from a tissue mimicking phantom, and in-vivo data; significant improvements are noticed in the filtered displacements recovered from all the tests. In verification studies, error in lateral displacement estimates decreased from about $50\%$ to about $2\%$, and strain error decreased from more than $250\%$ to below $2\%$.

Submitted 11 June, 2015; originally announced June 2015.

Comments: 38 pages, 26 figures

arXiv:1412.1055

doi 10.1016/j.jcp.2015.04.035

A new class of finite element variational multiscale turbulence models for incompressible magnetohydrodynamics

Authors: David Sondak, John N. Shadid, Assad A. Oberai, Roger P. Pawlowski, Eric C. Cyr, Tom M. Smith

Abstract: New large eddy simulation (LES) turbulence models for incompressible magnetohydrodynamics (MHD) derived from the variational multiscale (VMS) formulation for finite element simulations are introduced. The new models include the variational multiscale formulation, a residual-based eddy viscosity model, and a mixed model that combines both of these component models. Each model contains terms that are proportional to the residual of the incompressible MHD equations and is therefore numerically consistent. Moreover, each model is also dynamic, in that its effect vanishes when this residual is small. The new models are tested on the decaying MHD Taylor Green vortex at low and high Reynolds numbers. The evaluation of the models is based on comparisons with available data from direct numerical simulations (DNS) of the time evolution of energies as well as energy spectra at various discrete times. A numerical study, on a sequence of meshes, is presented that demonstrates that the large eddy simulation approaches the DNS solution for these quantities with spatial mesh refinement.

Submitted 2 December, 2014; originally announced December 2014.

Comments: 30 pages, 14 figures, submitted to the Journal of Computational Physics

MSC Class: 76F65 (Primary) 76W05 (Secondary)

Showing 1–15 of 15 results for author: Oberai, A A