
Tackling the Curse of Dimensionality in Fractional and Tempered Fractional PDEs with Physics-Informed Neural Networks

Authors: Zheyuan Hu, Kenji Kawaguchi, Zhongqiang Zhang, George Em Karniadakis

Abstract: Fractional and tempered fractional partial differential equations (PDEs) are effective models of long-range interactions, anomalous diffusion, and non-local effects. Traditional numerical methods for these problems are mesh-based, thus struggling with the curse of dimensionality (CoD). Physics-informed neural networks (PINNs) offer a promising solution due to their universal approximation, generalization ability, and mesh-free training. In principle, Monte Carlo fractional PINN (MC-fPINN) estimates fractional derivatives using Monte Carlo methods and thus could lift CoD. However, this may cause significant variance and errors, hence affecting convergence; in addition, MC-fPINN is sensitive to hyperparameters. In general, numerical methods and specifically PINNs for tempered fractional PDEs are under-developed. Herein, we extend MC-fPINN to tempered fractional PDEs to address these issues, resulting in the Monte Carlo tempered fractional PINN (MC-tfPINN). To reduce possible high variance and errors from Monte Carlo sampling, we replace the one-dimensional (1D) Monte Carlo with 1D Gaussian quadrature, applicable to both MC-fPINN and MC-tfPINN. We validate our methods on various forward and inverse problems of fractional and tempered fractional PDEs, scaling up to 100,000 dimensions. Our improved MC-fPINN/MC-tfPINN using quadrature consistently outperforms the original versions in accuracy and convergence speed in very high dimensions.

Submitted 17 June, 2024; originally announced June 2024.

Comments: 15 pages

ACM Class: F.2.2; I.2.7
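
The abstract above says that a one-dimensional Monte Carlo estimate is replaced by one-dimensional Gaussian quadrature to cut variance. The following is a minimal sketch of that general idea only, not the paper's estimator: the integrand, the interval [0, 1], and the node count are assumptions chosen so that the exact value is known in closed form.

```python
import numpy as np

def integrand(r):
    # Hypothetical smooth 1D integrand; it stands in for a radial integrand.
    return np.exp(-r) * np.cos(r)

# Closed-form value of the integral of exp(-r) * cos(r) over [0, 1].
exact = 0.5 * (np.exp(-1.0) * (np.sin(1.0) - np.cos(1.0)) + 1.0)

def monte_carlo(n, rng):
    # Plain Monte Carlo: average the integrand at n uniform samples on [0, 1].
    r = rng.uniform(0.0, 1.0, size=n)
    return integrand(r).mean()

def gauss_legendre(n):
    # Gauss-Legendre nodes live on [-1, 1]; map them affinely to [0, 1].
    nodes, weights = np.polynomial.legendre.leggauss(n)
    r = 0.5 * (nodes + 1.0)
    return 0.5 * np.sum(weights * integrand(r))

rng = np.random.default_rng(0)
mc_error = abs(monte_carlo(16, rng) - exact)
gl_error = abs(gauss_legendre(16) - exact)
print(f"Monte Carlo error:    {mc_error:.3e}")
print(f"Gauss-Legendre error: {gl_error:.3e}")
```

With the same number of function evaluations, the quadrature rule is deterministic and converges very quickly for smooth integrands, while the Monte Carlo error decays only like the inverse square root of the sample count.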

arXiv:2406.11676 [pdf, other]

Score-fPINN: Fractional Score-Based Physics-Informed Neural Networks for High-Dimensional Fokker-Planck-Levy Equations

Authors: Zheyuan Hu, Zhongqiang Zhang, George Em Karniadakis, Kenji Kawaguchi

Abstract: We introduce an innovative approach for solving high-dimensional Fokker-Planck-Lévy (FPL) equations in modeling non-Brownian processes across disciplines such as physics, finance, and ecology. We utilize a fractional score function and physics-informed neural networks (PINNs) to lift the curse of dimensionality (CoD) and alleviate numerical overflow from exponentially decaying solutions with dimensions. The introduction of a fractional score function allows us to transform the FPL equation into a second-order partial differential equation without fractional Laplacian and thus can be readily solved with standard physics-informed neural networks (PINNs). We propose two methods to obtain a fractional score function: fractional score matching (FSM) and score-fPINN for fitting the fractional score function. While FSM is more cost-effective, it relies on known conditional distributions. On the other hand, score-fPINN is independent of specific stochastic differential equations (SDEs) but requires evaluating the PINN model's derivatives, which may be more costly. We conduct our experiments on various SDEs and demonstrate numerical stability and effectiveness of our method in dealing with high-dimensional problems, marking a significant advancement in addressing the CoD in FPL equations.

Submitted 17 June, 2024; originally announced June 2024.

Comments: 16 pages, 1 figure

ACM Class: F.2.2; I.2.7

arXiv:2406.10997 [pdf, other]

Two-level overlapping additive Schwarz preconditioner for training scientific machine learning applications

Authors: Youngkyu Lee, Alena Kopaničáková, George Em Karniadakis

Abstract: We introduce a novel two-level overlapping additive Schwarz preconditioner for accelerating the training of scientific machine learning applications. The design of the proposed preconditioner is motivated by the nonlinear two-level overlapping additive Schwarz preconditioner. The neural network parameters are decomposed into groups (subdomains) with overlapping regions. In addition, the network's feed-forward structure is indirectly imposed through a novel subdomain-wise synchronization strategy and a coarse-level training step. Through a series of numerical experiments, which consider physics-informed neural networks and operator learning approaches, we demonstrate that the proposed two-level preconditioner significantly speeds up the convergence of the standard (LBFGS) optimizer while also yielding more accurate machine learning models. Moreover, the devised preconditioner is designed to take advantage of model-parallel computations, which can further reduce the training time.

Submitted 16 June, 2024; originally announced June 2024.

Comments: 24 pages, 9 figures

MSC Class: 90C30; 90C26; 90C06; 65M55; 68T07

arXiv:2406.02917 [pdf, other]

A comprehensive and FAIR comparison between MLP and KAN representations for differential equations and operator networks

Authors: Khemraj Shukla, Juan Diego Toscano, Zhicheng Wang, Zongren Zou, George Em Karniadakis

Abstract: Kolmogorov-Arnold Networks (KANs) were recently introduced as an alternative representation model to MLP. Herein, we employ KANs to construct physics-informed machine learning models (PIKANs) and deep operator models (DeepOKANs) for solving differential equations for forward and inverse problems. In particular, we compare them with physics-informed neural networks (PINNs) and deep operator networks (DeepONets), which are based on the standard MLP representation. We find that although the original KANs based on the B-splines parameterization lack accuracy and efficiency, modified versions based on low-order orthogonal polynomials have comparable performance to PINNs and DeepONet although they still lack robustness as they may diverge for different random seeds or higher order orthogonal polynomials. We visualize their corresponding loss landscapes and analyze their learning dynamics using information bottleneck theory. Our study follows the FAIR principles so that other researchers can use our benchmarks to further advance this emerging topic.

Submitted 5 June, 2024; originally announced June 2024.

arXiv:2405.19166 [pdf, other]

Transformers as Neural Operators for Solutions of Differential Equations with Finite Regularity

Authors: Benjamin Shih, Ahmad Peyvan, Zhongqiang Zhang, George Em Karniadakis

Abstract: Neural operator learning models have emerged as very effective surrogates in data-driven methods for partial differential equations (PDEs) across different applications from computational science and engineering. Such operator learning models not only predict particular instances of a physical or biological system in real-time but also forecast classes of solutions corresponding to a distribution of initial and boundary conditions or forcing terms. DeepONet is the first neural operator model and has been tested extensively for a broad class of solutions, including Riemann problems. Transformers have not been used in that capacity, and specifically, they have not been tested for solutions of PDEs with low regularity. In this work, we first establish the theoretical groundwork that transformers possess the universal approximation property as operator learning models. We then apply transformers to forecast solutions of diverse dynamical systems with solutions of finite regularity for a plurality of initial conditions and forcing terms. In particular, we consider three examples: the Izhikevich neuron model, the tempered fractional-order Leaky Integrate-and-Fire (LIF) model, and the one-dimensional Euler equation Riemann problem. For the latter problem, we also compare with variants of DeepONet, and we find that transformers outperform DeepONet in accuracy but they are computationally more expensive.

Submitted 29 May, 2024; originally announced May 2024.

arXiv:2405.12380 [pdf, other]

Large scale scattering using fast solvers based on neural operators

Authors: Zongren Zou, Adar Kahana, Enrui Zhang, Eli Turkel, Rishikesh Ranade, Jay Pathak, George Em Karniadakis

Abstract: We extend a recently proposed machine-learning-based iterative solver, i.e. the hybrid iterative transferable solver (HINTS), to solve the scattering problem described by the Helmholtz equation in an exterior domain with a complex absorbing boundary condition. The HINTS method combines neural operators (NOs) with standard iterative solvers, e.g. Jacobi and Gauss-Seidel (GS), to achieve better performance by leveraging the spectral bias of neural networks. In HINTS, some iterations of the conventional iterative method are replaced by inferences of the pre-trained NO. In this work, we employ HINTS to solve the scattering problem for both 2D and 3D problems, where the standard iterative solver fails. We consider square and triangular scatterers of various sizes in 2D, and a cube and a model submarine in 3D. We explore and illustrate the extrapolation capability of HINTS in handling diverse geometries of the scatterer, which is achieved by training the NO on non-scattering scenarios and then deploying it in HINTS to solve scattering problems. The accurate results demonstrate that the NO in HINTS method remains effective without retraining or fine-tuning it whenever a new scatterer is given. Taken together, our results highlight the adaptability and versatility of the extended HINTS methodology in addressing diverse scattering problems.

Submitted 20 May, 2024; originally announced May 2024.

arXiv:2405.00217 [pdf, other]

GMC-PINNs: A new general Monte Carlo PINNs method for solving fractional partial differential equations on irregular domains

Authors: Shupeng Wang, George Em Karniadakis

Abstract: Physics-Informed Neural Networks (PINNs) have been widely used for solving partial differential equations (PDEs) of different types, including fractional PDEs (fPDEs) [29]. Herein, we propose a new general (quasi) Monte Carlo PINN for solving fPDEs on irregular domains. Specifically, instead of approximating fractional derivatives by Monte Carlo approximations of integrals as was done previously in [31], we use a more general Monte Carlo approximation method to solve different fPDEs, which is valid for fractional differentiation under any definition. Moreover, based on the ensemble probability density function, the generated nodes are all located in denser regions near the target point where we perform the differentiation. This has an unexpected connection with known finite difference methods on non-equidistant or nested grids, and hence our method inherits their advantages. At the same time, the generated nodes exhibit a block-like dense distribution, leading to a good computational efficiency of this approach. We present the framework for using this algorithm and apply it to several examples. Our results demonstrate the effectiveness of GMC-PINNs in dealing with irregular domain problems and show a higher computational efficiency compared to the original fPINN method. We also include comparisons with the Monte Carlo fPINN [31].
Finally, we use examples to demonstrate the effectiveness of the method in dealing with fuzzy boundary location problems, and then use the method to solve the coupled 3D fractional Bloch-Torrey equation defined in the ventricular domain of the human brain, and compare the results with classical numerical methods.

Submitted 30 April, 2024; originally announced May 2024.

arXiv:2404.08809 [pdf, other]

Leveraging viscous Hamilton-Jacobi PDEs for uncertainty quantification in scientific machine learning

Authors: Zongren Zou, Tingwei Meng, Paula Chen, Jérôme Darbon, George Em Karniadakis

Abstract: Uncertainty quantification (UQ) in scientific machine learning (SciML) combines the powerful predictive power of SciML with methods for quantifying the reliability of the learned models. However, two major challenges remain: limited interpretability and expensive training procedures. We provide a new interpretation for UQ problems by establishing a new theoretical connection between some Bayesian inference problems arising in SciML and viscous Hamilton-Jacobi partial differential equations (HJ PDEs). Namely, we show that the posterior mean and covariance can be recovered from the spatial gradient and Hessian of the solution to a viscous HJ PDE. As a first exploration of this connection, we specialize to Bayesian inference problems with linear models, Gaussian likelihoods, and Gaussian priors. In this case, the associated viscous HJ PDEs can be solved using Riccati ODEs, and we develop a new Riccati-based methodology that provides computational advantages when continuously updating the model predictions. Specifically, our Riccati-based approach can efficiently add or remove data points to the training set invariant to the order of the data and continuously tune hyperparameters. Moreover, neither update requires retraining on or access to previously incorporated data. We provide several examples from SciML involving noisy data and epistemic uncertainty to illustrate the potential advantages of our approach.
In particular, this approach's amenability to data streaming applications demonstrates its potential for real-time inferences, which, in turn, allows for applications in which the predicted uncertainty is used to dynamically alter the learning process.

Submitted 12 April, 2024; originally announced April 2024.

MSC Class: 35F21; 62F15; 65L99; 65N99; 68T05; 35B37
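
The setting named in the abstract above (linear model, Gaussian likelihood, Gaussian prior) admits order-invariant addition and removal of data points. The sketch below illustrates that property with the textbook conjugate update in information (precision) form. It is an assumption-laden illustration and not the Riccati-based methodology of the paper; the class name, the zero-mean isotropic prior, and all parameter values are hypothetical.

```python
import numpy as np

class GaussianLinearPosterior:
    """Posterior over weights w for y = phi @ w + noise (hypothetical helper)."""

    def __init__(self, dim, prior_var=1.0, noise_var=0.01):
        self.noise_var = noise_var
        self.precision = np.eye(dim) / prior_var  # inverse posterior covariance
        self.shift = np.zeros(dim)                # precision times posterior mean

    def add(self, phi, y):
        # Each data point contributes one additive term, so order cannot matter.
        self.precision += np.outer(phi, phi) / self.noise_var
        self.shift += phi * y / self.noise_var

    def remove(self, phi, y):
        # Removal subtracts the same term; only the removed point is needed.
        self.precision -= np.outer(phi, phi) / self.noise_var
        self.shift -= phi * y / self.noise_var

    def mean(self):
        return np.linalg.solve(self.precision, self.shift)

    def covariance(self):
        return np.linalg.inv(self.precision)

rng = np.random.default_rng(0)
features = rng.normal(size=(20, 3))
targets = features @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=20)

forward = GaussianLinearPosterior(dim=3)
for phi, y in zip(features, targets):
    forward.add(phi, y)

backward = GaussianLinearPosterior(dim=3)
for phi, y in zip(features[::-1], targets[::-1]):
    backward.add(phi, y)

mean_forward = forward.mean()
mean_backward = backward.mean()

# Remove the last point without touching the other nineteen.
forward.remove(features[-1], targets[-1])
reference = GaussianLinearPosterior(dim=3)
for phi, y in zip(features[:-1], targets[:-1]):
    reference.add(phi, y)
```

Here `mean_forward` and `mean_backward` agree up to floating-point rounding, and `forward` after the removal matches `reference`, which never saw the removed point.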

arXiv:2404.05615 [pdf, other]

Tensor neural networks for high-dimensional Fokker-Planck equations

Authors: Taorui Wang, Zheyuan Hu, Kenji Kawaguchi, Zhongqiang Zhang, George Em Karniadakis

Abstract: We solve high-dimensional steady-state Fokker-Planck equations on the whole space by applying tensor neural networks. The tensor networks are a tensor product of one-dimensional feedforward networks or a linear combination of several selected radial basis functions. The use of tensor feedforward networks allows us to efficiently exploit auto-differentiation in major Python packages while using radial basis functions can fully avoid auto-differentiation, which is rather expensive in high dimensions. We then use the physics-informed neural networks and stochastic gradient descent methods to learn the tensor networks. One essential step is to determine a proper truncated bounded domain or numerical support for the Fokker-Planck equation. To better train the tensor radial basis function networks, we impose some constraints on parameters, which lead to relatively high accuracy. We demonstrate numerically that the tensor neural networks in physics-informed machine learning are efficient for steady-state Fokker-Planck equations from two to ten dimensions.

Submitted 8 April, 2024; originally announced April 2024.

arXiv:2403.18494 [pdf, other]

Learning in PINNs: Phase transition, total diffusion, and generalization

Authors: Sokratis J. Anagnostopoulos, Juan Diego Toscano, Nikolaos Stergiopulos, George Em Karniadakis

Abstract: We investigate the learning dynamics of fully-connected neural networks through the lens of gradient signal-to-noise ratio (SNR), examining the behavior of first-order optimizers like Adam in non-convex objectives. By interpreting the drift/diffusion phases in the information bottleneck theory, focusing on gradient homogeneity, we identify a third phase termed "total diffusion", characterized by equilibrium in the learning rates and homogeneous gradients. This phase is marked by an abrupt SNR increase, uniform residuals across the sample space and the most rapid training convergence. We propose a residual-based re-weighting scheme to accelerate this diffusion in quadratic loss functions, enhancing generalization. We also explore the information compression phenomenon, pinpointing a significant saturation-induced compression of activations at the total diffusion phase, with deeper layers experiencing negligible information loss. Supported by experimental data on physics-informed neural networks (PINNs), which underscore the importance of gradient homogeneity due to their PDE-based sample inter-dependence, our findings suggest that recognizing phase transitions could refine ML optimization strategies for improved generalization.

Submitted 27 March, 2024; originally announced March 2024.

arXiv:2402.17232 [pdf, other]

Two-scale Neural Networks for Partial Differential Equations with Small Parameters

Authors: Qiao Zhuang, Chris Ziyi Yao, Zhongqiang Zhang, George Em Karniadakis

Abstract: We propose a two-scale neural network method for solving partial differential equations (PDEs) with small parameters using physics-informed neural networks (PINNs). We directly incorporate the small parameters into the architecture of neural networks. The proposed method enables solving PDEs with small parameters in a simple fashion, without adding Fourier features or other computationally taxing searches of truncation parameters. Various numerical examples demonstrate reasonable accuracy in capturing features of large derivatives in the solutions caused by small parameters.

Submitted 27 February, 2024; originally announced February 2024.

MSC Class: 65N35; 35B25 ACM Class: I.2.6

arXiv:2402.07465 [pdf, other]

Score-Based Physics-Informed Neural Networks for High-Dimensional Fokker-Planck Equations

Authors: Zheyuan Hu, Zhongqiang Zhang, George Em Karniadakis, Kenji Kawaguchi

Abstract: The Fokker-Planck (FP) equation is a foundational PDE in stochastic processes. However, the curse of dimensionality (CoD) poses a challenge when dealing with high-dimensional FP PDEs. Although Monte Carlo and vanilla Physics-Informed Neural Networks (PINNs) have shown the potential to tackle CoD, both methods exhibit numerical errors in high dimensions when dealing with the probability density function (PDF) associated with Brownian motion. The point-wise PDF values tend to decrease exponentially as dimension increases, surpassing the precision of numerical simulations and resulting in substantial errors. Moreover, due to its massive sampling, Monte Carlo fails to offer fast sampling. Modeling the logarithm likelihood (LL) via vanilla PINNs transforms the FP equation into a difficult HJB equation, whose error grows rapidly with dimension. To this end, we propose a novel approach utilizing a score-based solver to fit the score function in SDEs. The score function, defined as the gradient of the LL, plays a fundamental role in inferring LL and PDF and enables fast SDE sampling. Three fitting methods, Score Matching (SM), Sliced SM (SSM), and Score-PINN, are introduced. The proposed score-based SDE solver operates in two stages: first, employing SM, SSM, or Score-PINN to acquire the score; and second, solving the LL via an ODE using the obtained score. Comparative evaluations across these methods showcase varying trade-offs.
The proposed method is evaluated across diverse SDEs, including anisotropic OU processes, geometric Brownian, and Brownian with varying eigenspace. We also test various distributions, including Gaussian, Log-normal, Laplace, and Cauchy. The numerical results demonstrate the score-based SDE solver's stability, speed, and performance across different settings, solidifying its potential as a solution to CoD for high-dimensional FP equations.

Submitted 12 February, 2024; originally announced February 2024.

Comments: 22 pages

MSC Class: 14J60
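
Two claims in the abstract above can be checked on the simplest possible case: point-wise PDF values fall below floating-point precision as the dimension grows, while the log-likelihood and its gradient (the score) stay well scaled. The sketch below assumes a standard Gaussian density, which is a choice made here for illustration; it does not reproduce the score matching or Score-PINN training described in the abstract.

```python
import numpy as np

def gaussian_log_likelihood(x):
    # log p(x) for a standard Gaussian in d = x.shape[-1] dimensions.
    d = x.shape[-1]
    return -0.5 * np.sum(x**2, axis=-1) - 0.5 * d * np.log(2.0 * np.pi)

def gaussian_score(x):
    # Score = gradient of the log-likelihood; for a standard Gaussian it is -x.
    return -x

# In 1000 dimensions the density at the mode is (2*pi)**(-500), far below
# the smallest positive float64, so the PDF evaluates to exactly zero.
x_mode = np.zeros(1000)
log_p = gaussian_log_likelihood(x_mode)
p = np.exp(log_p)
print("log-likelihood:", log_p)
print("density:", p)

# Central finite differences confirm that the score is the gradient of log p.
x = np.array([0.3, -1.2, 0.7])
h = 1e-5
fd_score = np.array([
    (gaussian_log_likelihood(x + h * e) - gaussian_log_likelihood(x - h * e)) / (2 * h)
    for e in np.eye(3)
])
```

The density underflows while the log-likelihood remains an ordinary finite number, which is the motivation the abstract gives for working with the score and the LL rather than the PDF itself.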

arXiv:2401.08886 [pdf, other]

doi 10.1016/j.cma.2024.116996

RiemannONets: Interpretable Neural Operators for Riemann Problems

Authors: Ahmad Peyvan, Vivek Oommen, Ameya D. Jagtap, George Em Karniadakis

Abstract: Developing the proper representations for simulating high-speed flows with strong shock waves, rarefactions, and contact discontinuities has been a long-standing question in numerical analysis. Herein, we employ neural operators to solve Riemann problems encountered in compressible flows for extreme pressure jumps (up to $10^{10}$ pressure ratio). In particular, we first consider the DeepONet that we train in a two-stage process, following the recent work of \cite{lee2023training}, wherein the first stage, a basis is extracted from the trunk net, which is orthonormalized and subsequently is used in the second stage in training the branch net. This simple modification of DeepONet has a profound effect on its accuracy, efficiency, and robustness and leads to very accurate solutions to Riemann problems compared to the vanilla version. It also enables us to interpret the results physically as the hierarchical data-driven produced basis reflects all the flow features that would otherwise be introduced using ad hoc feature expansion layers. We also compare the results with another neural operator based on the U-Net for low, intermediate, and very high-pressure ratios that are very accurate for Riemann problems, especially for large pressure ratios, due to their multiscale nature but computationally more expensive. Overall, our study demonstrates that simple neural network architectures, if properly pre-trained, can achieve very accurate solutions of Riemann problems for real-time forecasting.
The source code, along with its corresponding data, can be found at the following URL: https://github.com/apey236/RiemannONet/tree/main

Submitted 16 April, 2024; v1 submitted 16 January, 2024; originally announced January 2024.

arXiv:2401.02016 [pdf, other]

DeepOnet Based Preconditioning Strategies For Solving Parametric Linear Systems of Equations

Authors: Alena Kopaničáková, George Em Karniadakis

Abstract: We introduce a new class of hybrid preconditioners for solving parametric linear systems of equations. The proposed preconditioners are constructed by hybridizing the deep operator network, namely DeepONet, with standard iterative methods. Exploiting the spectral bias, DeepONet-based components are harnessed to address low-frequency error components, while conventional iterative methods are employed to mitigate high-frequency error components. Our preconditioning framework comprises two distinct hybridization approaches: direct preconditioning (DP) and trunk basis (TB) approaches. In the DP approach, DeepONet is used to approximate an action of an inverse operator to a vector during each preconditioning step. In contrast, the TB approach extracts basis functions from the trained DeepONet to construct a map to a smaller subspace, in which the low-frequency component of the error can be effectively eliminated. Our numerical results demonstrate that utilizing the TB approach enhances the convergence of Krylov methods by a large margin compared to standard non-hybrid preconditioning strategies. Moreover, the proposed hybrid preconditioners exhibit robustness across a wide range of model parameters and problem resolutions.

Submitted 9 January, 2024; v1 submitted 3 January, 2024; originally announced January 2024.

Comments: 35 pages

arXiv:2401.00369 [pdf, other]

Analysis of biologically plausible neuron models for regression with spiking neural networks

Authors: Mario De Florio, Adar Kahana, George Em Karniadakis

Abstract: This paper explores the impact of biologically plausible neuron models on the performance of Spiking Neural Networks (SNNs) for regression tasks. While SNNs are widely recognized for classification tasks, their application to Scientific Machine Learning and regression remains underexplored. We focus on the membrane component of SNNs, comparing four neuron models: Leaky Integrate-and-Fire, FitzHugh-Nagumo, Izhikevich, and Hodgkin-Huxley. We investigate their effect on SNN accuracy and efficiency for function regression tasks, by using Euler and Runge-Kutta 4th-order approximation schemes. We show how more biologically plausible neuron models improve the accuracy of SNNs while reducing the number of spikes in the system. The latter represents an energetic gain on actual neuromorphic chips since it directly reflects the amount of energy required for the computations.

Submitted 30 December, 2023; originally announced January 2024.
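
Of the four neuron models named in the abstract above, the Leaky Integrate-and-Fire model is the simplest to step with the explicit Euler scheme. The sketch below uses the textbook form tau * dV/dt = -(V - V_rest) + R * I with a threshold and reset. The abstract states neither the equations nor any parameter values, so every default here is a hypothetical choice, and the function name is illustrative.

```python
import numpy as np

def simulate_lif(input_current, dt=0.1, tau=10.0, v_rest=0.0,
                 v_th=1.0, v_reset=0.0, resistance=1.0):
    """Explicit-Euler simulation of one leaky integrate-and-fire neuron."""
    v = v_rest
    voltages = np.empty(len(input_current))
    spike_steps = []
    for k, current in enumerate(input_current):
        # Euler step of tau * dV/dt = -(V - V_rest) + R * I.
        dv = (-(v - v_rest) + resistance * current) / tau
        v = v + dt * dv
        if v >= v_th:
            # Threshold crossing: record the spike and reset the membrane.
            spike_steps.append(k)
            v = v_reset
        voltages[k] = v
    return voltages, spike_steps

steps = 1000
# A constant drive above threshold produces periodic spiking.
_, spikes_driven = simulate_lif(np.full(steps, 1.5))
# With no input the membrane stays at rest and never spikes.
_, spikes_silent = simulate_lif(np.zeros(steps))
print(len(spikes_driven), len(spikes_silent))
```

Counting the entries of `spike_steps` gives the spike count that the abstract relates to energy use on neuromorphic hardware.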

arXiv:2401.00061 [pdf, other]

Learning thermoacoustic interactions in combustors using a physics-informed neural network

Authors: Sathesh Mariappan, Kamaljyoti Nath, George Em Karniadakis

Abstract: We introduce a physics-informed neural network (PINN) method to study thermoacoustic interactions leading to combustion instability in combustors. Specifically, we employ a PINN to investigate thermoacoustic interactions in a bluff body anchored flame combustor, representative of ramjet and industrial combustors. Vortex shedding and acoustic oscillations appear in such combustors, and their interactions lead to the phenomenon of vortex-acoustic lock-in. Acoustic pressure fluctuations at three locations and the total flame heat release rate serve as the measured data. The coupled parameterized model is based on the acoustic equations and the van der Pol oscillator for vortex shedding. The PINN was applied in the combustor, where the measurements suitable for a future machine learning application were not anticipated at the time of the experiments, as is the case in the vast majority of available data in the literature. We demonstrate a good performance of PINN in generating the acoustic field (pressure and velocity fluctuations) in the entire spatiotemporal domain, along with estimating all the parameters of the model. Therefore, this PINN-based model can potentially serve as an effective tool in improving existing combustors or designing new thermoacoustically stable and structurally efficient combustors.

Submitted 29 December, 2023; originally announced January 2024.

arXiv:2312.14499 [pdf, other]

doi 10.1016/j.cma.2024.116883

Hutchinson Trace Estimation for High-Dimensional and High-Order Physics-Informed Neural Networks

Authors: Zheyuan Hu, Zekun Shi, George Em Karniadakis, Kenji Kawaguchi

Abstract: Physics-Informed Neural Networks (PINNs) have proven effective in solving partial differential equations (PDEs), especially when some data are available by seamlessly blending data and physics. However, extending PINNs to high-dimensional and even high-order PDEs encounters significant challenges due to the computational cost associated with automatic differentiation in the residual loss. Herein, we address the limitations of PINNs in handling high-dimensional and high-order PDEs by introducing Hutchinson Trace Estimation (HTE). Starting with the second-order high-dimensional PDEs ubiquitous in scientific computing, HTE transforms the calculation of the entire Hessian matrix into a Hessian vector product (HVP). This approach alleviates the computational bottleneck via Taylor-mode automatic differentiation and significantly reduces memory consumption from the Hessian matrix to HVP. We further showcase HTE's convergence to the original PINN loss and its unbiased behavior under specific conditions. Comparisons with Stochastic Dimension Gradient Descent (SDGD) highlight the distinct advantages of HTE, particularly in scenarios with significant variance among dimensions. We further extend HTE to higher-order and higher-dimensional PDEs, specifically addressing the biharmonic equation. By employing tensor-vector products (TVP), HTE efficiently computes the colossal tensor associated with the fourth-order high-dimensional biharmonic equation, saving memory and enabling rapid computation.
The effectiveness of HTE is illustrated through experimental setups, demonstrating comparable convergence rates with SDGD under memory and speed constraints. Additionally, HTE proves valuable in accelerating the Gradient-Enhanced PINN (gPINN) version as well as the Biharmonic equation. Overall, HTE opens up a new capability in scientific machine learning for tackling high-order and high-dimensional PDEs.

Submitted 3 March, 2024; v1 submitted 22 December, 2023; originally announced December 2023.

Comments: Published in Computer Methods in Applied Mechanics and Engineering

MSC Class: 14J60

Journal ref: Computer Methods in Applied Mechanics and Engineering, Volume 424, 1 May 2024, 116883
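
Illustrative sketch (not from the paper): the abstract above describes replacing the full Hessian with Hessian-vector products inside a Hutchinson trace estimate. The Python snippet below shows that idea on a toy function, under several assumptions: the names `hvp`, `hutchinson_laplacian`, `grad_u` and `exact_laplacian` are hypothetical, the probes are Rademacher vectors, and the Hessian-vector product uses central differences of a hand-written gradient in place of the Taylor-mode automatic differentiation the abstract mentions.

```python
import random


def hvp(grad, x, v, eps=1e-4):
    """Approximate the Hessian-vector product H(x) v by central differences
    of the gradient (a stand-in for automatic differentiation)."""
    x_plus = [xi + eps * vi for xi, vi in zip(x, v)]
    x_minus = [xi - eps * vi for xi, vi in zip(x, v)]
    g_plus, g_minus = grad(x_plus), grad(x_minus)
    return [(a - b) / (2 * eps) for a, b in zip(g_plus, g_minus)]


def hutchinson_laplacian(grad, x, num_probes, rng):
    """Estimate trace(H(x)), i.e. the Laplacian, as the average of v^T H v
    over random Rademacher probes v. The full Hessian is never formed."""
    total = 0.0
    for _ in range(num_probes):
        v = [rng.choice((-1.0, 1.0)) for _ in x]
        Hv = hvp(grad, x, v)
        total += sum(vi * hi for vi, hi in zip(v, Hv))
    return total / num_probes


def grad_u(x):
    """Gradient of the toy function u(x) = sum_i x_i^4 / 12 + x_0 * x_1."""
    g = [xi ** 3 / 3 for xi in x]
    g[0] += x[1]
    g[1] += x[0]
    return g


def exact_laplacian(x):
    """Exact Laplacian of the toy function: sum_i x_i^2."""
    return sum(xi ** 2 for xi in x)


if __name__ == "__main__":
    rng = random.Random(0)
    point = [0.5 * i for i in range(10)]
    print("estimate:", hutchinson_laplacian(grad_u, point, 2000, rng))
    print("exact:   ", exact_laplacian(point))
```

In this toy case the off-diagonal term x_0 * x_1 is the only source of estimator variance; for a Hessian that is exactly diagonal, a single Rademacher probe already returns the trace up to finite-difference error.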

arXiv:2312.14237 [pdf, other]

AI-Lorenz: A physics-data-driven framework for black-box and gray-box identification of chaotic systems with symbolic regression

Authors: Mario De Florio, Ioannis G. Kevrekidis, George Em Karniadakis

Abstract: Discovering mathematical models that characterize the observed behavior of dynamical systems remains a major challenge, especially for systems in a chaotic regime. The challenge is even greater when the physics underlying such systems is not yet understood, and scientific inquiry must solely rely on empirical data. Driven by the need to fill this gap, we develop a framework that learns mathematical expressions modeling complex dynamical behaviors by identifying differential equations from noisy and sparse observable data. We train a small neural network to learn the dynamics of a system, its rate of change in time, and missing model terms, which are used as input for a symbolic regression algorithm to autonomously distill the explicit mathematical terms. This, in turn, enables us to predict the future evolution of the dynamical behavior. The performance of this framework is validated by recovering the right-hand sides and unknown terms of certain complex, chaotic systems such as the well-known Lorenz system, a six-dimensional hyperchaotic system, and the non-autonomous Sprott chaotic system, and comparing them with their known analytical expressions.

Submitted 21 December, 2023; originally announced December 2023.

Comments: 28 pages, 15 figures, 9 tables

MSC Class: 34A34; 34A55; 70K55 ACM Class: J.2; G.1.7; I.2.0

arXiv:2312.05410 [pdf, other]

Rethinking materials simulations: Blending direct numerical simulations with neural operators

Authors: Vivek Oommen, Khemraj Shukla, Saaketh Desai, Remi Dingreville, George Em Karniadakis

Abstract: Direct numerical simulations (DNS) are accurate but computationally expensive for predicting materials evolution across timescales, due to the complexity of the underlying evolution equations, the nature of multiscale spatio-temporal interactions, and the need to reach long-time integration. We develop a new method that blends numerical solvers with neural operators to accelerate such simulations. This methodology is based on the integration of a community numerical solver with a U-Net neural operator, enhanced by a temporal-conditioning mechanism that enables accurate extrapolation and efficient time-to-solution predictions of the dynamics. We demonstrate the effectiveness of this framework on simulations of microstructure evolution during physical vapor deposition modeled via the phase-field method. Such simulations exhibit high spatial gradients due to the co-evolution of different material phases with simultaneous slow and fast materials dynamics. We establish accurate extrapolation of the coupled solver with up to 16.5$\times$ speed-up compared to DNS. This methodology is generalizable to a broad range of evolutionary models, from solid mechanics, to fluid dynamics, geophysics, climate, and more.

Submitted 8 December, 2023; originally announced December 2023.

arXiv:2312.03769 [pdf, other]

GPT vs Human for Scientific Reviews: A Dual Source Review on Applications of ChatGPT in Science

Authors: Chenxi Wu, Alan John Varghese, Vivek Oommen, George Em Karniadakis

Abstract: The new polymath Large Language Models (LLMs) can greatly speed up scientific reviews, possibly using more unbiased quantitative metrics, facilitating cross-disciplinary connections, and identifying emerging trends and research gaps by analyzing large volumes of data. However, at the present time, they lack the required deep understanding of complex methodologies, they have difficulty in evaluating innovative claims, and they are unable to assess ethical issues and conflicts of interest. Herein, we consider 13 GPT-related papers across different scientific domains, reviewed by a human reviewer and SciSpace, a large language model, with the reviews evaluated by three distinct types of evaluators, namely GPT-3.5, a crowd panel, and GPT-4. We found that 50% of SciSpace's responses to objective questions align with those of a human reviewer, with GPT-4 (informed evaluator) often rating the human reviewer higher in accuracy, and SciSpace higher in structure, clarity, and completeness. In subjective questions, the uninformed evaluators (GPT-3.5 and crowd panel) showed varying preferences between SciSpace and human responses, with the crowd panel showing a preference for the human responses. However, GPT-4 rated them equally in accuracy and structure but favored SciSpace for completeness.

Submitted 5 December, 2023; originally announced December 2023.

arXiv:2312.00919 [pdf, other]

Rethinking Skip Connections in Spiking Neural Networks with Time-To-First-Spike Coding

Authors: Youngeun Kim, Adar Kahana, Ruokai Yin, Yuhang Li, Panos Stinis, George Em Karniadakis, Priyadarshini Panda

Abstract: Time-To-First-Spike (TTFS) coding in Spiking Neural Networks (SNNs) offers significant advantages in terms of energy efficiency, closely mimicking the behavior of biological neurons. In this work, we delve into the role of skip connections, a widely used concept in Artificial Neural Networks (ANNs), within the domain of SNNs with TTFS coding. Our focus is on two distinct types of skip connection architectures: (1) addition-based skip connections, and (2) concatenation-based skip connections. We find that addition-based skip connections introduce an additional delay in terms of spike timing. On the other hand, concatenation-based skip connections circumvent this delay but produce time gaps between after-convolution and skip connection paths, thereby restricting the effective mixing of information from these two paths. To mitigate these issues, we propose a novel approach involving a learnable delay for skip connections in the concatenation-based skip connection architecture. This approach successfully bridges the time gap between the convolutional and skip branches, facilitating improved information mixing. We conduct experiments on public datasets including MNIST and Fashion-MNIST, illustrating the advantage of the skip connection in TTFS coding architectures. Additionally, we demonstrate the applicability of TTFS coding beyond image recognition tasks and extend it to scientific machine-learning tasks, broadening the potential uses of SNNs.

Submitted 1 December, 2023; originally announced December 2023.

arXiv:2311.15283 [pdf, other]

Bias-Variance Trade-off in Physics-Informed Neural Networks with Randomized Smoothing for High-Dimensional PDEs

Authors: Zheyuan Hu, Zhouhao Yang, Yezhen Wang, George Em Karniadakis, Kenji Kawaguchi

Abstract: While physics-informed neural networks (PINNs) have been proven effective for low-dimensional partial differential equations (PDEs), the computational cost remains a hurdle in high-dimensional scenarios. This is particularly pronounced when computing high-order and high-dimensional derivatives in the physics-informed loss. Randomized Smoothing PINN (RS-PINN) introduces Gaussian noise for stochastic smoothing of the original neural net model, enabling Monte Carlo methods for derivative approximation, eliminating the need for costly auto-differentiation. Despite its computational efficiency in high dimensions, RS-PINN introduces biases in both loss and gradients, negatively impacting convergence, especially when coupled with stochastic gradient descent (SGD). We present a comprehensive analysis of biases in RS-PINN, attributing them to the nonlinearity of the Mean Squared Error (MSE) loss and the PDE nonlinearity. We propose tailored bias correction techniques based on the order of PDE nonlinearity. The unbiased RS-PINN allows for a detailed examination of its pros and cons compared to the biased version. Specifically, the biased version has a lower variance and runs faster than the unbiased version, but it is less accurate due to the bias. To optimize the bias-variance trade-off, we combine the two approaches in a hybrid method that balances the rapid convergence of the biased version with the high accuracy of the unbiased version. In addition, we present an enhanced implementation of RS-PINN.
Extensive experiments on diverse high-dimensional PDEs, including Fokker-Planck, HJB, viscous Burgers', Allen-Cahn, and Sine-Gordon equations, illustrate the bias-variance trade-off and highlight the effectiveness of the hybrid RS-PINN. Empirical guidelines are provided for selecting biased, unbiased, or hybrid versions, depending on the dimensionality and nonlinearity of the specific PDE problem.

Submitted 26 November, 2023; originally announced November 2023.

Comments: 21 pages, 5 figures

MSC Class: 14J60
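
Illustrative sketch (not from the paper): the abstract above attributes part of the RS-PINN bias to the nonlinearity of the squared loss. The Python snippet below shows that effect in one dimension, under stated assumptions. The model is f = sin, the smoothed model is u_sigma(x) = E[f(x + sigma * eps)] with eps ~ N(0, 1), and its derivative is estimated by Monte Carlo through the Gaussian identity E[eps * f(x + sigma * eps)] / sigma. Squaring one noisy estimate overestimates the squared derivative by the estimator's variance, while the product of two independent estimates is unbiased. This two-sample product is a standard device used here for illustration; the paper's own correction techniques may differ, and all function names are hypothetical.

```python
import math
import random


def smoothed_grad_estimate(f, x, sigma, num_samples, rng):
    """Monte Carlo estimate of d/dx E[f(x + sigma * eps)], eps ~ N(0, 1),
    using the identity E[eps * f(x + sigma * eps)] / sigma."""
    total = 0.0
    for _ in range(num_samples):
        eps = rng.gauss(0.0, 1.0)
        total += eps * f(x + sigma * eps)
    return total / (sigma * num_samples)


def squared_loss_estimates(f, x, sigma, num_samples, num_trials, rng):
    """Average two estimators of (d/dx u_sigma(x))^2 over many trials."""
    biased = 0.0
    unbiased = 0.0
    for _ in range(num_trials):
        g1 = smoothed_grad_estimate(f, x, sigma, num_samples, rng)
        g2 = smoothed_grad_estimate(f, x, sigma, num_samples, rng)
        biased += g1 * g1    # square of one noisy estimate: bias = Var(g1)
        unbiased += g1 * g2  # product of two independent estimates
    return biased / num_trials, unbiased / num_trials


def exact_squared_grad(x, sigma):
    """Closed form for f = sin, where u_sigma(x) = exp(-sigma^2 / 2) * sin(x),
    so the squared derivative is exp(-sigma^2) * cos(x)^2."""
    return math.exp(-sigma ** 2) * math.cos(x) ** 2


if __name__ == "__main__":
    rng = random.Random(0)
    b, u = squared_loss_estimates(math.sin, 0.3, 0.5, 8, 20000, rng)
    print("biased:  ", b)
    print("unbiased:", u)
    print("exact:   ", exact_squared_grad(0.3, 0.5))
```

The unbiased product costs two independent sample batches per evaluation and has its own variance, which is in line with the trade-off the abstract describes between the faster biased version and the more accurate unbiased one.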

arXiv:2311.13812 [pdf, other]

Mechanical Characterization and Inverse Design of Stochastic Architected Metamaterials Using Neural Operators

Authors: Hanxun **, Enrui Zhang, Boyu Zhang, Sridhar Krishnaswamy, George Em Karniadakis, Horacio D. Espinosa

Abstract: Machine learning (ML) is emerging as a transformative tool for the design of architected materials, offering properties that far surpass those achievable through lab-based trial-and-error methods. However, a major challenge in current inverse design strategies is their reliance on extensive computational and/or experimental datasets, which becomes particularly problematic for designing micro-scale stochastic architected materials that exhibit nonlinear mechanical behaviors. Here, we introduce a new end-to-end scientific ML framework, leveraging deep neural operators (DeepONet), to directly learn the relationship between the complete microstructure and mechanical response of architected metamaterials from sparse but high-quality in situ experimental data. The approach facilitates the inverse design of structures tailored to specific nonlinear mechanical behaviors. Results obtained from spinodal microstructures, printed using two-photon lithography, reveal that the prediction error for mechanical responses is within a range of 5 - 10%. Our work underscores that by employing neural operators with advanced micro-mechanics experimental techniques, the design of complex micro-architected materials with desired properties becomes feasible, even in scenarios constrained by data scarcity. Our work marks a significant advancement in the field of materials-by-design, potentially heralding a new era in the discovery and development of next-generation metamaterials with unparalleled mechanical characteristics derived directly from experimental insights.

Submitted 10 December, 2023; v1 submitted 23 November, 2023; originally announced November 2023.

Comments: 29 pages, 5 figures

arXiv:2311.11262 [pdf, other]

Uncertainty quantification for noisy inputs-outputs in physics-informed neural networks and neural operators

Authors: Zongren Zou, Xuhui Meng, George Em Karniadakis

Abstract: Uncertainty quantification (UQ) in scientific machine learning (SciML) becomes increasingly critical as neural networks (NNs) are being widely adopted in addressing complex problems across various scientific disciplines. Representative SciML models are physics-informed neural networks (PINNs) and neural operators (NOs). While UQ in SciML has been increasingly investigated in recent years, very few works have focused on addressing the uncertainty caused by the noisy inputs, such as spatial-temporal coordinates in PINNs and input functions in NOs. The presence of noise in the inputs of the models can pose significantly more challenges compared to noise in the outputs of the models, primarily due to the inherent nonlinearity of most SciML algorithms. As a result, UQ for noisy inputs becomes a crucial factor for reliable and trustworthy deployment of these models in applications involving physical knowledge. To this end, we introduce a Bayesian approach to quantify uncertainty arising from noisy inputs-outputs in PINNs and NOs. We show that this approach can be seamlessly integrated into PINNs and NOs, when they are employed to encode the physical information. PINNs incorporate physics by including physics-informed terms via automatic differentiation, either in the loss function or the likelihood, and often take as input the spatial-temporal coordinate. Therefore, the present method equips PINNs with the capability to address problems where the observed coordinate is subject to noise.
On the other hand, pretrained NOs are also commonly employed as equation-free surrogates in solving differential equations and Bayesian inverse problems, in which they take functions as inputs. The proposed approach enables them to handle noisy measurements for both input and output functions with UQ.

Submitted 19 November, 2023; originally announced November 2023.

arXiv:2311.07790 [pdf, other]

Leveraging Hamilton-Jacobi PDEs with time-dependent Hamiltonians for continual scientific machine learning

Authors: Paula Chen, Tingwei Meng, Zongren Zou, Jérôme Darbon, George Em Karniadakis

Abstract: We address two major challenges in scientific machine learning (SciML): interpretability and computational efficiency. We increase the interpretability of certain learning processes by establishing a new theoretical connection between optimization problems arising from SciML and a generalized Hopf formula, which represents the viscosity solution to a Hamilton-Jacobi partial differential equation (HJ PDE) with time-dependent Hamiltonian. Namely, we show that when we solve certain regularized learning problems with integral-type losses, we actually solve an optimal control problem and its associated HJ PDE with time-dependent Hamiltonian. This connection allows us to reinterpret incremental updates to learned models as the evolution of an associated HJ PDE and optimal control problem in time, where all of the previous information is intrinsically encoded in the solution to the HJ PDE. As a result, existing HJ PDE solvers and optimal control algorithms can be reused to design new efficient training approaches for SciML that naturally coincide with the continual learning framework, while avoiding catastrophic forgetting. As a first exploration of this connection, we consider the special case of linear regression and leverage our connection to develop a new Riccati-based methodology for solving these learning problems that is amenable to continual learning applications. We also provide some corresponding numerical examples that demonstrate the potential computational and memory advantages our Riccati-based approach can provide.

Submitted 6 May, 2024; v1 submitted 13 November, 2023; originally announced November 2023.
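
Illustrative sketch (not from the paper): the abstract above describes incremental model updates for regularized linear regression in which earlier data never need to be revisited. The Python snippet below shows a familiar classical analogue, recursive least squares for ridge regression, which is swapped in here for illustration and is not the paper's Riccati-based method. The matrix `P` plays the role of the inverse regularized Gram matrix and is updated one sample at a time with a rank-one (Sherman-Morrison) correction. All function names are hypothetical, and `ridge_batch_2d` is a two-feature closed form included only as a check.

```python
import random


def rls_init(dim, lam):
    """Start from zero weights and P = I / lam (ridge penalty lam * |w|^2)."""
    w = [0.0] * dim
    P = [[(1.0 / lam if i == j else 0.0) for j in range(dim)] for i in range(dim)]
    return w, P


def rls_update(w, P, x, y):
    """Fold one sample (x, y) into the model without storing past data."""
    dim = len(w)
    Px = [sum(P[i][j] * x[j] for j in range(dim)) for i in range(dim)]
    denom = 1.0 + sum(x[i] * Px[i] for i in range(dim))
    gain = [p / denom for p in Px]
    err = y - sum(w[i] * x[i] for i in range(dim))
    w_new = [w[i] + gain[i] * err for i in range(dim)]
    # Rank-one update of P; P is symmetric, so (P x)^T equals x^T P.
    P_new = [[P[i][j] - gain[i] * Px[j] for j in range(dim)] for i in range(dim)]
    return w_new, P_new


def ridge_batch_2d(xs, ys, lam):
    """Batch ridge solution for two features via the 2x2 normal equations."""
    a = lam + sum(x[0] * x[0] for x in xs)
    b = sum(x[0] * x[1] for x in xs)
    d = lam + sum(x[1] * x[1] for x in xs)
    r0 = sum(x[0] * y for x, y in zip(xs, ys))
    r1 = sum(x[1] * y for x, y in zip(xs, ys))
    det = a * d - b * b
    return [(d * r0 - b * r1) / det, (a * r1 - b * r0) / det]


if __name__ == "__main__":
    rng = random.Random(1)
    xs = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(50)]
    ys = [2.0 * x[0] - 1.0 * x[1] + rng.gauss(0, 0.1) for x in xs]
    w, P = rls_init(2, 0.01)
    for x, y in zip(xs, ys):
        w, P = rls_update(w, P, x, y)
    print("streaming:", w)
    print("batch:    ", ridge_batch_2d(xs, ys, 0.01))
```

After all samples are processed, the streaming weights agree with the batch ridge solution up to rounding, while memory stays fixed at the size of `w` and `P`.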

arXiv:2310.19590 [pdf, other]

Operator Learning Enhanced Physics-informed Neural Networks for Solving Partial Differential Equations Characterized by Sharp Solutions

Authors: Bin Lin, Zhi** Mao, Zhicheng Wang, George Em Karniadakis

Abstract: Physics-informed Neural Networks (PINNs) have been shown as a promising approach for solving both forward and inverse problems of partial differential equations (PDEs). Meanwhile, the neural operator approach, including methods such as Deep Operator Network (DeepONet) and Fourier neural operator (FNO), has been introduced and extensively employed in approximating solutions of PDEs. Nevertheless, solving problems with sharp solutions poses a significant challenge when employing these two approaches. To address this issue, we propose in this work a novel framework termed Operator Learning Enhanced Physics-informed Neural Networks (OL-PINN). Initially, we utilize DeepONet to learn the solution operator for a set of smooth problems relevant to the PDEs characterized by sharp solutions. Subsequently, we integrate the pre-trained DeepONet with PINN to resolve the target sharp solution problem. We showcase the efficacy of OL-PINN by successfully addressing various problems, such as the nonlinear diffusion-reaction equation, the Burgers equation and the incompressible Navier-Stokes equation at high Reynolds number. Compared with the vanilla PINN, the proposed method requires only a small number of residual points to achieve a strong generalization capability. Moreover, it substantially enhances accuracy, while also ensuring a robust training process. Furthermore, OL-PINN inherits the advantage of PINN for solving inverse problems.
To this end, we apply the OL-PINN approach for solving problems with only partial boundary conditions, which usually cannot be solved by the classical numerical methods, showing its capacity in solving ill-posed problems and consequently more complex inverse problems.

Submitted 30 October, 2023; originally announced October 2023.

Comments: Preprint submitted to Elsevier

arXiv:2310.10776 [pdf, other]

Correcting model misspecification in physics-informed neural networks (PINNs)

Authors: Zongren Zou, Xuhui Meng, George Em Karniadakis

Abstract: Data-driven discovery of governing equations in computational science has emerged as a new paradigm for obtaining accurate physical models and as a possible alternative to theoretical derivations. The recently developed physics-informed neural networks (PINNs) have also been employed to learn governing equations given data across diverse scientific disciplines. Despite the effectiveness of PINNs for discovering governing equations, the physical models encoded in PINNs may be misspecified in complex systems as some of the physical processes may not be fully understood, leading to the poor accuracy of PINN predictions. In this work, we present a general approach to correct the misspecified physical models in PINNs for discovering governing equations, given some sparse and/or noisy data. Specifically, we first encode the assumed physical models, which may be misspecified, then employ other deep neural networks (DNNs) to model the discrepancy between the imperfect models and the observational data. Due to the expressivity of DNNs, the proposed method is capable of reducing the computational errors caused by the model misspecification and thus enables the applications of PINNs in complex systems where the physical processes are not exactly known. Furthermore, we utilize the Bayesian PINNs (B-PINNs) and/or ensemble PINNs to quantify uncertainties arising from noisy and/or gappy data in the discovered governing equations.
A series of numerical examples including non-Newtonian channel and cavity flows demonstrate that the added DNNs are capable of correcting the model misspecification in PINNs and thus reduce the discrepancy between the physical models and the observational data. We envision that the proposed approach will extend the applications of PINNs for discovering governing equations in problems where the physico-chemical or biological processes are not well understood.

Submitted 16 October, 2023; originally announced October 2023.

arXiv:2310.03001 [pdf, other]

Learning characteristic parameters and dynamics of centrifugal pumps under multi-phase flow using physics-informed neural networks

Authors: Felipe de Castro Teixeira Carvalho, Kamaljyoti Nath, Alberto Luiz Serpa, George Em Karniadakis

Abstract: Electrical submersible pumps (ESP) are the second most used artificial lifting equipment in the oil and gas industry due to their high flow rates and boost pressures. They often have to handle multiphase flows, which usually contain a mixture of hydrocarbons, water, and/or sediments. Given these circumstances, emulsions are commonly formed. An emulsion is a liquid-liquid flow composed of two immiscible fluids whose effective viscosity and density differ from those of each single phase. In this context, accurate modeling of ESP systems is crucial for optimizing oil production and implementing control strategies. However, real-time and direct measurement of fluid and system characteristics is often impractical due to time constraints and economy. Hence, indirect methods are generally considered to estimate the system parameters. In this paper, we formulate a machine learning model based on Physics-Informed Neural Networks (PINNs) to estimate crucial system parameters. In order to study the efficacy of the proposed PINN model, we conduct computational studies using not only simulated but also experimental data for different water-oil ratios. We evaluate the state variable's dynamics and unknown parameters for various combinations when only intake and discharge pressure measurements are available. We also study structural and practical identifiability analyses based on commonly available pressure measurements. The PINN model could reduce the requirement of expensive field laboratory tests used to estimate fluid properties.

Submitted 4 October, 2023; originally announced October 2023.

arXiv:2310.02491 [pdf, other]

DON-LSTM: Multi-Resolution Learning with DeepONets and Long Short-Term Memory Neural Networks

Authors: Katarzyna Michałowska, Somdatta Goswami, George Em Karniadakis, Signe Riemer-Sørensen

Abstract: Deep operator networks (DeepONets, DONs) offer a distinct advantage over traditional neural networks in their ability to be trained on multi-resolution data. This property becomes especially relevant in real-world scenarios where high-resolution measurements are difficult to obtain, while low-resolution data is more readily available. Nevertheless, DeepONets alone often struggle to capture and maintain dependencies over long sequences compared to other state-of-the-art algorithms. We propose a novel architecture, named DON-LSTM, which extends the DeepONet with a long short-term memory network (LSTM). Combining these two architectures, we equip the network with explicit mechanisms to leverage multi-resolution data, as well as capture temporal dependencies in long sequences. We test our method on long-time-evolution modeling of multiple non-linear systems and show that the proposed multi-resolution DON-LSTM achieves significantly lower generalization error and requires fewer high-resolution samples compared to its vanilla counterparts.

Submitted 3 October, 2023; originally announced October 2023.

Comments: 18 pages, 3 figures

arXiv:2310.01433 [pdf, other]

AI-Aristotle: A Physics-Informed framework for Systems Biology Gray-Box Identification

Authors: Nazanin Ahmadi Daryakenari, Mario De Florio, Khemraj Shukla, George Em Karniadakis

Abstract: Discovering mathematical equations that govern physical and biological systems from observed data is a fundamental challenge in scientific research. We present a new physics-informed framework for parameter estimation and missing physics identification (gray-box) in the field of Systems Biology. The proposed framework -- named AI-Aristotle -- combines eXtreme Theory of Functional Connections (X-TFC) domain-decomposition and Physics-Informed Neural Networks (PINNs) with symbolic regression (SR) techniques for parameter discovery and gray-box identification. We test the accuracy, speed, flexibility and robustness of AI-Aristotle based on two benchmark problems in Systems Biology: a pharmacokinetics drug absorption model, and an ultradian endocrine model for glucose-insulin interactions. We compare the two machine learning methods (X-TFC and PINNs), and moreover, we employ two different symbolic regression techniques to cross-verify our results. While the current work focuses on the performance of AI-Aristotle based on synthetic data, it can equally handle noisy experimental data and can even be used for black-box identification in just a few minutes on a laptop. More broadly, our work provides insights into the accuracy, cost, scalability, and robustness of integrating neural networks with symbolic regressors, offering a comprehensive guide for researchers tackling gray-box identification challenges in complex dynamical systems in biomedicine and beyond.

Submitted 29 September, 2023; originally announced October 2023.

MSC Class: 37N25 (Primary); 34-04 (Secondary) ACM Class: G.1.7; I.2.0

arXiv:2309.06010 [pdf, other]

Solution multiplicity and effects of data and eddy viscosity on Navier-Stokes solutions inferred by physics-informed neural networks

Authors: Zhicheng Wang, Xuhui Meng, Xiaomo Jiang, Hui Xiang, George Em Karniadakis

Abstract: Physics-informed neural networks (PINNs) have emerged as a new simulation paradigm for fluid flows and are especially effective for inverse and hybrid problems. However, vanilla PINNs often fail in forward problems, especially at high Reynolds (Re) number flows. Herein, we study systematically the classical lid-driven cavity flow at $Re=2,000$, $3,000$ and $5,000$. We observe that vanilla PINNs obtain two classes of solutions, one class that agrees with direct numerical simulations (DNS), and another that is an unstable solution to the Navier-Stokes equations and not physically realizable. We attribute this solution multiplicity to singularities and unbounded vorticity, and we propose regularization methods that restore a unique solution within 1\% difference from the DNS solution. In particular, we introduce a parameterized entropy-viscosity method as artificial eddy viscosity and identify suitable parameters that drive the PINNs solution towards the DNS solution. Furthermore, we solve the inverse problem by subsampling the DNS solution, and identify a new eddy viscosity distribution that leads to velocity and pressure fields almost identical to their DNS counterparts. Surprisingly, a single measurement at a random point suffices to obtain a unique PINNs DNS-like solution even without artificial viscosity, which suggests possible pathways in simulating high Reynolds number turbulent flows using vanilla PINNs.

Submitted 12 September, 2023; originally announced September 2023.

arXiv:2308.16372 [pdf, other]

Artificial to Spiking Neural Networks Conversion for Scientific Machine Learning

Authors: Qian Zhang, Chenxi Wu, Adar Kahana, Youngeun Kim, Yuhang Li, George Em Karniadakis, Priyadarshini Panda

Abstract: We introduce a method to convert Physics-Informed Neural Networks (PINNs), commonly used in scientific machine learning, to Spiking Neural Networks (SNNs), which are expected to have higher energy efficiency compared to traditional Artificial Neural Networks (ANNs). We first extend the calibration technique of SNNs to arbitrary activation functions beyond ReLU, making it more versatile, and we prove a theorem that ensures the effectiveness of the calibration. We successfully convert PINNs to SNNs, enabling computational efficiency for diverse regression tasks in solving multiple differential equations, including the unsteady Navier-Stokes equations. We demonstrate great gains in terms of overall efficiency, including Separable PINNs (SPINNs), which accelerate the training process. Overall, this is the first work of this kind and the proposed method achieves relatively good accuracy with low spike rates.

Submitted 30 August, 2023; originally announced August 2023.

arXiv:2308.05141 [pdf, other]

Sound propagation in realistic interactive 3D scenes with parameterized sources using deep neural operators

Authors: Nikolas Borrel-Jensen, Somdatta Goswami, Allan P. Engsig-Karup, George Em Karniadakis, Cheol-Ho Jeong

Abstract: We address the challenge of sound propagation simulations in 3D virtual rooms with moving sources, which have applications in virtual/augmented reality, game audio, and spatial computing. Solutions to the wave equation can describe wave phenomena such as diffraction and interference. However, simulating them using conventional numerical discretization methods with hundreds of source and receiver positions is intractable, making simulating a sound field with moving sources impractical. To overcome this limitation, we propose using deep operator networks to approximate linear wave-equation operators. This enables the rapid prediction of sound propagation in realistic 3D acoustic scenes with moving sources, achieving millisecond-scale computations. By learning a compact surrogate model, we avoid the offline calculation and storage of impulse responses for all relevant source/listener pairs. Our experiments, including various complex scene geometries, show good agreement with reference solutions, with root mean squared errors ranging from 0.02 Pa to 0.10 Pa. Notably, our method signifies a paradigm shift as no prior machine learning approach has achieved precise predictions of complete wave fields within realistic domains. We anticipate that our findings will drive further exploration of deep neural operator methods, advancing research in immersive user experiences within virtual environments.

Submitted 13 January, 2024; v1 submitted 9 August, 2023; originally announced August 2023.

Comments: 25 pages, 10 figures, 4 tables

arXiv:2307.12306 [pdf, other]

doi 10.1016/j.neunet.2024.106369

Tackling the Curse of Dimensionality with Physics-Informed Neural Networks

Authors: Zheyuan Hu, Khemraj Shukla, George Em Karniadakis, Kenji Kawaguchi

Abstract: The curse-of-dimensionality taxes computational resources heavily with exponentially increasing computational cost as the dimension increases. This poses great challenges in solving high-dimensional PDEs, as Richard E. Bellman first pointed out over 60 years ago. While there has been some recent success in solving numerically partial differential equations (PDEs) in high dimensions, such computations are prohibitively expensive, and true scaling of general nonlinear PDEs to high dimensions has never been achieved. We develop a new method of scaling up physics-informed neural networks (PINNs) to solve arbitrary high-dimensional PDEs. The new method, called Stochastic Dimension Gradient Descent (SDGD), decomposes a gradient of PDEs into pieces corresponding to different dimensions and randomly samples a subset of these dimensional pieces in each iteration of training PINNs. We prove theoretically the convergence and other desired properties of the proposed method. We demonstrate in various diverse tests that the proposed method can solve many notoriously hard high-dimensional PDEs, including the Hamilton-Jacobi-Bellman (HJB) and the Schrödinger equations in tens of thousands of dimensions very fast on a single GPU using the PINNs mesh-free approach. Notably, we solve nonlinear PDEs with nontrivial, anisotropic, and inseparable solutions in 100,000 effective dimensions in 12 hours on a single GPU using SDGD with PINNs. Since SDGD is a general training methodology of PINNs, it can be applied to any current and future variants of PINNs to scale them up for arbitrary high-dimensional PDEs.

Submitted 17 May, 2024; v1 submitted 23 July, 2023; originally announced July 2023.

Comments: Accepted by Neural Networks. Code is available at https://github.com/zheyuanhu01/SDGD_PINN

MSC Class: 14J60 ACM Class: F.2.2; I.2.7

Journal ref: Neural Networks, Volume 176, 2024, 106369, ISSN 0893-6080

arXiv:2307.09142 [pdf, other]

Characterization of partial wetting by CMAS droplets using multiphase many-body dissipative particle dynamics and data-driven discovery based on PINNs

Authors: Elham Kiyani, Mahdi Kooshkbaghi, Khemraj Shukla, Rahul Babu Koneru, Zhen Li, Luis Bravo, Anindya Ghoshal, George Em Karniadakis, Mikko Karttunen

Abstract: The molten sand, a mixture of calcia, magnesia, alumina, and silicate, known as CMAS, is characterized by its high viscosity, density, and surface tension. The unique properties of CMAS make it a challenging material to deal with in high-temperature applications, requiring innovative solutions and materials to prevent its buildup and damage to critical equipment. Here, we use multiphase many-body dissipative particle dynamics (mDPD) simulations to study the wetting dynamics of highly viscous molten CMAS droplets. The simulations are performed in three dimensions, with varying initial droplet sizes and equilibrium contact angles. We propose a coarse parametric ordinary differential equation (ODE) that captures the spreading radius behavior of the CMAS droplets. The ODE parameters are then identified based on the Physics-Informed Neural Network (PINN) framework. Subsequently, the closed-form dependency of the parameter values found by PINN on the initial radii and contact angles is given using symbolic regression. Finally, we employ Bayesian PINNs (B-PINNs) to assess and quantify the uncertainty associated with the discovered parameters. In brief, this study provides insight into spreading dynamics of CMAS droplets by fusing simple parametric ODE modeling and state-of-the-art machine learning techniques.

Submitted 18 July, 2023; originally announced July 2023.

arXiv:2307.09072 [pdf, other]

Real-time Inference and Extrapolation via a Diffusion-inspired Temporal Transformer Operator (DiTTO)

Authors: Oded Ovadia, Vivek Oommen, Adar Kahana, Ahmad Peyvan, Eli Turkel, George Em Karniadakis

Abstract: Extrapolation remains a grand challenge in deep neural networks across all application domains. We propose an operator learning method to solve time-dependent partial differential equations (PDEs) continuously and with extrapolation in time without any temporal discretization. The proposed method, named Diffusion-inspired Temporal Transformer Operator (DiTTO), is inspired by latent diffusion models and their conditioning mechanism, which we use to incorporate the temporal evolution of the PDE, in combination with elements from the transformer architecture to improve its capabilities. Upon training, DiTTO can make inferences in real-time. We demonstrate its extrapolation capability on a climate problem by estimating the temperature around the globe for several years, and also in modeling hypersonic flows around a double-cone. We propose different training strategies involving temporal-bundling and sub-sampling and demonstrate performance improvements for several benchmarks, performing extrapolation for long time intervals as well as zero-shot super-resolution in time.

Submitted 8 December, 2023; v1 submitted 18 July, 2023; originally announced July 2023.

arXiv:2307.08107 [pdf, other]

Discovering a reaction-diffusion model for Alzheimer's disease by combining PINNs with symbolic regression

Authors: Zhen Zhang, Zongren Zou, Ellen Kuhl, George Em Karniadakis

Abstract: Misfolded tau proteins play a critical role in the progression and pathology of Alzheimer's disease. Recent studies suggest that the spatio-temporal pattern of misfolded tau follows a reaction-diffusion type equation. However, the precise mathematical model and parameters that characterize the progression of misfolded protein across the brain remain incompletely understood. Here, we use deep learning and artificial intelligence to discover a mathematical model for the progression of Alzheimer's disease using longitudinal tau positron emission tomography from the Alzheimer's Disease Neuroimaging Initiative database. Specifically, we integrate physics informed neural networks (PINNs) and symbolic regression to discover a reaction-diffusion type partial differential equation for tau protein misfolding and spreading. First, we demonstrate the potential of our model and parameter discovery on synthetic data. Then, we apply our method to discover the best model and parameters to explain tau imaging data from 46 individuals who are likely to develop Alzheimer's disease and 30 healthy controls. Our symbolic regression discovers different misfolding models $f(c)$ for two groups, with a faster misfolding for the Alzheimer's group, $f(c) = 0.23c^3 - 1.34c^2 + 1.11c$, than for the healthy control group, $f(c) = -c^3 +0.62c^2 + 0.39c$. Our results suggest that PINNs, supplemented by symbolic regression, can discover a reaction-diffusion type model to explain misfolded tau protein concentrations in Alzheimer's disease. We expect our study to be the starting point for a more holistic analysis to provide image-based technologies for early diagnosis, and ideally early treatment of neurodegeneration in Alzheimer's disease and possibly other misfolding-protein based neurodegenerative disorders.

Submitted 16 July, 2023; originally announced July 2023.

arXiv:2307.02588 [pdf, other]

TransformerG2G: Adaptive time-stepping for learning temporal graph embeddings using transformers

Authors: Alan John Varghese, Aniruddha Bora, Mengjia Xu, George Em Karniadakis

Abstract: Dynamic graph embedding has emerged as a very effective technique for addressing diverse temporal graph analytic tasks (i.e., link prediction, node classification, recommender systems, anomaly detection, and graph generation) in various applications. Such temporal graphs exhibit heterogeneous transient dynamics, varying time intervals, and highly evolving node features throughout their evolution. Hence, incorporating long-range dependencies from the historical graph context plays a crucial role in accurately learning their temporal dynamics. In this paper, we develop a graph embedding model with uncertainty quantification, TransformerG2G, by exploiting the advanced transformer encoder to first learn intermediate node representations from its current state ($t$) and previous context (over timestamps [$t-1, t-l$], $l$ is the length of context). Moreover, we employ two projection layers to generate lower-dimensional multivariate Gaussian distributions as each node's latent embedding at timestamp $t$. We consider diverse benchmarks with varying levels of "novelty" as measured by the TEA (Temporal Edge Appearance) plots. Our experiments demonstrate that the proposed TransformerG2G model outperforms conventional multi-step methods and our prior work (DynG2G) in terms of both link prediction accuracy and computational efficiency, especially for high degree of novelty. Furthermore, the learned time-dependent attention weights across multiple graph snapshots reveal the development of an automatic adaptive time stepping enabled by the transformer. Importantly, by examining the attention weights, we can uncover temporal dependencies, identify influential elements, and gain insights into the complex interactions within the graph structure. For example, we identified a strong correlation between attention weights and node degree at the various stages of the graph topology evolution.

Submitted 22 December, 2023; v1 submitted 5 July, 2023; originally announced July 2023.

Comments: 19 pages, 8 figures

arXiv:2307.00379 [pdf, other]

Residual-based attention and connection to information bottleneck theory in PINNs

Authors: Sokratis J. Anagnostopoulos, Juan Diego Toscano, Nikolaos Stergiopulos, George Em Karniadakis

Abstract: Driven by the need for more efficient and seamless integration of physical models and data, physics-informed neural networks (PINNs) have seen a surge of interest in recent years. However, ensuring the reliability of their convergence and accuracy remains a challenge. In this work, we propose an efficient, gradient-less weighting scheme for PINNs, that accelerates the convergence of dynamic or static systems. This simple yet effective attention mechanism is a function of the evolving cumulative residuals and aims to make the optimizer aware of problematic regions at no extra computational cost or adversarial learning. We illustrate that this general method consistently achieves a relative $L^{2}$ error of the order of $10^{-5}$ using standard optimizers on typical benchmark cases of the literature. Furthermore, by investigating the evolution of weights during training, we identify two distinct learning phases reminiscent of the fitting and diffusion phases proposed by the information bottleneck (IB) theory. Subsequent gradient analysis supports this hypothesis by aligning the transition from high to low signal-to-noise ratio (SNR) with the transition from fitting to diffusion regimes of the adopted weights. This novel correlation between PINNs and IB theory could open future possibilities for understanding the underlying mechanisms behind the training and stability of PINNs and, more broadly, of neural operators.

Submitted 1 July, 2023; originally announced July 2023.

arXiv:2306.17648 [pdf, other]

doi 10.1137/23M1583375

Enhancing training of physics-informed neural networks using domain-decomposition based preconditioning strategies

Authors: Alena Kopaničáková, Hardik Kothari, George Em Karniadakis, Rolf Krause

Abstract: We propose to enhance the training of physics-informed neural networks (PINNs). To this aim, we introduce nonlinear additive and multiplicative preconditioning strategies for the widely used L-BFGS optimizer. The nonlinear preconditioners are constructed by utilizing the Schwarz domain-decomposition framework, where the parameters of the network are decomposed in a layer-wise manner. Through a series of numerical experiments, we demonstrate that both additive and multiplicative preconditioners significantly improve the convergence of the standard L-BFGS optimizer, while providing more accurate solutions of the underlying partial differential equations. Moreover, the additive preconditioner is inherently parallel, thus giving rise to a novel approach to model parallelism.

Submitted 27 December, 2023; v1 submitted 30 June, 2023; originally announced June 2023.

Comments: 23 pages, 7 figures

MSC Class: 90C30; 90C26; 90C06; 65M55; 68T07

arXiv:2306.15551 [pdf, other]

MyCrunchGPT: A chatGPT assisted framework for scientific machine learning

Authors: Varun Kumar, Leonard Gleyzer, Adar Kahana, Khemraj Shukla, George Em Karniadakis

Abstract: Scientific Machine Learning (SciML) has advanced recently across many different areas in computational science and engineering. The objective is to integrate data and physics seamlessly without the need of employing elaborate and computationally taxing data assimilation schemes. However, preprocessing, problem formulation, code generation, postprocessing and analysis are still time consuming and may prevent SciML from wide applicability in industrial applications and in digital twin frameworks. Here, we integrate the various stages of SciML under the umbrella of ChatGPT, to formulate MyCrunchGPT, which plays the role of a conductor orchestrating the entire workflow of SciML based on simple prompts by the user. Specifically, we present two examples that demonstrate the potential use of MyCrunchGPT in optimizing airfoils in aerodynamics, and in obtaining flow fields in various geometries in interactive mode, with emphasis on the validation stage. To demonstrate the flow of the MyCrunchGPT, and create an infrastructure that can facilitate a broader vision, we built a webapp based guided user interface, that includes options for a comprehensive summary report. The overall objective is to extend MyCrunchGPT to handle diverse problems in computational mechanics, design, optimization and controls, and general scientific computing tasks involved in SciML, hence using it as a research assistant tool but also as an educational tool. While here the examples focus on fluid mechanics, future versions will target solid mechanics and materials science, geophysics, systems biology and bioinformatics.

Submitted 31 July, 2023; v1 submitted 27 June, 2023; originally announced June 2023.

Comments: Updated title, abstract and added references

arXiv:2305.10706 [pdf, other]

doi 10.1016/j.cma.2023.116258

A Framework Based on Symbolic Regression Coupled with eXtended Physics-Informed Neural Networks for Gray-Box Learning of Equations of Motion from Data

Authors: Elham Kiyani, Khemraj Shukla, George Em Karniadakis, Mikko Karttunen

Abstract: We propose a framework and an algorithm to uncover the unknown parts of nonlinear equations directly from data. The framework is based on eXtended Physics-Informed Neural Networks (X-PINNs), domain decomposition in space-time, but we augment the original X-PINN method by imposing flux continuity across the domain interfaces. The well-known Allen-Cahn equation is used to demonstrate the approach. The Frobenius matrix norm is used to evaluate the accuracy of the X-PINN predictions and the results show excellent performance. In addition, symbolic regression is employed to determine the closed form of the unknown part of the equation from the data, and the results confirm the accuracy of the X-PINNs based approach. To test the framework in a situation resembling real-world data, random noise is added to the datasets to mimic scenarios such as the presence of thermal noise or instrument errors. The results show that the framework is stable against a significant amount of noise. As the final part, we determine the minimal amount of data required for training the neural network. The framework is able to predict the correct form and coefficients of the underlying dynamical equation when at least 50\% of the data is used for training.

Submitted 18 May, 2023; originally announced May 2023.

arXiv:2305.03184 [pdf, other]

doi 10.1016/j.jmps.2023.105424

A Generative Modeling Framework for Inferring Families of Biomechanical Constitutive Laws in Data-Sparse Regimes

Authors: Minglang Yin, Zongren Zou, Enrui Zhang, Cristina Cavinato, Jay D. Humphrey, George Em Karniadakis

Abstract: Quantifying biomechanical properties of the human vasculature could deepen our understanding of cardiovascular diseases. Standard nonlinear regression in constitutive modeling requires considerable high-quality data and an explicit form of the constitutive model as prior knowledge. By contrast, we propose a novel approach that combines generative deep learning with Bayesian inference to efficiently infer families of constitutive relationships in data-sparse regimes. Inspired by the concept of functional priors, we develop a generative adversarial network (GAN) that incorporates a neural operator as the generator and a fully-connected neural network as the discriminator. The generator takes a vector of noise conditioned on measurement data as input and yields the predicted constitutive relationship, which is scrutinized by the discriminator in the following step. We demonstrate that this framework can accurately estimate means and standard deviations of the constitutive relationships of the murine aorta using data collected either from model-generated synthetic data or ex vivo experiments for mice with genetic deficiencies. In addition, the framework learns priors of constitutive models without explicitly knowing their functional form, providing a new model-agnostic approach to learning hidden constitutive behaviors from data.

Submitted 4 May, 2023; originally announced May 2023.

arXiv:2304.13799 [pdf, other]

doi 10.1038/s41598-023-39989-4

Physics-informed neural networks for predicting gas flow dynamics and unknown parameters in diesel engines

Authors: Kamaljyoti Nath, Xuhui Meng, Daniel J Smith, George Em Karniadakis

Abstract: This paper presents a physics-informed neural network (PINN) approach for monitoring the health of diesel engines. The aim is to evaluate the engine dynamics, identify unknown parameters in a "mean value" model, and anticipate maintenance requirements. The PINN model is applied to diesel engines with a variable-geometry turbocharger and exhaust gas recirculation, using measurement data of selected state variables. The results demonstrate the ability of the PINN model to predict simultaneously both unknown parameters and dynamics accurately with both clean and noisy data, and the importance of the self-adaptive weight in the loss function for faster convergence. The input data for these simulations are derived from actual engine running conditions, while the outputs are simulated data, making this a practical case study of PINN's ability to predict real-world dynamical systems. The mean value model of the diesel engine incorporates empirical formulae to represent certain states, but these formulae may not be generalizable to other engines. To address this, the study considers the use of deep neural networks (DNNs) in addition to the PINN model. The DNNs are trained using laboratory test data and are used to model the engine-specific empirical formulae in the mean value model, allowing for a more flexible and adaptive representation of the engine's states. In other words, the mean value model uses both the PINN model and the DNNs to represent the engine's states, with the PINN providing a physics-based understanding of the engine's overall dynamics and the DNNs offering a more engine-specific and adaptive representation of the empirical formulae. By combining these two approaches, the study aims to offer a comprehensive and versatile approach to monitoring the health and performance of diesel engines.

Submitted 5 August, 2023; v1 submitted 26 April, 2023; originally announced April 2023.

arXiv:2304.07599 [pdf, other]

Learning in latent spaces improves the predictive accuracy of deep neural operators

Authors: Katiana Kontolati, Somdatta Goswami, George Em Karniadakis, Michael D. Shields

Abstract: Operator regression provides a powerful means of constructing discretization-invariant emulators for partial-differential equations (PDEs) describing physical systems. Neural operators specifically employ deep neural networks to approximate mappings between infinite-dimensional Banach spaces. As data-driven models, neural operators require the generation of labeled observations, which in cases of complex high-fidelity models result in high-dimensional datasets containing redundant and noisy features, which can hinder gradient-based optimization. Mapping these high-dimensional datasets to a low-dimensional latent space of salient features can make it easier to work with the data and also enhance learning. In this work, we investigate the latent deep operator network (L-DeepONet), an extension of standard DeepONet, which leverages latent representations of high-dimensional PDE input and output functions identified with suitable autoencoders. We illustrate that L-DeepONet outperforms the standard approach in terms of both accuracy and computational efficiency across diverse time-dependent PDEs, e.g., modeling the growth of fracture in brittle materials, convective fluid flows, and large-scale atmospheric flows exhibiting multiscale dynamical features.

Submitted 15 April, 2023; originally announced April 2023.

Comments: 22 pages, 12 figures

arXiv:2304.00567 [pdf, other]

doi 10.1007/s10489-023-05178-z

Real-Time Prediction of Gas Flow Dynamics in Diesel Engines using a Deep Neural Operator Framework

Authors: Varun Kumar, Somdatta Goswami, Daniel J. Smith, George Em Karniadakis

Abstract: We develop a data-driven deep neural operator framework to approximate multiple output states for a diesel engine and generate real-time predictions with reasonable accuracy. As emission norms become more stringent, the need for fast and accurate models that enable analysis of system behavior has become an essential requirement for system development. The fast transient processes involved in the operation of a combustion engine make it difficult to develop accurate physics-based models for such systems. As an alternative to physics-based models, we develop an operator-based regression model (DeepONet) to learn the relevant output states for a mean-value gas flow engine model using the engine operating conditions as input variables. We have adopted a mean-value model as a benchmark for comparison, simulated using Simulink. The developed approach necessitates using the initial conditions of the output states to predict the accurate sequence over the temporal domain. To this end, a sequence-to-sequence approach is embedded into the proposed framework. The accuracy of the model is evaluated by comparing the prediction output to ground truth generated from the Simulink model. The maximum $\mathcal L_2$ relative error observed was approximately $6.5\%$. The sensitivity of the DeepONet model is evaluated under simulated noise conditions and the model shows relatively low sensitivity to noise. The uncertainty in model prediction is further assessed by using a mean ensemble approach. The worst-case error at the $(\mu + 2\sigma)$ boundary was found to be $12\%$.
The proposed framework provides the ability to predict output states in real-time and enables data-driven learning of complex input-output operator mapping. As a result, this model can be applied during initial development stages, where accurate models may not be available.

Submitted 6 July, 2023; v1 submitted 2 April, 2023; originally announced April 2023.

Comments: Updated manuscript title to better reflect this work and field of study

Journal ref: Applied Intelligence, 2023

arXiv:2303.12928 [pdf, other]

Leveraging Multi-time Hamilton-Jacobi PDEs for Certain Scientific Machine Learning Problems

Authors: Paula Chen, Tingwei Meng, Zongren Zou, Jérôme Darbon, George Em Karniadakis

Abstract: Hamilton-Jacobi partial differential equations (HJ PDEs) have deep connections with a wide range of fields, including optimal control, differential games, and imaging sciences. By considering the time variable to be a higher dimensional quantity, HJ PDEs can be extended to the multi-time case. In this paper, we establish a novel theoretical connection between specific optimization problems arising in machine learning and the multi-time Hopf formula, which corresponds to a representation of the solution to certain multi-time HJ PDEs. Through this connection, we increase the interpretability of the training process of certain machine learning applications by showing that when we solve these learning problems, we also solve a multi-time HJ PDE and, by extension, its corresponding optimal control problem. As a first exploration of this connection, we develop the relation between the regularized linear regression problem and the Linear Quadratic Regulator (LQR). We then leverage our theoretical connection to adapt standard LQR solvers (namely, those based on the Riccati ordinary differential equations) to design new training approaches for machine learning. Finally, we provide some numerical examples that demonstrate the versatility and possible computational advantages of our Riccati-based approach in the context of continual learning, post-training calibration, transfer learning, and sparse dynamics identification.

Submitted 8 December, 2023; v1 submitted 22 March, 2023; originally announced March 2023.

MSC Class: 35F21; 49N05; 49N10; 68T05; 35B37

arXiv:2303.10913 [pdf, other]

Bi-orthogonal fPINN: A physics-informed neural network method for solving time-dependent stochastic fractional PDEs

Authors: Lei Ma, Rong xin Li, Fanhai Zeng, Ling Guo, George Em Karniadakis

Abstract: Fractional partial differential equations (FPDEs) can effectively represent anomalous transport and nonlocal interactions. However, inherent uncertainties arise naturally in real applications due to random forcing or unknown material properties. Mathematical models considering nonlocal interactions with uncertainty quantification can be formulated as stochastic fractional partial differential equations (SFPDEs). There are many challenges in solving SFPDEs numerically, especially for long-time integration since such problems are high-dimensional and nonlocal. Here, we combine the bi-orthogonal (BO) method for representing stochastic processes with physics-informed neural networks (PINNs) for solving partial differential equations to formulate the bi-orthogonal PINN method (BO-fPINN) for solving time-dependent SFPDEs. Specifically, we introduce a deep neural network for the stochastic solution of the time-dependent SFPDEs, and include the BO constraints in the loss function following a weak formulation. Since automatic differentiation is not currently applicable to fractional derivatives, we employ discretization on a grid to compute the fractional derivatives of the neural network output. The weak formulation loss function of the BO-fPINN method can overcome some drawbacks of the BO methods and thus can be used to solve SFPDEs with eigenvalue crossings. Moreover, the BO-fPINN method can be used for inverse SFPDEs with the same framework and same computational complexity as for forward problems.
We demonstrate the effectiveness of the BO-fPINN method for different benchmark problems. The results demonstrate the flexibility and efficiency of the proposed method, especially for inverse problems.

Submitted 20 March, 2023; originally announced March 2023.

arXiv:2303.10528 [pdf, other]

LNO: Laplace Neural Operator for Solving Differential Equations

Authors: Qianying Cao, Somdatta Goswami, George Em Karniadakis

Abstract: We introduce the Laplace neural operator (LNO), which leverages the Laplace transform to decompose the input space. Unlike the Fourier Neural Operator (FNO), LNO can handle non-periodic signals, account for transient responses, and exhibit exponential convergence. LNO incorporates the pole-residue relationship between the input and the output space, enabling greater interpretability and improved generalization ability. Herein, we demonstrate the superior approximation accuracy of a single Laplace layer in LNO over four Fourier modules in FNO in approximating the solutions of three ODEs (Duffing oscillator, driven gravity pendulum, and Lorenz system) and three PDEs (Euler-Bernoulli beam, diffusion equation, and reaction-diffusion system). Notably, LNO outperforms FNO in capturing transient responses in undamped scenarios. For the linear Euler-Bernoulli beam and diffusion equation, LNO's exact representation of the pole-residue formulation yields significantly better results than FNO. For the nonlinear reaction-diffusion system, LNO's errors are smaller than those of FNO, demonstrating the effectiveness of using system poles and residues as network parameters for operator learning. Overall, our results suggest that LNO represents a promising new approach for learning neural operators that map functions between infinite-dimensional spaces.

Submitted 30 May, 2023; v1 submitted 18 March, 2023; originally announced March 2023.

Comments: 18 pages, 8 figures, 2 tables

arXiv:2303.08891 [pdf, other]

ViTO: Vision Transformer-Operator

Authors: Oded Ovadia, Adar Kahana, Panos Stinis, Eli Turkel, George Em Karniadakis

Abstract: We combine vision transformers with operator learning to solve diverse inverse problems described by partial differential equations (PDEs). Our approach, named ViTO, combines a U-Net based architecture with a vision transformer. We apply ViTO to solve inverse PDE problems of increasing complexity, namely for the wave equation, the Navier-Stokes equations and the Darcy equation. We focus on the more challenging case of super-resolution, where the input dataset for the inverse problem is at a significantly coarser resolution than the output. The results we obtain are comparable to or exceed the leading operator network benchmarks in terms of accuracy. Furthermore, ViTO's architecture has a small number of trainable parameters (less than 10% of the leading competitor), resulting in a performance speed-up of over 5x when averaged over the various test cases.

Submitted 15 March, 2023; originally announced March 2023.

Report number: PNNL-SA-182861

Showing 1–50 of 197 results for author: Karniadakis, G E