Search | arXiv e-print repository

arXiv:2403.02054 [pdf, other]

Large Language Model-Based Evolutionary Optimizer: Reasoning with elitism

Authors: Shuvayan Brahmachary, Subodh M. Joshi, Aniruddha Panda, Kaushik Koneripalli, Arun Kumar Sagotra, Harshil Patel, Ankush Sharma, Ameya D. Jagtap, Kaushic Kalyanaraman

Abstract: Large Language Models (LLMs) have demonstrated remarkable reasoning abilities, prompting interest in their application as black-box optimizers. This paper asserts that LLMs possess the capability for zero-shot optimization across diverse scenarios, including multi-objective and high-dimensional problems. We introduce a novel population-based method for numerical optimization using LLMs called Language-Model-Based Evolutionary Optimizer (LEO). Our hypothesis is supported through numerical examples, spanning benchmark and industrial engineering problems such as supersonic nozzle shape optimization, heat transfer, and windfarm layout optimization. We compare our method to several gradient-based and gradient-free optimization approaches. While LLMs yield comparable results to state-of-the-art methods, their imaginative nature and propensity to hallucinate demand careful handling. We provide practical guidelines for obtaining reliable answers from LLMs and discuss method limitations and potential research directions.

Submitted 4 March, 2024; originally announced March 2024.

arXiv:2401.08886 [pdf, other]

doi 10.1016/j.cma.2024.116996

RiemannONets: Interpretable Neural Operators for Riemann Problems

Authors: Ahmad Peyvan, Vivek Oommen, Ameya D. Jagtap, George Em Karniadakis

Abstract: Developing the proper representations for simulating high-speed flows with strong shock waves, rarefactions, and contact discontinuities has been a long-standing question in numerical analysis. Herein, we employ neural operators to solve Riemann problems encountered in compressible flows for extreme pressure jumps (up to $10^{10}$ pressure ratio). In particular, we first consider the DeepONet that we train in a two-stage process, following the recent work of \cite{lee2023training}, wherein, in the first stage, a basis is extracted from the trunk net, which is orthonormalized and subsequently used in the second stage to train the branch net. This simple modification of DeepONet has a profound effect on its accuracy, efficiency, and robustness and leads to very accurate solutions to Riemann problems compared to the vanilla version. It also enables us to interpret the results physically as the hierarchical data-driven produced basis reflects all the flow features that would otherwise be introduced using ad hoc feature expansion layers. We also compare the results with another neural operator based on the U-Net for low, intermediate, and very high-pressure ratios that are very accurate for Riemann problems, especially for large pressure ratios, due to their multiscale nature but computationally more expensive. Overall, our study demonstrates that simple neural network architectures, if properly pre-trained, can achieve very accurate solutions of Riemann problems for real-time forecasting.
The source code, along with its corresponding data, can be found at the following URL: https://github.com/apey236/RiemannONet/tree/main

Submitted 16 April, 2024; v1 submitted 16 January, 2024; originally announced January 2024.

arXiv:2309.11891 [pdf, other]

Heart Rate Detection Using an Event Camera

Authors: Aniket Jagtap, RamaKrishna Venkatesh Saripalli, Joe Lemley, Waseem Shariff, Alan F. Smeaton

Abstract: Event cameras, also known as neuromorphic cameras, are an emerging technology that offers advantages over traditional shutter and frame-based cameras, including high temporal resolution, low power consumption, and selective data acquisition. In this study, we propose to harness the capabilities of event-based cameras to capture subtle changes in the surface of the skin caused by the pulsatile flow of blood in the wrist region. We investigate whether an event camera could be used for continuous noninvasive monitoring of heart rate (HR). Event camera video data from 25 participants, comprising varying age groups and skin colours, was collected and analysed. Ground-truth HR measurements obtained using conventional methods were used to evaluate the accuracy of automatic detection of HR from event camera data. Our experimental results and comparison to the performance of other non-contact HR measurement methods demonstrate the feasibility of using event cameras for pulse detection. We also acknowledge the challenges and limitations of our method, such as light-induced flickering and the sub-conscious but naturally-occurring tremors of an individual during data capture.

Submitted 21 September, 2023; originally announced September 2023.

Comments: Dataset available at https://doi.org/10.6084/m9.figshare.24039501.v1

arXiv:2309.10117 [pdf, other]

Deep smoothness WENO scheme for two-dimensional hyperbolic conservation laws: A deep learning approach for learning smoothness indicators

Authors: Tatiana Kossaczká, Ameya D. Jagtap, Matthias Ehrhardt

Abstract: In this paper, we introduce an improved version of the fifth-order weighted essentially non-oscillatory (WENO) shock-capturing scheme by incorporating deep learning techniques. The established WENO algorithm is improved by training a compact neural network to adjust the smoothness indicators within the WENO scheme. This modification enhances the accuracy of the numerical results, particularly near abrupt shocks. Unlike previous deep learning-based methods, no additional post-processing steps are necessary for maintaining consistency. We demonstrate the superiority of our new approach using several examples from the literature for the two-dimensional Euler equations of gas dynamics. Through intensive study of these test problems, which involve various shocks and rarefaction waves, the new technique is shown to outperform traditional fifth-order WENO schemes, especially in cases where the numerical solutions exhibit excessive diffusion or overshoot around shocks.

Submitted 18 September, 2023; originally announced September 2023.

Comments: 33 pages, 18 figures

MSC Class: 65M06; 68T05; 76M20

arXiv:2302.14227 [pdf, other]

doi 10.1016/j.jcp.2023.112464

A unified scalable framework for causal sweeping strategies for Physics-Informed Neural Networks (PINNs) and their temporal decompositions

Authors: Michael Penwarden, Ameya D. Jagtap, Shandian Zhe, George Em Karniadakis, Robert M. Kirby

Abstract: Physics-informed neural networks (PINNs) as a means of solving partial differential equations (PDE) have garnered much attention in the Computational Science and Engineering (CS&E) world. However, a recent topic of interest is exploring various training (i.e., optimization) challenges - in particular, arriving at poor local minima in the optimization landscape results in a PINN approximation giving an inferior, and sometimes trivial, solution when solving forward time-dependent PDEs with no data. This problem is also found in, and in some sense more difficult, with domain decomposition strategies such as temporal decomposition using XPINNs. We furnish examples and explanations for different training challenges, their cause, and how they relate to information propagation and temporal decomposition. We then propose a new stacked-decomposition method that bridges the gap between time-marching PINNs and XPINNs. We also introduce significant computational speed-ups by using transfer learning concepts to initialize subnetworks in the domain and loss tolerance-based propagation for the subdomains. Finally, we formulate a new time-sweeping collocation point algorithm inspired by the previous PINNs causality literature, which our framework can still describe, and provides a significant computational speed-up via reduced-cost collocation point segmentation.
The proposed methods form our unified framework, which overcomes training challenges in PINNs and XPINNs for time-dependent PDEs by respecting the causality in multiple forms and improving scalability by limiting the computation required per optimization iteration. Finally, we provide numerical results for these methods on baseline PDE problems for which unmodified PINNs and XPINNs struggle to train.

Submitted 18 September, 2023; v1 submitted 27 February, 2023; originally announced February 2023.

Journal ref: Journal of Computational Physics, 493, 2023, 112464

arXiv:2302.12645 [pdf, other]

Learning stiff chemical kinetics using extended deep neural operators

Authors: Somdatta Goswami, Ameya D. Jagtap, Hessam Babaee, Bryan T. Susi, George Em Karniadakis

Abstract: We utilize neural operators to learn the solution propagator for the challenging chemical kinetics equation. Specifically, we apply the deep operator network (DeepONet) along with its extensions, such as the autoencoder-based DeepONet and the newly proposed Partition-of-Unity (PoU-) DeepONet to study a range of examples, including the ROBERS problem with three species, the POLLU problem with 25 species, pure kinetics of the syngas skeletal model for $CO/H_2$ burning, which contains 11 species and 21 reactions and finally, a temporally developing planar $CO/H_2$ jet flame (turbulent flame) using the same syngas mechanism. We have demonstrated the advantages of the proposed approach through these numerical examples. Specifically, to train the DeepONet for the syngas model, we solve the skeletal kinetic model for different initial conditions. In the first case, we parametrize the initial conditions based on equivalence ratios and initial temperature values. In the second case, we perform a direct numerical simulation of a two-dimensional temporally developing $CO/H_2$ jet flame. Then, we initialize the kinetic model by the thermochemical states visited by a subset of grid points at different time snapshots. Stiff problems are computationally expensive to solve with traditional stiff solvers. Thus, this work aims to develop a neural operator-based surrogate model to solve stiff chemical kinetics.
The operator, once trained offline, can accurately integrate the thermochemical state for arbitrarily large time advancements, leading to significant computational gains compared to stiff integration schemes.

Submitted 23 February, 2023; originally announced February 2023.

Comments: 21 pages, 11 figures

arXiv:2211.08939 [pdf, other]

doi 10.1016/j.engappai.2023.107183

Augmented Physics-Informed Neural Networks (APINNs): A gating network-based soft domain decomposition methodology

Authors: Zheyuan Hu, Ameya D. Jagtap, George Em Karniadakis, Kenji Kawaguchi

Abstract: In this paper, we propose the augmented physics-informed neural network (APINN), which adopts soft and trainable domain decomposition and flexible parameter sharing to further improve the extended PINN (XPINN) as well as the vanilla PINN methods. In particular, a trainable gate network is employed to mimic the hard decomposition of XPINN, which can be flexibly fine-tuned for discovering a potentially better partition. It weight-averages several sub-nets as the output of APINN. APINN does not require complex interface conditions, and its sub-nets can take advantage of all training samples rather than just part of the training data in their subdomains. Lastly, each sub-net shares part of the common parameters to capture the similar components in each decomposed function. Furthermore, following the PINN generalization theory in Hu et al. [2021], we show that APINN can improve generalization by proper gate network initialization and general domain & function decomposition. Extensive experiments on different types of PDEs demonstrate how APINN improves the PINN and XPINN methods. Specifically, we present examples where XPINN performs similarly to or worse than PINN, so that APINN can significantly improve both. We also show cases where XPINN is already better than PINN, so APINN can still slightly improve XPINN. Furthermore, we visualize the optimized gating networks and their optimization trajectories, and connect them with their performance, which helps discover the possibly optimal decomposition.
Interestingly, if initialized by different decomposition, the performances of corresponding APINNs can differ drastically. This, in turn, shows the potential to design an optimal domain decomposition for the differential equation problem under consideration.

Submitted 29 September, 2023; v1 submitted 16 November, 2022; originally announced November 2022.

Comments: Accepted at Engineering Applications of Artificial Intelligence (EAAI)

Journal ref: Engineering Applications of Artificial Intelligence, Volume 126, Part B, November 2023, 107183

arXiv:2209.02681 [pdf, other]

How important are activation functions in regression and classification? A survey, performance comparison, and future directions

Authors: Ameya D. Jagtap, George Em Karniadakis

Abstract: Inspired by biological neurons, the activation functions play an essential part in the learning process of any artificial neural network commonly used in many real-world problems. Various activation functions have been proposed in the literature for classification as well as regression tasks. In this work, we survey the activation functions that have been employed in the past as well as the current state-of-the-art. In particular, we present various developments in activation functions over the years and the advantages as well as disadvantages or limitations of these activation functions. We also discuss classical (fixed) activation functions, including rectifier units, and adaptive activation functions. In addition to discussing the taxonomy of activation functions based on characterization, a taxonomy of activation functions based on applications is presented. To this end, the systematic comparison of various fixed and adaptive activation functions is performed for classification data sets such as MNIST, CIFAR-10, and CIFAR-100. In recent years, a physics-informed machine learning framework has emerged for solving problems related to scientific computations. For this purpose, we also discuss various requirements for activation functions that have been used in the physics-informed machine learning framework. Furthermore, various comparisons are made among different fixed and adaptive activation functions using various machine learning libraries such as TensorFlow, PyTorch, and JAX.

Submitted 28 December, 2022; v1 submitted 6 September, 2022; originally announced September 2022.

Comments: 28 pages, 15 figures

arXiv:2205.10368 [pdf, other]

Automatic Generation of Synthetic Colonoscopy Videos for Domain Randomization

Authors: Abhishek Dinkar Jagtap, Mattias Heinrich, Marian Himstedt

Abstract: An increasing number of colonoscopic guidance and assistance systems rely on machine learning algorithms which require a large amount of high-quality training data. In order to ensure high performance, the latter has to resemble a substantial portion of possible configurations. This particularly addresses varying anatomy, mucosa appearance and image sensor characteristics which are likely deteriorated by motion blur and inadequate illumination. The limited amount of readily available training data makes it difficult to account for all of these possible configurations, which results in reduced generalization capabilities of machine learning models. We propose an exemplary solution for synthesizing colonoscopy videos with substantial appearance and anatomical variations, which enables learning discriminative domain-randomized representations of the interior colon while mimicking real-world settings.

Submitted 20 May, 2022; originally announced May 2022.

Comments: 4 pages, 5 figures

arXiv:2203.09346 [pdf, other]

Error estimates for physics informed neural networks approximating the Navier-Stokes equations

Authors: Tim De Ryck, Ameya D. Jagtap, Siddhartha Mishra

Abstract: We prove rigorous bounds on the errors resulting from the approximation of the incompressible Navier-Stokes equations with (extended) physics informed neural networks. We show that the underlying PDE residual can be made arbitrarily small for tanh neural networks with two hidden layers. Moreover, the total error can be estimated in terms of the training error, network size and number of quadrature points. The theory is illustrated with numerical experiments.

Submitted 2 February, 2023; v1 submitted 17 March, 2022; originally announced March 2022.

arXiv:2202.11821 [pdf, other]

doi 10.1016/j.jcp.2022.111402

Physics-informed neural networks for inverse problems in supersonic flows

Authors: Ameya D. Jagtap, Zhiping Mao, Nikolaus Adams, George Em Karniadakis

Abstract: Accurate solutions to inverse supersonic compressible flow problems are often required for designing specialized aerospace vehicles. In particular, we consider the problem where we have data available for density gradients from Schlieren photography as well as data at the inflow and part of wall boundaries. These inverse problems are notoriously difficult and traditional methods may not be adequate to solve such ill-posed inverse problems. To this end, we employ the physics-informed neural networks (PINNs) and its extended version, extended PINNs (XPINNs), where domain decomposition allows deploying locally powerful neural networks in each subdomain, which can provide additional expressivity in subdomains, where a complex solution is expected. Apart from the governing compressible Euler equations, we also enforce the entropy conditions in order to obtain viscosity solutions. Moreover, we enforce positivity conditions on density and pressure. We consider inverse problems involving two-dimensional expansion waves, two-dimensional oblique and bow shock waves. We compare solutions obtained by PINNs and XPINNs and invoke some theoretical results that can be used to decide on the generalization errors of the two methods.

Submitted 23 February, 2022; originally announced February 2022.

Comments: 19 pages, 20 figures

arXiv:2202.02899 [pdf, other]

Deep learning of inverse water waves problems using multi-fidelity data: Application to Serre-Green-Naghdi equations

Authors: Ameya D. Jagtap, Dimitrios Mitsotakis, George Em Karniadakis

Abstract: We consider strongly-nonlinear and weakly-dispersive surface water waves governed by equations of Boussinesq type, known as the Serre-Green-Naghdi system; it describes future states of the free water surface and depth averaged horizontal velocity, given their initial state. The lack of knowledge of the velocity field as well as the initial states provided by measurements lead to an ill-posed problem that cannot be solved by traditional techniques. To this end, we employ physics-informed neural networks (PINNs) to generate solutions to such ill-posed problems using only data of the free surface elevation and depth of the water. PINNs can readily incorporate the physical laws and the observational data, thereby enabling inference of the physical quantities of interest. In the present study, both experimental and synthetic (generated by numerical methods) training data are used to train PINNs. Furthermore, multi-fidelity data are used to solve the inverse water wave problem by leveraging both high- and low-fidelity data sets. The applicability of the PINN methodology for the estimation of the impact of water waves onto solid obstacles is demonstrated after deriving the corresponding equations. The present methodology can be employed to efficiently design offshore structures such as oil platforms, wind turbines, etc. by solving the corresponding ill-posed inverse water waves problem.

Submitted 6 February, 2022; originally announced February 2022.

arXiv:2109.09444 [pdf, other]

doi 10.1137/21M1447039

When Do Extended Physics-Informed Neural Networks (XPINNs) Improve Generalization?

Authors: Zheyuan Hu, Ameya D. Jagtap, George Em Karniadakis, Kenji Kawaguchi

Abstract: Physics-informed neural networks (PINNs) have become a popular choice for solving high-dimensional partial differential equations (PDEs) due to their excellent approximation power and generalization ability. Recently, Extended PINNs (XPINNs) based on domain decomposition methods have attracted considerable attention due to their effectiveness in modeling multiscale and multiphysics problems and their parallelization. However, theoretical understanding on their convergence and generalization properties remains unexplored. In this study, we take an initial step towards understanding how and when XPINNs outperform PINNs. Specifically, for general multi-layer PINNs and XPINNs, we first provide a prior generalization bound via the complexity of the target functions in the PDE problem, and a posterior generalization bound via the posterior matrix norms of the networks after optimization. Moreover, based on our bounds, we analyze the conditions under which XPINNs improve generalization. Concretely, our theory shows that the key building block of XPINN, namely the domain decomposition, introduces a tradeoff for generalization. On the one hand, XPINNs decompose the complex PDE solution into several simple parts, which decreases the complexity needed to learn each part and boosts generalization. On the other hand, decomposition leads to less training data being available in each subdomain, and hence such model is typically prone to overfitting and may become less generalizable.
Empirically, we choose five PDEs to show when XPINNs perform better than, similar to, or worse than PINNs, hence demonstrating and justifying our new theory.

Submitted 18 October, 2022; v1 submitted 20 September, 2021; originally announced September 2021.

Comments: Published in SIAM Journal on Scientific Computing (SISC)

Journal ref: SIAM Journal on Scientific Computing Vol. 44, Iss. 5 (2022)

arXiv:2105.09513 [pdf, other]

Deep Kronecker neural networks: A general framework for neural networks with adaptive activation functions

Authors: Ameya D. Jagtap, Yeonjong Shin, Kenji Kawaguchi, George Em Karniadakis

Abstract: We propose a new type of neural networks, Kronecker neural networks (KNNs), that form a general framework for neural networks with adaptive activation functions. KNNs employ the Kronecker product, which provides an efficient way of constructing a very wide network while keeping the number of parameters low. Our theoretical analysis reveals that under suitable conditions, KNNs induce a faster decay of the loss than that by the feed-forward networks. This is also empirically verified through a set of computational examples. Furthermore, under certain technical assumptions, we establish global convergence of gradient descent for KNNs. As a specific case, we propose the Rowdy activation function that is designed to get rid of any saturation region by injecting sinusoidal fluctuations, which include trainable parameters. The proposed Rowdy activation function can be employed in any neural network architecture like feed-forward neural networks, Recurrent neural networks, Convolutional neural networks etc. The effectiveness of KNNs with Rowdy activation is demonstrated through various computational experiments including function approximation using feed-forward neural networks, solution inference of partial differential equations using the physics-informed neural networks, and standard deep learning benchmark problems using convolutional and fully-connected neural networks.

Submitted 19 October, 2021; v1 submitted 20 May, 2021; originally announced May 2021.

Comments: 24 pages, 16 figures

arXiv:2104.10013 [pdf, other]

Parallel Physics-Informed Neural Networks via Domain Decomposition

Authors: Khemraj Shukla, Ameya D. Jagtap, George Em Karniadakis

Abstract: We develop a distributed framework for the physics-informed neural networks (PINNs) based on two recent extensions, namely conservative PINNs (cPINNs) and extended PINNs (XPINNs), which employ domain decomposition in space and in time-space, respectively. This domain decomposition endows cPINNs and XPINNs with several advantages over the vanilla PINNs, such as parallelization capacity, large representation capacity, efficient hyperparameter tuning, and is particularly effective for multi-scale and multi-physics problems. Here, we present a parallel algorithm for cPINNs and XPINNs constructed with a hybrid programming model described by MPI $+$ X, where X $\in \{\text{CPUs},~\text{GPUs}\}$. The main advantage of cPINN and XPINN over the more classical data and model parallel approaches is the flexibility of optimizing all hyperparameters of each neural network separately in each subdomain. We compare the performance of distributed cPINNs and XPINNs for various forward problems, using both weak and strong scalings. Our results indicate that for space domain decomposition, cPINNs are more efficient in terms of communication cost but XPINNs provide greater flexibility as they can also handle time-domain decomposition for any differential equations, and can deal with any arbitrarily shaped complex subdomains. To this end, we also present an application of the parallel XPINN method for solving an inverse diffusion problem with variable conductivity on the United States map, using ten regions as subdomains.

Submitted 8 September, 2021; v1 submitted 20 April, 2021; originally announced April 2021.

Comments: 23 pages, 13 figures

arXiv:2103.14104 [pdf, other]

doi 10.1109/MSP.2021.3118904

A physics-informed neural network for quantifying the microstructure properties of polycrystalline Nickel using ultrasound data

Authors: Khemraj Shukla, Ameya D. Jagtap, James L. Blackshire, Daniel Sparkman, George Em Karniadakis

Abstract: We employ physics-informed neural networks (PINNs) to quantify the microstructure of a polycrystalline Nickel by computing the spatial variation of compliance coefficients (compressibility, stiffness and rigidity) of the material. The PINN is supervised with realistic ultrasonic surface acoustic wavefield data acquired at an ultrasonic frequency of 5 MHz for the polycrystalline material. The ultrasonic wavefield data is represented as a deformation on the top surface of the material with the deformation measured using the method of laser vibrometry. The ultrasonic data is further complemented with wavefield data generated using a finite element based solver. The neural network is physically-informed by the in-plane and out-of-plane elastic wave equations and its convergence is accelerated using adaptive activation functions. The overarching goal of this work is to infer the spatial variation of compliance coefficients of materials using PINNs, which for ultrasound involves the spatially varying speed of the elastic waves. More broadly, the resulting PINN based surrogate model shows a promising approach for solving ill-posed inverse problems, often encountered in the non-destructive evaluation of materials.

Submitted 5 October, 2021; v1 submitted 25 March, 2021; originally announced March 2021.

Comments: 18 pages, 5 figures

arXiv:1909.12228 [pdf, other]

doi 10.1098/rspa.2020.0334

Locally adaptive activation functions with slope recovery term for deep and physics-informed neural networks

Authors: Ameya D. Jagtap, Kenji Kawaguchi, George Em Karniadakis

Abstract: We propose two approaches of locally adaptive activation functions namely, layer-wise and neuron-wise locally adaptive activation functions, which improve the performance of deep and physics-informed neural networks. The local adaptation of activation function is achieved by introducing a scalable parameter in each layer (layer-wise) and for every neuron (neuron-wise) separately, and then optimizing it using a variant of stochastic gradient descent algorithm. In order to further increase the training speed, an activation slope based slope recovery term is added in the loss function, which further accelerates convergence, thereby reducing the training cost. On the theoretical side, we prove that in the proposed method, the gradient descent algorithms are not attracted to sub-optimal critical points or local minima under practical conditions on the initialization and learning rate, and that the gradient dynamics of the proposed method is not achievable by base methods with any (adaptive) learning rates. We further show that the adaptive activation methods accelerate the convergence by implicitly multiplying conditioning matrices to the gradient of the base method without any explicit computation of the conditioning matrix and the matrix-vector product. The different adaptive activation functions are shown to induce different implicit conditioning matrices. Furthermore, the proposed methods with the slope recovery are shown to accelerate the training process.

Submitted 17 June, 2020; v1 submitted 25 September, 2019; originally announced September 2019.

Comments: 19 pages, 8 figures

arXiv:1906.01170 [pdf, other]

doi 10.1016/j.jcp.2019.109136

Adaptive activation functions accelerate convergence in deep and physics-informed neural networks

Authors: Ameya D. Jagtap, George Em Karniadakis

Abstract: We employ adaptive activation functions for regression in deep and physics-informed neural networks (PINNs) to approximate smooth and discontinuous functions as well as solutions of linear and nonlinear partial differential equations. In particular, we solve the nonlinear Klein-Gordon equation, which has smooth solutions, the nonlinear Burgers equation, which can admit high gradient solutions, and the Helmholtz equation. We introduce a scalable hyper-parameter in the activation function, which can be optimized to achieve best performance of the network as it changes dynamically the topology of the loss function involved in the optimization process. The adaptive activation function has better learning capabilities than the traditional one (fixed activation) as it improves greatly the convergence rate, especially at early training, as well as the solution accuracy. To better understand the learning process, we plot the neural network solution in the frequency domain to examine how the network captures successively different frequency bands present in the solution. We consider both forward problems, where the approximate solutions are obtained, as well as inverse problems, where parameters involved in the governing equation are identified. Our simulation results show that the proposed method is a very simple and effective approach to increase the efficiency, robustness and accuracy of the neural network approximation of nonlinear functions as well as solutions of partial differential equations, especially for forward problems.

Submitted 3 June, 2019; originally announced June 2019.

Comments: 24 pages, 21 figures

arXiv:1905.06052 [pdf]

Survival of the Fittest in PlayerUnknown BattleGround

Authors: Brij Rokad, Tushar Karumudi, Omkar Acharya, Akshay Jagtap

Abstract: The goal of this paper was to predict the placement in the multiplayer game PUBG (playerunknown battleground). In the game, up to one hundred players parachute onto an island and scavenge for weapons and equipment to kill others, while avoiding getting killed themselves. The available safe area of the game map decreases in size over time, directing surviving players into tighter areas to force encounters. The last player or team standing wins the round. In this paper specifically, we have tried to predict the placement of the player in the ultimate survival test. The data set has been taken from Kaggle. The entire dataset has 29 attributes, which are categorized to 1 label (winPlacePerc); the training set has 4.5 million instances and the testing set has 1.9 million. winPlacePerc is a continuous category, which makes it harder to predict the survival of the fittest. To overcome this problem, we have applied multiple machine learning models to find the optimum prediction. The models consist of LightGBM Regression (Light Gradient Boosting Machine Regression), MultiLayer Perceptron, M5P (improvement on C4.5) and Random Forest. To measure the error rate, Mean Absolute Error has been used. With the final prediction we have achieved MAE of 0.02047, 0.065, 0.0592 and 0634 respectively.

Submitted 15 May, 2019; originally announced May 2019.

Comments: 14 pages, 9 figures

arXiv:1611.03338 [pdf, other]

Method of Relaxed Streamline Upwinding for Hyperbolic Conservation Laws

Authors: Ameya D. Jagtap

Abstract: In this work a new finite element based Method of Relaxed Streamline Upwinding is proposed to solve hyperbolic conservation laws. The formulation of the proposed scheme is based on a relaxation system, which replaces hyperbolic conservation laws by a semi-linear system with a stiff source term, also called the relaxation term. The advantage of the semi-linear system is that the nonlinearity in the convection term is pushed towards the source term on the right-hand side, which can be handled with ease. Six symmetric discrete velocity models are introduced in two dimensions which symmetrically spread the foot of the characteristics in all four quadrants, thereby taking information symmetrically from all directions. The proposed scheme gives exact diffusion vectors which are very simple. Moreover, the formulation is easily extendable from scalar to vector conservation laws. Various test cases are solved for Burgers equation (with convex and non-convex flux functions), Euler equations and shallow water equations in one and two dimensions which demonstrate the robustness and accuracy of the proposed scheme. New test cases are proposed for Burgers equation, Euler and shallow water equations. An exact solution is given for a two-dimensional Burgers test case which involves a normal discontinuity and a series of oblique discontinuities. Error analysis of the proposed scheme shows an optimal convergence rate. Moreover, spectral stability analysis gives an implicit expression for the critical time step.

Submitted 3 June, 2019; v1 submitted 9 November, 2016; originally announced November 2016.

Comments: 33 pages, 27 figures

arXiv:1505.04089 [pdf, other]

Explicit and Implicit Kinetic Streamlined-Upwind Petrov Galerkin Method for Hyperbolic Partial Differential Equations

Authors: Ameya Dilip Jagtap, S. V. Raghurama Rao

Abstract: A novel explicit and implicit Kinetic Streamlined-Upwind Petrov Galerkin (KSUPG) scheme is presented for hyperbolic equations such as Burgers equation and compressible Euler equations. The proposed scheme performs better than the original SUPG stabilized method in multi-dimensions. To demonstrate the numerical accuracy of the scheme, various numerical experiments have been carried out for 1D and 2D Burgers equation as well as for 1D and 2D Euler equations using Q4 and T3 elements. Furthermore, spectral stability analysis is done for the explicit 2D formulation. Finally, a comparison is made between explicit and implicit versions of the KSUPG scheme.

Submitted 7 May, 2015; originally announced May 2015.

Comments: 30 pages, 22 figures

arXiv:1303.1697 [pdf]

Secure Video Streaming Plug-In

Authors: Avinash Bhujbal, Ashish Jagtap, Devendra Gurav, Tino Jameskutty

Abstract: Video sharing sites like YouTube, Metacafe, Dailymotion, Vimeo, etc. provide a platform for media content sharing among their users. Some of these videos are copyright protected and restricted from being downloaded and saved. But users can use various download managers or application programs to download and save these videos. This affects the incoming traffic on these websites, reducing their hit rate and consequently reducing their revenue. Adobe Flash Player is the most commonly used player for watching online videos. It uses RTMP (Real Time Messaging Protocol) to stream audio, video and data over the Internet, between a Flash Player and Adobe Flash Media Server. Here, we propose a plug-in that gives the site owner control over the downloading of videos from such websites. The plug-in will be installed at the client side with the consent of the user. When the video is being played, this plug-in will send unique keys to the media server. The server will continue streaming the video after verifying the keys. Download managers or application programs will not be able to download the videos as they won't be able to create the unique keys that need to be sent to the server.

Submitted 7 March, 2013; originally announced March 2013.

Showing 1–22 of 22 results for author: Jagtap, A