
Discretization Error of Fourier Neural Operators

Authors: Samuel Lanthaler, Andrew M. Stuart, Margaret Trautner

Abstract: Operator learning is a variant of machine learning that is designed to approximate maps between function spaces from data. The Fourier Neural Operator (FNO) is a common model architecture used for operator learning. The FNO combines pointwise linear and nonlinear operations in physical space with pointwise linear operations in Fourier space, leading to a parameterized map acting between function spaces. Although FNOs formally involve convolutions of functions on a continuum, in practice the computations are performed on a discretized grid, allowing efficient implementation via the FFT. In this paper, the aliasing error that results from such a discretization is quantified and algebraic rates of convergence in terms of the grid resolution are obtained as a function of the regularity of the input. Numerical experiments that validate the theory and describe model stability are performed.

Submitted 3 May, 2024; originally announced May 2024.

MSC Class: 41A35 (Primary) 65T50; 68T07 (Secondary)

arXiv:2402.06031

An operator learning perspective on parameter-to-observable maps

Authors: Daniel Zhengyu Huang, Nicholas H. Nelsen, Margaret Trautner

Abstract: Computationally efficient surrogates for parametrized physical models play a crucial role in science and engineering. Operator learning provides data-driven surrogates that map between function spaces. However, instead of full-field measurements, often the available data are only finite-dimensional parametrizations of model inputs or finite observables of model outputs. Building on Fourier Neural Operators, this paper introduces the Fourier Neural Mappings (FNMs) framework that is able to accommodate such finite-dimensional vector inputs or outputs. The paper develops universal approximation theorems for the method. Moreover, in many applications the underlying parameter-to-observable (PtO) map is defined implicitly through an infinite-dimensional operator, such as the solution operator of a partial differential equation. A natural question is whether it is more data-efficient to learn the PtO map end-to-end or first learn the solution operator and subsequently compute the observable from the full-field solution. A theoretical analysis of Bayesian nonparametric regression of linear functionals, which is of independent interest, suggests that the end-to-end approach can actually have worse sample complexity. Extending beyond the theory, numerical results for the FNM approximation of three nonlinear PtO maps demonstrate the benefits of the operator learning perspective that this paper adopts.

Submitted 6 June, 2024; v1 submitted 8 February, 2024; originally announced February 2024.

Comments: 63 pages, 10 figures, 1 table

MSC Class: 68T07; 62G20; 65J15

arXiv:2306.12006

Learning Homogenization for Elliptic Operators

Authors: Kaushik Bhattacharya, Nikola Kovachki, Aakila Rajan, Andrew M. Stuart, Margaret Trautner

Abstract: Multiscale partial differential equations (PDEs) arise in various applications, and several schemes have been developed to solve them efficiently. Homogenization theory is a powerful methodology that eliminates the small-scale dependence, resulting in simplified equations that are computationally tractable while accurately predicting the macroscopic response. In the field of continuum mechanics, homogenization is crucial for deriving constitutive laws that incorporate microscale physics in order to formulate balance laws for the macroscopic quantities of interest. However, obtaining homogenized constitutive laws is often challenging as they do not in general have an analytic form and can exhibit phenomena not present on the microscale. In response, data-driven learning of the constitutive law has been proposed as appropriate for this task. However, a major challenge in data-driven learning approaches for this problem has remained unexplored: the impact of discontinuities and corner interfaces in the underlying material. These discontinuities in the coefficients affect the smoothness of the solutions of the underlying equations. Given the prevalence of discontinuous materials in continuum mechanics applications, it is important to address the challenge of learning in this context; in particular, to develop underpinning theory that establishes the reliability of data-driven methods in this scientific domain.
The paper addresses this unexplored challenge by investigating the learnability of homogenized constitutive laws for elliptic operators in the presence of such complexities. Approximation theory is presented, and numerical experiments are performed which validate the theory in the context of learning the solution operator defined by the cell problem arising in homogenization for elliptic PDEs.

Submitted 4 January, 2024; v1 submitted 21 June, 2023; originally announced June 2023.

MSC Class: 35B27; 35J47; 74H15

arXiv:2210.17443

DOI: 10.1016/j.jmps.2023.105329

Learning macroscopic internal variables and history dependence from microscopic models

Authors: Burigede Liu, Eric Ocegueda, Margaret Trautner, Andrew M. Stuart, Kaushik Bhattacharya

Abstract: This paper concerns the study of history dependent phenomena in heterogeneous materials in a two-scale setting where the material is specified at a fine microscopic scale of heterogeneities that is much smaller than the coarse macroscopic scale of application. We specifically study a polycrystalline medium where each grain is governed by crystal plasticity while the solid is subjected to macroscopic dynamic loads. The theory of homogenization allows us to solve the macroscale problem directly with a constitutive relation that is defined implicitly by the solution of the microscale problem. However, the homogenization leads to a highly complex history dependence at the macroscale, one that can be quite different from that at the microscale. In this paper, we examine the use of machine-learning, and especially deep neural networks, to harness data generated by repeatedly solving the finer scale model to: (i) gain insights into the history dependence and the macroscopic internal variables that govern the overall response; and (ii) to create a computationally efficient surrogate of its solution operator, that can directly be used at the coarser scale with no further modeling. We do so by introducing a recurrent neural operator (RNO), and show that: (i) the architecture and the learned internal variables can provide insight into the physics of the macroscopic problem; and (ii) that the RNO can provide multiscale, specifically FE2, accuracy at a cost comparable to a conventional empirical constitutive relation.

Submitted 30 April, 2023; v1 submitted 31 October, 2022; originally announced October 2022.

arXiv:2205.14139

Learning Markovian Homogenized Models in Viscoelasticity

Authors: Kaushik Bhattacharya, Burigede Liu, Andrew M. Stuart, Margaret Trautner

Abstract: Fully resolving dynamics of materials with rapidly-varying features involves expensive fine-scale computations which need to be conducted on macroscopic scales. The theory of homogenization provides an approach to derive effective macroscopic equations which eliminates the small scales by exploiting scale separation. An accurate homogenized model avoids the computationally-expensive task of numerically solving the underlying balance laws at a fine scale, thereby rendering a numerical solution of the balance laws more computationally tractable. In complex settings, homogenization only defines the constitutive model implicitly, and machine learning can be used to learn the constitutive model explicitly from localized fine-scale simulations. In the case of one-dimensional viscoelasticity, the linearity of the model allows for a complete analysis. We establish that the homogenized constitutive model may be approximated by a recurrent neural network (RNN) that captures the memory. The memory is encapsulated in the evolution of an appropriate finite set of internal variables, discovered through the learning process and dependent on the history of the strain. Simulations are presented which validate the theory. Guidance for the learning of more complex models, such as arise in plasticity, by similar techniques, is given.

Submitted 4 June, 2022; v1 submitted 27 May, 2022; originally announced May 2022.

arXiv:2106.11409

Learn Like The Pro: Norms from Theory to Size Neural Computation

Authors: Margaret Trautner, Ziwei Li, Sai Ravela

Abstract: The optimal design of neural networks is a critical problem in many applications. Here, we investigate how dynamical systems with polynomial nonlinearities can inform the design of neural systems that seek to emulate them. We propose a Learnability metric and its associated features to quantify the near-equilibrium behavior of learning dynamics. Equating the Learnability of neural systems with equivalent parameter estimation metric of the reference system establishes bounds on network structure. In this way, norms from theory provide a good first guess for neural structure, which may then further adapt with data. The proposed approach neither requires training nor training data. It reveals exact sizing for a class of neural networks with multiplicative nodes that mimic continuous- or discrete-time polynomial dynamics. It also provides relatively tight lower size bounds for classical feed-forward networks that is consistent with simulated assessments.

Submitted 21 June, 2021; originally announced June 2021.

Comments: 7 pages

MSC Class: 68T07

ACM Class: I.2.6; G.1.7

arXiv:2008.09915

Informative Neural Ensemble Kalman Learning

Authors: Margaret Trautner, Gabriel Margolis, Sai Ravela

Abstract: In stochastic systems, informative approaches select key measurement or decision variables that maximize information gain to enhance the efficacy of model-related inferences. Neural Learning also embodies stochastic dynamics, but informative Learning is less developed. Here, we propose Informative Ensemble Kalman Learning, which replaces backpropagation with an adaptive Ensemble Kalman Filter to quantify uncertainty and enables maximizing information gain during Learning. After demonstrating Ensemble Kalman Learning's competitive performance on standard datasets, we apply the informative approach to neural structure learning. In particular, we show that when trained from the Lorenz-63 system's simulations, the efficaciously learned structure recovers the dynamical equations. To the best of our knowledge, Informative Ensemble Kalman Learning is new. Results suggest that this approach to optimized Learning is promising.

Submitted 22 August, 2020; originally announced August 2020.

Comments: ten pages; accepted for presentation in DDDAS-2020

arXiv:1911.10309

Neural Integration of Continuous Dynamics

Authors: Margaret Trautner, Sai Ravela

Abstract: Neural dynamical systems are dynamical systems that are described at least in part by neural networks. The class of continuous-time neural dynamical systems must, however, be numerically integrated for simulation and learning. Here, we present a compact neural circuit for two common numerical integrators: the explicit fixed-step Runge-Kutta method of any order and the semi-implicit/predictor-corrector Adams-Bashforth-Moulton method. Modeled as constant-sized recurrent networks embedding a continuous neural differential equation, they achieve fully neural temporal output. Using the polynomial class of dynamical systems, we demonstrate the equivalence of neural and numerical integration.

Submitted 22 November, 2019; originally announced November 2019.

MSC Class: 65D30

Showing 1–8 of 8 results for author: Trautner, M