Search | arXiv e-print repository

Sequential-in-time training of nonlinear parametrizations for solving time-dependent partial differential equations

Authors: Huan Zhang, Yifan Chen, Eric Vanden-Eijnden, Benjamin Peherstorfer

Abstract: Sequential-in-time methods solve a sequence of training problems to fit nonlinear parametrizations such as neural networks to approximate solution trajectories of partial differential equations over time. This work shows that sequential-in-time training methods can be understood broadly as either optimize-then-discretize (OtD) or discretize-then-optimize (DtO) schemes, which are well known concepts in numerical analysis. The unifying perspective leads to novel stability and a posteriori error analysis results that provide insights into theoretical and numerical aspects that are inherent to either OtD or DtO schemes such as the tangent space collapse phenomenon, which is a form of over-fitting. Additionally, the unified perspective facilitates establishing connections between variants of sequential-in-time training methods, which is demonstrated by identifying natural gradient descent methods on energy functionals as OtD schemes applied to the corresponding gradient flows.

Submitted 1 April, 2024; originally announced April 2024.

arXiv:2403.06732 [pdf, ps, other]

Greedy construction of quadratic manifolds for nonlinear dimensionality reduction and nonlinear model reduction

Authors: Paul Schwerdtner, Benjamin Peherstorfer

Abstract: Dimensionality reduction on quadratic manifolds augments linear approximations with quadratic correction terms. Previous works rely on linear approximations given by projections onto the first few leading principal components of the training data; however, linear approximations in subspaces spanned by the leading principal components alone can miss information that is necessary for the quadratic correction terms to be efficient. In this work, we propose a greedy method that constructs subspaces from leading as well as later principal components so that the corresponding linear approximations can be corrected most efficiently with quadratic terms. Properties of the greedily constructed manifolds allow applying linear algebra reformulations so that the greedy method scales to data points with millions of dimensions. Numerical experiments demonstrate that orders of magnitude higher accuracy is achieved with the greedily constructed quadratic manifolds compared to manifolds that are based on the leading principal components alone.

Submitted 11 March, 2024; originally announced March 2024.

Comments: 25 pages, 14 figures

MSC Class: 65F55; 62H25; 65F30; 68T09
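
To illustrate the quadratic-manifold idea described in the abstract above, here is a minimal sketch of the baseline construction, in which the linear part uses the leading principal components and a quadratic correction is fitted to the residual by regularized least squares. It does not implement the greedy selection the paper proposes. The function names, the ridge parameter `reg`, and the synthetic data are assumptions made for this illustration only.

```python
import numpy as np

def quadratic_features(Z):
    """Unique products z_i * z_j (i <= j) of the reduced coordinates Z (r x n)."""
    i, j = np.triu_indices(Z.shape[0])
    return Z[i] * Z[j]                      # shape (r(r+1)/2, n)

def fit_quadratic_manifold(S, r, reg=1e-8):
    """Fit x ~ mean + V z + W h(z) to the snapshot matrix S (N x n).

    V holds the r leading principal directions (baseline choice, not greedy);
    W is a ridge-regularized least-squares fit to the linear residual.
    """
    mean = S.mean(axis=1, keepdims=True)
    U, _, _ = np.linalg.svd(S - mean, full_matrices=False)
    V = U[:, :r]
    Z = V.T @ (S - mean)                    # reduced coordinates
    R = (S - mean) - V @ Z                  # what the linear part misses
    H = quadratic_features(Z)
    A = H @ H.T + reg * np.eye(H.shape[0])
    W = np.linalg.solve(A, H @ R.T).T       # shape (N, r(r+1)/2)
    return mean, V, W

# Synthetic data (assumption): points on a curved one-dimensional set in 3D.
t = np.linspace(-1.0, 1.0, 200)
S = np.vstack([t, t**2, 0.3 * t])

mean, V, W = fit_quadratic_manifold(S, r=1)
Z = V.T @ (S - mean)
linear_rec = mean + V @ Z
quad_rec = linear_rec + W @ quadratic_features(Z)

err_lin = np.linalg.norm(S - linear_rec)
err_quad = np.linalg.norm(S - quad_rec)
print(err_lin, err_quad)
```

On this toy data the quadratic correction recovers part of the curvature that the one-dimensional linear approximation cannot represent, so `err_quad` is smaller than `err_lin`.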

arXiv:2402.14646 [pdf, other]

CoLoRA: Continuous low-rank adaptation for reduced implicit neural modeling of parameterized partial differential equations

Authors: Jules Berman, Benjamin Peherstorfer

Abstract: This work introduces reduced models based on Continuous Low Rank Adaptation (CoLoRA) that pre-train neural networks for a given partial differential equation and then continuously adapt low-rank weights in time to rapidly predict the evolution of solution fields at new physics parameters and new initial conditions. The adaptation can be either purely data-driven or via an equation-driven variational approach that provides Galerkin-optimal approximations. Because CoLoRA approximates solution fields locally in time, the rank of the weights can be kept small, which means that only few training trajectories are required offline so that CoLoRA is well suited for data-scarce regimes. Predictions with CoLoRA are orders of magnitude faster than with classical methods and their accuracy and parameter efficiency are higher compared to other neural network approaches.

Submitted 22 February, 2024; originally announced February 2024.

arXiv:2310.07485 [pdf, other]

Nonlinear embeddings for conserving Hamiltonians and other quantities with Neural Galerkin schemes

Authors: Paul Schwerdtner, Philipp Schulze, Jules Berman, Benjamin Peherstorfer

Abstract: This work focuses on the conservation of quantities such as Hamiltonians, mass, and momentum when solution fields of partial differential equations are approximated with nonlinear parametrizations such as deep networks. The proposed approach builds on Neural Galerkin schemes that are based on the Dirac-Frenkel variational principle to train nonlinear parametrizations sequentially in time. We first show that only adding constraints that aim to conserve quantities in continuous time can be insufficient because the nonlinear dependence on the parameters implies that even quantities that are linear in the solution fields become nonlinear in the parameters and thus are challenging to discretize in time. Instead, we propose Neural Galerkin schemes that compute at each time step an explicit embedding onto the manifold of nonlinearly parametrized solution fields to guarantee conservation of quantities. The embeddings can be combined with standard explicit and implicit time integration schemes. Numerical experiments demonstrate that the proposed approach conserves quantities up to machine precision.

Submitted 11 October, 2023; originally announced October 2023.

Comments: 29 pages, 8 figures

MSC Class: 65M22; 65P10; 68T07; 70H33

arXiv:2310.04867 [pdf, other]

Randomized Sparse Neural Galerkin Schemes for Solving Evolution Equations with Deep Networks

Authors: Jules Berman, Benjamin Peherstorfer

Abstract: Training neural networks sequentially in time to approximate solution fields of time-dependent partial differential equations can be beneficial for preserving causality and other physics properties; however, the sequential-in-time training is numerically challenging because training errors quickly accumulate and amplify over time. This work introduces Neural Galerkin schemes that update randomized sparse subsets of network parameters at each time step. The randomization avoids overfitting locally in time and so helps prevent the error from accumulating quickly over the sequential-in-time training, which is motivated by dropout that addresses a similar issue of overfitting due to neuron co-adaptation. The sparsity of the update reduces the computational costs of training without losing expressiveness because many of the network parameters are redundant locally at each time step. In numerical experiments with a wide range of evolution equations, the proposed scheme with randomized sparse updates is up to two orders of magnitude more accurate at a fixed computational budget and up to two orders of magnitude faster at a fixed accuracy than schemes with dense updates.

Submitted 7 October, 2023; originally announced October 2023.

arXiv:2307.14874 [pdf, ps, other]

Lookahead data-gathering strategies for online adaptive model reduction of transport-dominated problems

Authors: Rodrigo Singh, Wayne Isaac Tan Uy, Benjamin Peherstorfer

Abstract: Online adaptive model reduction efficiently reduces numerical models of transport-dominated problems by updating reduced spaces over time, which leads to nonlinear approximations on latent manifolds that can achieve a faster error decay than classical linear model reduction methods that keep reduced spaces fixed. Critical for online adaptive model reduction is coupling the full and reduced model to judiciously gather data from the full model for adapting the reduced spaces so that accurate approximations of the evolving full-model solution fields can be maintained. In this work, we introduce lookahead data-gathering strategies that predict the next state of the full model for adapting reduced spaces towards dynamics that are likely to be seen in the immediate future. Numerical experiments demonstrate that the proposed lookahead strategies lead to accurate reduced models even for problems where previously introduced data-gathering strategies that look back in time fail to provide predictive models. The proposed lookahead strategies also improve the robustness and stability of online adaptive reduced models.

Submitted 27 July, 2023; originally announced July 2023.

arXiv:2307.12438 [pdf, other]

Multifidelity Covariance Estimation via Regression on the Manifold of Symmetric Positive Definite Matrices

Authors: Aimee Maurais, Terrence Alsup, Benjamin Peherstorfer, Youssef Marzouk

Abstract: We introduce a multifidelity estimator of covariance matrices formulated as the solution to a regression problem on the manifold of symmetric positive definite matrices. The estimator is positive definite by construction, and the Mahalanobis distance minimized to obtain it possesses properties which enable practical computation. We show that our manifold regression multifidelity (MRMF) covariance estimator is a maximum likelihood estimator under a certain error model on manifold tangent space. More broadly, we show that our Riemannian regression framework encompasses existing multifidelity covariance estimators constructed from control variates. We demonstrate via numerical examples that our estimator can provide significant decreases, up to one order of magnitude, in squared estimation error relative to both single-fidelity and other multifidelity covariance estimators. Furthermore, preservation of positive definiteness ensures that our estimator is compatible with downstream tasks, such as data assimilation and metric learning, in which this property is essential.

Submitted 24 July, 2023; v1 submitted 23 July, 2023; originally announced July 2023.

Comments: 30 pages + 15-page supplement

arXiv:2306.15630 [pdf, ps, other]

Coupling parameter and particle dynamics for adaptive sampling in Neural Galerkin schemes

Authors: Yuxiao Wen, Eric Vanden-Eijnden, Benjamin Peherstorfer

Abstract: Training nonlinear parametrizations such as deep neural networks to numerically approximate solutions of partial differential equations is often based on minimizing a loss that includes the residual, which is analytically available in limited settings only. At the same time, empirically estimating the training loss is challenging because residuals and related quantities can have high variance, especially for transport-dominated and high-dimensional problems that exhibit local features such as waves and coherent structures. Thus, estimators based on data samples from un-informed, uniform distributions are inefficient. This work introduces Neural Galerkin schemes that estimate the training loss with data from adaptive distributions, which are empirically represented via ensembles of particles. The ensembles are actively adapted by evolving the particles with dynamics coupled to the nonlinear parametrizations of the solution fields so that the ensembles remain informative for estimating the training loss. Numerical experiments indicate that few dynamic particles are sufficient for obtaining accurate empirical estimates of the training loss, even for problems with local features and with high-dimensional spatial domains.

Submitted 27 June, 2023; originally announced June 2023.

arXiv:2302.09521 [pdf, other]

Rank-Minimizing and Structured Model Inference

Authors: Pawan Goyal, Benjamin Peherstorfer, Peter Benner

Abstract: While extracting information from data with machine learning plays an increasingly important role, physical laws and other first principles continue to provide critical insights about systems and processes of interest in science and engineering. This work introduces a method that infers models from data with physical insights encoded in the form of structure and that minimizes the model order so that the training data are fitted well while redundant degrees of freedom without conditions and sufficient data to fix them are automatically eliminated. The models are formulated via solution matrices of specific instances of generalized Sylvester equations that enforce interpolation of the training data and relate the model order to the rank of the solution matrices. The proposed method numerically solves the Sylvester equations for minimal-rank solutions and so obtains models of low order. Numerical experiments demonstrate that the combination of structure preservation and rank minimization leads to accurate models with orders of magnitude fewer degrees of freedom than models of comparable prediction quality that are learned with structure preservation alone.

Submitted 19 February, 2023; originally announced February 2023.

arXiv:2301.13749 [pdf, ps, other]

Multi-Fidelity Covariance Estimation in the Log-Euclidean Geometry

Authors: Aimee Maurais, Terrence Alsup, Benjamin Peherstorfer, Youssef Marzouk

Abstract: We introduce a multi-fidelity estimator of covariance matrices that employs the log-Euclidean geometry of the symmetric positive-definite manifold. The estimator fuses samples from a hierarchy of data sources of differing fidelities and costs for variance reduction while guaranteeing definiteness, in contrast with previous approaches. The new estimator makes covariance estimation tractable in applications where simulation or data collection is expensive; to that end, we develop an optimal sample allocation scheme that minimizes the mean-squared error of the estimator given a fixed budget. Guaranteed definiteness is crucial to metric learning, data assimilation, and other downstream tasks. Evaluations of our approach using data from physical applications (heat conduction, fluid dynamics) demonstrate more accurate metric learning and speedups of more than one order of magnitude compared to benchmarks.

Submitted 26 May, 2023; v1 submitted 31 January, 2023; originally announced January 2023.

Comments: To appear at the International Conference on Machine Learning (ICML) 2023
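
The abstract above explains that working in the log-Euclidean geometry guarantees definiteness. A minimal two-fidelity sketch of that idea follows: a control-variate-style correction is applied to matrix logarithms of sample covariances, and the result is mapped back with the matrix exponential, which always yields a symmetric positive definite matrix. The exact form of the estimator, the weight `alpha`, the function names, and the synthetic data are assumptions for illustration and may differ from the paper's construction.

```python
import numpy as np

def spd_log(C):
    """Matrix logarithm of a symmetric positive definite matrix."""
    w, Q = np.linalg.eigh(C)
    return (Q * np.log(w)) @ Q.T

def sym_exp(A):
    """Matrix exponential of a symmetric matrix (result is SPD)."""
    w, Q = np.linalg.eigh(A)
    return (Q * np.exp(w)) @ Q.T

def log_euclidean_mf_cov(x_hi, x_lo_paired, x_lo_extra, alpha=1.0):
    """Two-fidelity covariance estimate formed in the log domain.

    x_hi, x_lo_paired: (n, d) samples from the same n inputs;
    x_lo_extra: (m, d) additional cheap low-fidelity samples.
    Requires n > d so that the sample covariances are positive definite.
    """
    C_hi = np.cov(x_hi, rowvar=False)
    C_lo_n = np.cov(x_lo_paired, rowvar=False)
    C_lo_all = np.cov(np.vstack([x_lo_paired, x_lo_extra]), rowvar=False)
    L = spd_log(C_hi) + alpha * (spd_log(C_lo_all) - spd_log(C_lo_n))
    return sym_exp(0.5 * (L + L.T))

# Synthetic data (assumption): the low-fidelity model is a perturbed copy
# of the high-fidelity one.
rng = np.random.default_rng(0)
A = np.array([[1.0, 0.5, 0.0], [0.0, 1.0, 0.3], [0.0, 0.0, 0.8]])
z_paired = rng.standard_normal((20, 3))
z_extra = rng.standard_normal((500, 3))
x_hi = z_paired @ A.T
x_lo_paired = x_hi + 0.1 * rng.standard_normal((20, 3))
x_lo_extra = z_extra @ A.T + 0.1 * rng.standard_normal((500, 3))

C_mf = log_euclidean_mf_cov(x_hi, x_lo_paired, x_lo_extra)
print(np.linalg.eigvalsh(C_mf))
```

A linear control-variate correction applied directly to the covariance matrices can lose definiteness when the correction term is large; forming the combination in the log domain avoids that by construction.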

arXiv:2301.07280 [pdf, ps, other]

Meta variance reduction for Monte Carlo estimation of energetic particle confinement during stellarator optimization

Authors: Frederick Law, Antoine Cerfon, Benjamin Peherstorfer, Florian Wechsung

Abstract: This work introduces meta estimators that combine multiple multifidelity techniques based on control variates, importance sampling, and information reuse to yield a quasi-multiplicative amount of variance reduction. The proposed meta estimators are particularly efficient within outer-loop applications when the input distribution of the uncertainties changes during the outer loop, which is often the case in reliability-based design and shape optimization. We derive asymptotic bounds of the variance reduction of the meta estimators in the limit of convergence of the outer-loop results. We demonstrate the meta estimators, using data-driven surrogate models and biasing densities, on a design problem under uncertainty motivated by magnetic confinement fusion, namely the optimization of stellarator coil designs to maximize the estimated confinement of energetic particles. The meta estimators outperform all of their constituent variance reduction techniques alone, ultimately yielding two orders of magnitude speedup compared to standard Monte Carlo estimation at the same computational budget.

Submitted 17 January, 2023; originally announced January 2023.

arXiv:2212.03366 [pdf, ps, other]

Further analysis of multilevel Stein variational gradient descent with an application to the Bayesian inference of glacier ice models

Authors: Terrence Alsup, Tucker Hartland, Benjamin Peherstorfer, Noemi Petra

Abstract: Multilevel Stein variational gradient descent is a method for particle-based variational inference that leverages hierarchies of surrogate target distributions with varying costs and fidelity to computationally speed up inference. The contribution of this work is twofold. First, an extension of a previous cost complexity analysis is presented that applies even when the exponential convergence rate of single-level Stein variational gradient descent depends on iteration-varying parameters. Second, multilevel Stein variational gradient descent is applied to a large-scale Bayesian inverse problem of inferring discretized basal sliding coefficient fields of the Arolla glacier ice. The numerical experiments demonstrate that the multilevel version achieves orders of magnitude speedups compared to its single-level version.

Submitted 29 April, 2023; v1 submitted 6 December, 2022; originally announced December 2022.
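
As a rough illustration of the multilevel idea in the abstract above, the sketch below runs standard Stein variational gradient descent with a fixed-bandwidth Gaussian kernel on a short sequence of target distributions, passing the particles from each cheaper surrogate target on to the next level. The one-dimensional Gaussian targets, the level means, the step size, the bandwidth `h`, and the iteration counts are all assumptions chosen for the example; they are not taken from the paper.

```python
import numpy as np

def svgd_step(x, grad_log_p, step=0.1, h=1.0):
    """One SVGD update of the particles x (n x d) with an RBF kernel."""
    diff = x[:, None, :] - x[None, :, :]               # diff[j, i] = x_j - x_i
    K = np.exp(-np.sum(diff**2, axis=-1) / (2 * h**2)) # K[j, i] = k(x_j, x_i)
    grad_K = -diff / h**2 * K[:, :, None]              # d k(x_j, x_i) / d x_j
    # Attraction toward high density plus repulsion between particles.
    phi = (K.T @ grad_log_p(x) + grad_K.sum(axis=0)) / x.shape[0]
    return x + step * phi

def gaussian_score(mean):
    """Gradient of the log-density of N(mean, 1)."""
    return lambda x: -(x - mean)

# Hypothetical hierarchy: surrogate targets whose means approach the
# high-fidelity target N(2, 1).
levels = [gaussian_score(1.5), gaussian_score(1.9), gaussian_score(2.0)]
iters_per_level = [300, 300, 300]

rng = np.random.default_rng(0)
particles = rng.standard_normal((50, 1))
for score, n_iter in zip(levels, iters_per_level):
    for _ in range(n_iter):
        particles = svgd_step(particles, score)

print(particles.mean(), particles.std())
```

In a realistic setting the early levels would use cheap surrogate models, so most iterations are spent where each gradient evaluation is inexpensive, and only the final refinement touches the expensive target.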

arXiv:2212.01418 [pdf, ps, other]

Operator inference with roll outs for learning reduced models from scarce and low-quality data

Authors: Wayne Isaac Tan Uy, Dirk Hartmann, Benjamin Peherstorfer

Abstract: Data-driven modeling has become a key building block in computational science and engineering. However, data that are available in science and engineering are typically scarce, often polluted with noise and affected by measurement errors and other perturbations, which makes learning the dynamics of systems challenging. In this work, we propose to combine data-driven modeling via operator inference with the dynamic training via roll outs of neural ordinary differential equations. Operator inference with roll outs inherits interpretability, scalability, and structure preservation of traditional operator inference while leveraging the dynamic training via roll outs over multiple time steps to increase stability and robustness for learning from low-quality and noisy data. Numerical experiments with data describing shallow water waves and surface quasi-geostrophic dynamics demonstrate that operator inference with roll outs provides predictive models from training trajectories even if data are sampled sparsely in time and polluted with noise of up to 10%.

Submitted 2 December, 2022; originally announced December 2022.

arXiv:2211.10835 [pdf, other]

Context-aware learning of hierarchies of low-fidelity models for multi-fidelity uncertainty quantification

Authors: Ionut-Gabriel Farcas, Benjamin Peherstorfer, Tobias Neckel, Frank Jenko, Hans-Joachim Bungartz

Abstract: Multi-fidelity Monte Carlo methods leverage low-fidelity and surrogate models for variance reduction to make tractable uncertainty quantification even when numerically simulating the physical systems of interest with high-fidelity models is computationally expensive. This work proposes a context-aware multi-fidelity Monte Carlo method that optimally balances the costs of training low-fidelity models with the costs of Monte Carlo sampling. It generalizes the previously developed context-aware bi-fidelity Monte Carlo method to hierarchies of multiple models and to more general types of low-fidelity models. When training low-fidelity models, the proposed approach takes into account the context in which the learned low-fidelity models will be used, namely for variance reduction in Monte Carlo estimation, which allows it to find optimal trade-offs between training and sampling to minimize upper bounds of the mean-squared errors of the estimators for given computational budgets. This is in stark contrast to traditional surrogate modeling and model reduction techniques that construct low-fidelity models with the primary goal of approximating well the high-fidelity model outputs and typically ignore the context in which the learned models will be used in upstream tasks. The proposed context-aware multi-fidelity Monte Carlo method applies to hierarchies of a wide range of types of low-fidelity models such as sparse-grid and deep-network models. Numerical experiments with the gyrokinetic simulation code GENE show speedups of up to two orders of magnitude compared to standard estimators when quantifying uncertainties in small-scale fluctuations in confined plasma in fusion reactors. This corresponds to a runtime reduction from 72 days to about four hours on one node of the Lonestar6 supercomputer at the Texas Advanced Computing Center.

Submitted 19 November, 2022; originally announced November 2022.

Comments: 25 pages, 12 figures, 3 tables

arXiv:2209.06957 [pdf, ps, other]

Reduced models with nonlinear approximations of latent dynamics for model premixed flame problems

Authors: Wayne Isaac Tan Uy, Christopher R. Wentland, Cheng Huang, Benjamin Peherstorfer

Abstract: Efficiently reducing models of chemically reacting flows is often challenging because their characteristic features such as sharp gradients in the flow fields and couplings over various time and length scales lead to dynamics that evolve in high-dimensional spaces. In this work, we show that online adaptive reduced models that construct nonlinear approximations by adapting low-dimensional subspaces over time can predict well latent dynamics with properties similar to those found in chemically reacting flows. The adaptation of the subspaces is driven by the online adaptive empirical interpolation method, which takes sparse residual evaluations of the full model to compute low-rank basis updates of the subspaces. Numerical experiments with a premixed flame model problem show that reduced models based on online adaptive empirical interpolation accurately predict flame dynamics far outside of the training regime and in regimes where traditional static reduced models, which keep reduced spaces fixed over time and so provide only linear approximations of latent dynamics, fail to make meaningful predictions.

Submitted 14 September, 2022; originally announced September 2022.

arXiv:2207.11049 [pdf, other]

doi 10.1098/rspa.2022.0506

Context-aware controller inference for stabilizing dynamical systems from scarce data

Authors: Steffen W. R. Werner, Benjamin Peherstorfer

Abstract: This work introduces a data-driven control approach for stabilizing high-dimensional dynamical systems from scarce data. The proposed context-aware controller inference approach is based on the observation that controllers need to act locally only on the unstable dynamics to stabilize systems. This means it is sufficient to learn the unstable dynamics alone, which are typically confined to much lower dimensional spaces than the high-dimensional state spaces of all system dynamics and thus few data samples are sufficient to identify them. Numerical experiments demonstrate that context-aware controller inference learns stabilizing controllers from orders of magnitude fewer data samples than traditional data-driven control techniques and variants of reinforcement learning. The experiments further show that the low data requirements of context-aware controller inference are especially beneficial in data-scarce engineering problems with complex physics, for which learning complete system dynamics is often intractable in terms of data and training costs.

Submitted 18 January, 2023; v1 submitted 22 July, 2022; originally announced July 2022.

Comments: 27 pages, 10 figures

Journal ref: Proc. R. Soc. A: Math. Phys. Eng. Sci., 479(2270):20220506, 2023

arXiv:2205.15050 [pdf, other]

doi 10.1137/22M1500137

Multi-fidelity robust controller design with gradient sampling

Authors: Steffen W. R. Werner, Michael L. Overton, Benjamin Peherstorfer

Abstract: Robust controllers that stabilize dynamical systems even under disturbances and noise are often formulated as solutions of nonsmooth, nonconvex optimization problems. While methods such as gradient sampling can handle the nonconvexity and nonsmoothness, the costs of evaluating the objective function may be substantial, making robust control challenging for dynamical systems with high-dimensional state spaces. In this work, we introduce multi-fidelity variants of gradient sampling that leverage low-cost, low-fidelity models with low-dimensional state spaces for speeding up the optimization process while nonetheless providing convergence guarantees for a high-fidelity model of the system of interest, which is primarily accessed in the last phase of the optimization process. Our first multi-fidelity method initiates gradient sampling on higher fidelity models with starting points obtained from cheaper, lower fidelity models. Our second multi-fidelity method relies on ensembles of gradients that are computed from low- and high-fidelity models. Numerical experiments with controlling the cooling of a steel rail profile and laminar flow in a cylinder wake demonstrate that our new multi-fidelity gradient sampling methods achieve up to two orders of magnitude speedup compared to the single-fidelity gradient sampling method that relies on the high-fidelity model alone.

Submitted 5 December, 2022; v1 submitted 30 May, 2022; originally announced May 2022.

Comments: 28 pages, 4 figures

MSC Class: 37N35; 37N40; 65K10; 90C30; 90C59

Journal ref: SIAM J. Sci. Comput., 45(2):A933-A957, 2023

arXiv:2203.01360 [pdf, other]

doi 10.1016/j.jcp.2023.112588

Neural Galerkin Schemes with Active Learning for High-Dimensional Evolution Equations

Authors: Joan Bruna, Benjamin Peherstorfer, Eric Vanden-Eijnden

Abstract: Deep neural networks have been shown to provide accurate function approximations in high dimensions. However, fitting network parameters requires informative training data that are often challenging to collect in science and engineering applications. This work proposes Neural Galerkin schemes based on deep learning that generate training data with active learning for numerically solving high-dimensional partial differential equations. Neural Galerkin schemes build on the Dirac-Frenkel variational principle to train networks by minimizing the residual sequentially over time, which enables adaptively collecting new training data in a self-informed manner that is guided by the dynamics described by the partial differential equations. This is in contrast to other machine learning methods that aim to fit network parameters globally in time without taking into account training data acquisition. Our finding is that the active form of gathering training data of the proposed Neural Galerkin schemes is key for numerically realizing the expressive power of networks in high dimensions. Numerical experiments demonstrate that Neural Galerkin schemes have the potential to enable simulating phenomena and processes with many variables for which traditional and other deep-learning-based solvers fail, especially when features of the solutions evolve locally such as in high-dimensional wave propagation problems and interacting particle systems described by Fokker-Planck and kinetic equations.

Submitted 29 February, 2024; v1 submitted 2 March, 2022; originally announced March 2022.

Journal ref: Journal of Computational Physics, Volume 496, 2024

arXiv:2203.00474 [pdf, other]

doi 10.1007/s10208-023-09605-y

On the sample complexity of stabilizing linear dynamical systems from data

Authors: Steffen W. R. Werner, Benjamin Peherstorfer

Abstract: Learning controllers from data for stabilizing dynamical systems typically follows a two step process of first identifying a model and then constructing a controller based on the identified model. However, learning models means identifying generic descriptions of the dynamics of systems, which can require large amounts of data and extracting information that are unnecessary for the specific task of stabilization. The contribution of this work is to show that if a linear dynamical system has dimension (McMillan degree) $n$, then there always exist $n$ states from which a stabilizing feedback controller can be constructed, independent of the dimension of the representation of the observed states and the number of inputs. By building on previous work, this finding implies that any linear dynamical system can be stabilized from fewer observed states than the minimal number of states required for learning a model of the dynamics. The theoretical findings are demonstrated with numerical experiments that show the stabilization of the flow behind a cylinder from less data than necessary for learning a model.

Submitted 22 July, 2022; v1 submitted 28 February, 2022; originally announced March 2022.

Comments: 29 pages, 4 figures

MSC Class: 65F55; 65P99; 93B52; 93C57; 93D15

Journal ref: Found. Comput. Math., 24(3):955-987, 2024

arXiv:2108.06408 [pdf, ps, other]

doi 10.1088/1741-4326/ac4777

Accelerating the estimation of energetic particle confinement statistics in stellarators using multifidelity Monte Carlo

Authors: Frederick Law, Antoine Cerfon, Benjamin Peherstorfer

Abstract: In the design of stellarators, energetic particle confinement is a critical point of concern which remains challenging to study from a numerical point of view. Standard Monte Carlo analyses are highly expensive because a large number of particle trajectories need to be integrated over long time scales, and small time steps must be taken to accurately capture the features of the wide variety of trajectories. Even when they are based on guiding center trajectories, as opposed to full-orbit trajectories, these standard Monte Carlo studies are too expensive to be included in most stellarator optimization codes. We present the first multifidelity Monte Carlo scheme for accelerating the estimation of energetic particle confinement in stellarators. Our approach relies on a two-level hierarchy, in which a guiding center model serves as the high-fidelity model, and a data-driven linear interpolant is leveraged as the low-fidelity surrogate model. We apply multifidelity Monte Carlo to the study of energetic particle confinement in a 4-period quasi-helically symmetric stellarator, assessing various metrics of confinement. Stemming from the very high computational efficiency of our surrogate model as well as its sufficient correlation to the high-fidelity model, we obtain speedups of up to 10 with multifidelity Monte Carlo compared to standard Monte Carlo.

Submitted 13 August, 2021; originally announced August 2021.

Comments: 18 pages
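The two-level idea in the abstract above can be illustrated with a generic control-variate estimator. This is only a minimal sketch under stated assumptions, not the scheme from the paper: `f_hi` and `f_lo` are hypothetical toy stand-ins for the guiding center model and the interpolant surrogate, the function name `mfmc_estimate` is made up for illustration, and the weight `alpha` is set from a small pilot sample using the usual control-variate choice (correlation times the ratio of standard deviations).

```python
import numpy as np

def mfmc_estimate(f_hi, f_lo, z, n_hi, alpha):
    """Two-level control-variate estimate of E[f_hi(Z)].

    The expensive model is evaluated on the first n_hi samples only;
    the cheap model is evaluated on all samples (nested sample sets).
    """
    z = np.asarray(z)
    y_hi = f_hi(z[:n_hi])   # few expensive evaluations
    y_lo = f_lo(z)          # many cheap evaluations
    # High-fidelity mean plus a correction from the low-fidelity model.
    return y_hi.mean() + alpha * (y_lo.mean() - y_lo[:n_hi].mean())

rng = np.random.default_rng(0)

# Hypothetical toy models (assumptions, not from the paper).
f_hi = lambda z: np.exp(z)                  # "expensive" model
f_lo = lambda z: 1.0 + z + 0.5 * z**2       # correlated cheap surrogate

# Pilot run to estimate the control-variate weight.
z_pilot = rng.standard_normal(100)
y_hi_p, y_lo_p = f_hi(z_pilot), f_lo(z_pilot)
rho = np.corrcoef(y_hi_p, y_lo_p)[0, 1]
alpha = rho * y_hi_p.std(ddof=1) / y_lo_p.std(ddof=1)

z = rng.standard_normal(10_000)
estimate = mfmc_estimate(f_hi, f_lo, z, n_hi=50, alpha=alpha)
print("multifidelity estimate:", estimate)
```

The closer the surrogate correlates with the expensive model, the more of the variance the correction term removes, which is why the abstract stresses the surrogate's "sufficient correlation" alongside its low cost.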

arXiv:2107.09256 [pdf, ps, other]

Active operator inference for learning low-dimensional dynamical-system models from noisy data

Authors: Wayne Isaac Tan Uy, Yuepeng Wang, Yuxiao Wen, Benjamin Peherstorfer

Abstract: Noise poses a challenge for learning dynamical-system models because already small variations can distort the dynamics described by trajectory data. This work builds on operator inference from scientific machine learning to infer low-dimensional models from high-dimensional state trajectories polluted with noise. The presented analysis shows that, under certain conditions, the inferred operators are unbiased estimators of the well-studied projection-based reduced operators from traditional model reduction. Furthermore, the connection between operator inference and projection-based model reduction enables bounding the mean-squared errors of predictions made with the learned models with respect to traditional reduced models. The analysis also motivates an active operator inference approach that judiciously samples high-dimensional trajectories with the aim of achieving a low mean-squared error by reducing the effect of noise. Numerical experiments with high-dimensional linear and nonlinear state dynamics demonstrate that predictions obtained with active operator inference have orders of magnitude lower mean-squared errors than operator inference with traditional, equidistantly sampled trajectory data.

Submitted 25 July, 2021; v1 submitted 20 July, 2021; originally announced July 2021.

arXiv:2107.02597 [pdf, ps, other]

Physics-informed regularization and structure preservation for learning stable reduced models from data with operator inference

Authors: Nihar Sawant, Boris Kramer, Benjamin Peherstorfer

Abstract: Operator inference learns low-dimensional dynamical-system models with polynomial nonlinear terms from trajectories of high-dimensional physical systems (non-intrusive model reduction). This work focuses on the large class of physical systems that can be well described by models with quadratic nonlinear terms and proposes a regularizer for operator inference that induces a stability bias onto quadratic models. The proposed regularizer is physics informed in the sense that it penalizes quadratic terms with large norms and so explicitly leverages the quadratic model form that is given by the underlying physics. This means that the proposed approach judiciously learns from data and physical insights combined, rather than from either data or physics alone. Additionally, a formulation of operator inference is proposed that enforces model constraints for preserving structure such as symmetry and definiteness in the linear terms. Numerical results demonstrate that models learned with operator inference and the proposed regularizer and structure preservation are accurate and stable even in cases where using no regularization or Tikhonov regularization leads to models that are unstable.

Submitted 6 July, 2021; originally announced July 2021.

arXiv:2104.01945 [pdf, other]

Multilevel Stein variational gradient descent with applications to Bayesian inverse problems

Authors: Terrence Alsup, Luca Venturi, Benjamin Peherstorfer

Abstract: This work presents a multilevel variant of Stein variational gradient descent to more efficiently sample from target distributions. The key ingredient is a sequence of distributions with growing fidelity and costs that converges to the target distribution of interest. For example, such a sequence of distributions is given by a hierarchy of ever finer discretization levels of the forward model in Bayesian inverse problems. The proposed multilevel Stein variational gradient descent moves most of the iterations to lower, cheaper levels with the aim of requiring only a few iterations on the higher, more expensive levels when compared to the traditional, single-level Stein variational gradient descent variant that uses the highest-level distribution only. Under certain assumptions, in the mean-field limit, the error of the proposed multilevel Stein method decays by a log factor faster than the error of the single-level counterpart with respect to computational costs. Numerical experiments with Bayesian inverse problems show speedups of more than one order of magnitude of the proposed multilevel Stein method compared to the single-level variant that uses the highest level only.

Submitted 5 April, 2021; originally announced April 2021.

MSC Class: 65C05; 35R60; 62F15; 65C35

arXiv:2103.01362 [pdf, ps, other]

Operator inference of non-Markovian terms for learning reduced models from partially observed state trajectories

Authors: Wayne Isaac Tan Uy, Benjamin Peherstorfer

Abstract: This work introduces a non-intrusive model reduction approach for learning reduced models from partially observed state trajectories of high-dimensional dynamical systems. The proposed approach compensates for the loss of information due to the partially observed states by constructing non-Markovian reduced models that make future-state predictions based on a history of reduced states, in contrast to traditional Markovian reduced models that rely on the current reduced state alone to predict the next state. The core contributions of this work are a data sampling scheme to sample partially observed states from high-dimensional dynamical systems and a formulation of a regression problem to fit the non-Markovian reduced terms to the sampled states. Under certain conditions, the proposed approach recovers from data the very same non-Markovian terms that one obtains with intrusive methods that require the governing equations and discrete operators of the high-dimensional dynamical system. Numerical results demonstrate that the proposed approach leads to non-Markovian reduced models that are predictive far beyond the training regime. Additionally, in the numerical experiments, the proposed approach learns non-Markovian reduced models from trajectories with only 20% observed state components that are about as accurate as traditional Markovian reduced models fitted to trajectories with 99% observed components.

Submitted 26 March, 2021; v1 submitted 1 March, 2021; originally announced March 2021.

arXiv:2010.11708 [pdf, ps, other]

Context-aware surrogate modeling for balancing approximation and sampling costs in multi-fidelity importance sampling and Bayesian inverse problems

Authors: Terrence Alsup, Benjamin Peherstorfer

Abstract: Multi-fidelity methods leverage low-cost surrogate models to speed up computations and make occasional recourse to expensive high-fidelity models to establish accuracy guarantees. Because surrogate and high-fidelity models are used together, poor predictions by surrogate models can be compensated with frequent recourse to high-fidelity models. Thus, there is a trade-off between investing computational resources to improve the accuracy of surrogate models versus simply making more frequent recourse to expensive high-fidelity models; however, this trade-off is ignored by traditional modeling methods that construct surrogate models that are meant to replace high-fidelity models rather than being used together with high-fidelity models. This work considers multi-fidelity importance sampling and theoretically and computationally trades off increasing the fidelity of surrogate models for constructing more accurate biasing densities and the numbers of samples that are required from the high-fidelity models to compensate poor biasing densities. Numerical examples demonstrate that such context-aware surrogate models for multi-fidelity importance sampling have lower fidelity than what typically is set as tolerance in traditional model reduction, leading to runtime speedups of up to one order of magnitude in the presented examples.

Submitted 12 September, 2021; v1 submitted 22 October, 2020; originally announced October 2020.

MSC Class: 65C60; 65C05; 35R60; 62F15

arXiv:2007.13977 [pdf, other]

Depth separation for reduced deep networks in nonlinear model reduction: Distilling shock waves in nonlinear hyperbolic problems

Authors: Donsub Rim, Luca Venturi, Joan Bruna, Benjamin Peherstorfer

Abstract: Classical reduced models are low-rank approximations using a fixed basis designed to achieve dimensionality reduction of large-scale systems. In this work, we introduce reduced deep networks, a generalization of classical reduced models formulated as deep neural networks. We prove depth separation results showing that reduced deep networks approximate solutions of parametrized hyperbolic partial differential equations with approximation error $ε$ with $\mathcal{O}(|\log(ε)|)$ degrees of freedom, even in the nonlinear setting where solutions exhibit shock waves. We also show that classical reduced models achieve exponentially worse approximation rates by establishing lower bounds on the relevant Kolmogorov $N$-widths.

Submitted 27 July, 2020; originally announced July 2020.

MSC Class: 68T07; 65M22; 41A46

arXiv:2005.05890 [pdf, ps, other]

Probabilistic error estimation for non-intrusive reduced models learned from data of systems governed by linear parabolic partial differential equations

Authors: Wayne Isaac Tan Uy, Benjamin Peherstorfer

Abstract: This work derives a residual-based a posteriori error estimator for reduced models learned with non-intrusive model reduction from data of high-dimensional systems governed by linear parabolic partial differential equations with control inputs. It is shown that quantities that are necessary for the error estimator can be either obtained exactly as the solutions of least-squares problems in a non-intrusive way from data such as initial conditions, control inputs, and high-dimensional solution trajectories or bounded in a probabilistic sense. The computational procedure follows an offline/online decomposition. In the offline (training) phase, the high-dimensional system is judiciously solved in a black-box fashion to generate data and to set up the error estimator. In the online phase, the estimator is used to bound the error of the reduced-model predictions for new initial conditions and new control inputs without recourse to the high-dimensional system. Numerical results demonstrate the workflow of the proposed approach from data to reduced models to certified predictions.

Submitted 12 May, 2020; originally announced May 2020.

arXiv:2002.09726 [pdf, other]

doi 10.1016/j.cma.2020.113433

Operator inference for non-intrusive model reduction of systems with non-polynomial nonlinear terms

Authors: Peter Benner, Pawan Goyal, Boris Kramer, Benjamin Peherstorfer, Karen Willcox

Abstract: This work presents a non-intrusive model reduction method to learn low-dimensional models of dynamical systems with non-polynomial nonlinear terms that are spatially local and that are given in analytic form. In contrast to state-of-the-art model reduction methods that are intrusive and thus require full knowledge of the governing equations and the operators of a full model of the discretized dynamical system, the proposed approach requires only the non-polynomial terms in analytic form and learns the rest of the dynamics from snapshots computed with a potentially black-box full-model solver. The proposed method learns operators for the linear and polynomially nonlinear dynamics via a least-squares problem, where the given non-polynomial terms are incorporated in the right-hand side. The least-squares problem is linear and thus can be solved efficiently in practice. The proposed method is demonstrated on three problems governed by partial differential equations, namely the diffusion-reaction Chafee-Infante model, a tubular reactor model for reactive flows, and a batch-chromatography model that describes a chemical separation process. The numerical results provide evidence that the proposed approach learns reduced models that achieve comparable accuracy as models constructed with state-of-the-art intrusive model reduction methods that require full knowledge of the governing equations.

Submitted 19 September, 2020; v1 submitted 22 February, 2020; originally announced February 2020.

arXiv:1912.13024 [pdf, other]

Manifold Approximations via Transported Subspaces: Model reduction for transport-dominated problems

Authors: Donsub Rim, Benjamin Peherstorfer, Kyle T. Mandli

Abstract: This work presents a method for constructing online-efficient reduced models of large-scale systems governed by parametrized nonlinear scalar conservation laws. The solution manifolds induced by transport-dominated problems such as hyperbolic conservation laws typically exhibit nonlinear structures, which means that traditional model reduction methods based on linear approximations are inefficient when applied to these problems. In contrast, the approach introduced in this work derives reduced approximations that are nonlinear by explicitly composing global transport dynamics with locally linear approximations of the solution manifolds. A time-stepping scheme evolves the nonlinear reduced models by transporting local approximation spaces along the characteristic curves of the governing equations. The proposed computational procedure allows an offline/online decomposition and is online-efficient in the sense that the complexity of accurately time-stepping the nonlinear reduced model is independent of that of the full model. Numerical experiments with transport through heterogeneous media and the Burgers' equation show orders of magnitude speedups of the proposed nonlinear reduced models based on transported subspaces compared to traditional linear reduced models and full models.

Submitted 30 December, 2020; v1 submitted 30 December, 2019; originally announced December 2019.

MSC Class: 78M34; 41A46; 35F20; 78M12

arXiv:1912.08177 [pdf, other]

doi 10.1016/j.physd.2020.132401

Lift & Learn: Physics-informed machine learning for large-scale nonlinear dynamical systems

Authors: Elizabeth Qian, Boris Kramer, Benjamin Peherstorfer, Karen Willcox

Abstract: We present Lift & Learn, a physics-informed method for learning low-dimensional models for large-scale dynamical systems. The method exploits knowledge of a system's governing equations to identify a coordinate transformation in which the system dynamics have quadratic structure. This transformation is called a lifting map because it often adds auxiliary variables to the system state. The lifting map is applied to data obtained by evaluating a model for the original nonlinear system. This lifted data is projected onto its leading principal components, and low-dimensional linear and quadratic matrix operators are fit to the lifted reduced data using a least-squares operator inference procedure. Analysis of our method shows that the Lift & Learn models are able to capture the system physics in the lifted coordinates at least as accurately as traditional intrusive model reduction approaches. This preservation of system physics makes the Lift & Learn models robust to changes in inputs. Numerical experiments on the FitzHugh-Nagumo neuron activation model and the compressible Euler equations demonstrate the generalizability of our model.

Submitted 26 March, 2020; v1 submitted 17 December, 2019; originally announced December 2019.

Journal ref: Physica D: Nonlinear Phenomena, Volume 406, p. 132401, 2020

arXiv:1910.00110 [pdf, ps, other]

Learning low-dimensional dynamical-system models from noisy frequency-response data with Loewner rational interpolation

Authors: Zlatko Drmač, Benjamin Peherstorfer

Abstract: Loewner rational interpolation provides a versatile tool to learn low-dimensional dynamical-system models from frequency-response measurements. This work investigates the robustness of the Loewner approach to noise. The key finding is that if the measurements are polluted with Gaussian noise, then the error due to noise grows at most linearly with the standard deviation with high probability under certain conditions. The analysis gives insights into making the Loewner approach robust against noise via linear transformations and judicious selections of measurements. Numerical results demonstrate the linear growth of the error on benchmark examples.

Submitted 4 November, 2020; v1 submitted 30 September, 2019; originally announced October 2019.

arXiv:1908.11233 [pdf, ps, other]

Sampling low-dimensional Markovian dynamics for pre-asymptotically recovering reduced models from data with operator inference

Authors: Benjamin Peherstorfer

Abstract: This work introduces a method for learning low-dimensional models from data of high-dimensional black-box dynamical systems. The novelty is that the learned models are exactly the reduced models that are traditionally constructed with model reduction techniques that require full knowledge of governing equations and operators of the high-dimensional systems. Thus, the learned models are guaranteed to inherit the well-studied properties of reduced models from traditional model reduction. The key ingredient is a new data sampling scheme to obtain re-projected trajectories of high-dimensional systems that correspond to Markovian dynamics in low-dimensional subspaces. The exact recovery of reduced models from these re-projected trajectories is guaranteed pre-asymptotically under certain conditions for finite amounts of data and for a large class of systems with polynomial nonlinear terms. Numerical results demonstrate that the low-dimensional models learned with the proposed approach match reduced models from traditional model reduction up to numerical errors in practice. The numerical results further indicate that low-dimensional models fitted to re-projected trajectories are predictive even in situations where models fitted to trajectories without re-projection are inaccurate and unstable.

Submitted 29 August, 2019; originally announced August 2019.

MSC Class: 65M60; 68T10; 65P99; 65Y99; 65F99
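The re-projection idea described in the abstract above can be sketched for the simplest case of a linear discrete-time system. This is an illustrative toy under stated assumptions, not the paper's implementation: the full model is queried only through a black-box `step` function, the basis `V` is a random orthonormal matrix rather than a POD basis, and the helper names `reproject_trajectory` and `infer_linear_operator` are hypothetical. After every full-model step the state is projected back onto the subspace, so the reduced data follow Markovian dynamics and a plain least-squares fit recovers the projected operator `V.T @ A @ V`.

```python
import numpy as np

def reproject_trajectory(step, V, x0, num_steps):
    """Sample a reduced trajectory by querying `step` at re-projected states."""
    xr = V.T @ x0                      # reduced initial condition
    traj = [xr]
    for _ in range(num_steps):
        xr = V.T @ step(V @ xr)        # lift, one full-model step, project
        traj.append(xr)
    return np.stack(traj, axis=1)      # shape (r, num_steps + 1)

def infer_linear_operator(trajs):
    """Fit A_hat minimizing ||A_hat @ X0 - X1||_F over all trajectories."""
    X0 = np.hstack([T[:, :-1] for T in trajs])
    X1 = np.hstack([T[:, 1:] for T in trajs])
    A_hat_T, *_ = np.linalg.lstsq(X0.T, X1.T, rcond=None)
    return A_hat_T.T

rng = np.random.default_rng(0)
n, r = 50, 4

# Hypothetical full model: stable linear map x_{k+1} = A x_k.
A = 0.9 * np.linalg.qr(rng.standard_normal((n, n)))[0]
step = lambda x: A @ x                 # treated as a black box below

# Orthonormal basis of an r-dimensional subspace (random here, for illustration).
V = np.linalg.qr(rng.standard_normal((n, r)))[0]

trajs = [reproject_trajectory(step, V, rng.standard_normal(n), 10)
         for _ in range(3)]
A_hat = infer_linear_operator(trajs)

A_intrusive = V.T @ A @ V              # what intrusive projection would build
print("max deviation:", np.abs(A_hat - A_intrusive).max())
```

Without the re-projection step, the projected states of a full trajectory generally do not satisfy a closed reduced recursion, which is the non-Markovian effect the abstract refers to.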

arXiv:1905.02679 [pdf, other]

doi 10.1016/j.jcp.2019.04.071

Multifidelity probability estimation via fusion of estimators

Authors: Boris Kramer, Alexandre Noll Marques, Benjamin Peherstorfer, Umberto Villa, Karen Willcox

Abstract: This paper develops a multifidelity method that enables estimation of failure probabilities for expensive-to-evaluate models via information fusion and importance sampling. The presented general fusion method combines multiple probability estimators with the goal of variance reduction. We use low-fidelity models to derive biasing densities for importance sampling and then fuse the importance sampling estimators such that the fused multifidelity estimator is unbiased and has mean-squared error lower than or equal to that of any of the importance sampling estimators alone. By fusing all available estimators, the method circumvents the challenging problem of selecting the best biasing density and using only that density for sampling. A rigorous analysis shows that the fused estimator is optimal in the sense that it has minimal variance amongst all possible combinations of the estimators. The asymptotic behavior of the proposed method is demonstrated on a convection-diffusion-reaction partial differential equation model for which $10^5$ samples can be afforded. To illustrate the proposed method at scale, we consider a model of a free plane jet and quantify how uncertainties at the flow inlet propagate to a quantity of interest related to turbulent mixing. Compared to an importance sampling estimator that uses the high-fidelity model alone, our multifidelity estimator reduces the required CPU time by 65\% while achieving a similar coefficient of variation.

Submitted 7 May, 2019; originally announced May 2019.

Journal ref: Journal of Computational Physics 392, 385-402, 2019

arXiv:1812.02094 [pdf, ps, other]

Model reduction for transport-dominated problems via online adaptive bases and adaptive sampling

Authors: Benjamin Peherstorfer

Abstract: This work presents a model reduction approach for problems with coherent structures that propagate over time such as convection-dominated flows and wave-type phenomena. Traditional model reduction methods have difficulties with these transport-dominated problems because propagating coherent structures typically introduce high-dimensional features that require high-dimensional approximation spaces. The approach proposed in this work exploits the locality in space and time of propagating coherent structures to derive efficient reduced models. Full-model solutions are approximated locally in time via local reduced spaces that are adapted with basis updates during time stepping. The basis updates are derived from querying the full model at a few selected spatial coordinates. A core contribution of this work is an adaptive sampling scheme for selecting at which components to query the full model to compute basis updates. The presented analysis shows that, in probability, the more local the coherent structure is in space, the fewer full-model samples are required to adapt the reduced basis with the proposed adaptive sampling scheme. Numerical results on benchmark examples with interacting wave-type structures and time-varying transport speeds and on a model combustor of a single-element rocket engine demonstrate the wide applicability of the proposed approach and runtime speedups of up to one order of magnitude compared to full models and traditional reduced models.

Submitted 14 June, 2020; v1 submitted 5 December, 2018; originally announced December 2018.

MSC Class: 65M22; 65N22; 65F99; 49M15

arXiv:1808.10473 [pdf, ps, other]

Stability of discrete empirical interpolation and gappy proper orthogonal decomposition with randomized and deterministic sampling points

Authors: Benjamin Peherstorfer, Zlatko Drmač, Serkan Gugercin

Abstract: This work investigates the stability of (discrete) empirical interpolation for nonlinear model reduction and state field approximation from measurements. Empirical interpolation derives approximations from a few samples (measurements) via interpolation in low-dimensional spaces. It has been observed that empirical interpolation can become unstable if the samples are perturbed due to, e.g., noise, turbulence, and numerical inaccuracies. The main contribution of this work is a probabilistic analysis that shows that stable approximations are obtained if samples are randomized and if more samples than dimensions of the low-dimensional spaces are used. Oversampling, i.e., taking more sampling points than dimensions of the low-dimensional spaces, leads to approximations via regression and is known under the name of gappy proper orthogonal decomposition. Building on the insights of the probabilistic analysis, a deterministic sampling strategy is presented that aims to achieve lower approximation errors with fewer points than randomized sampling by taking information about the low-dimensional spaces into account. Numerical results of reconstructing velocity fields from noisy measurements of combustion processes and model reduction in the presence of noise demonstrate the instability of empirical interpolation and the stability of gappy proper orthogonal decomposition with oversampling.

Submitted 19 May, 2020; v1 submitted 30 August, 2018; originally announced August 2018.

arXiv:1808.09379 [pdf, ps, other]

A transport-based multifidelity preconditioner for Markov chain Monte Carlo

Authors: Benjamin Peherstorfer, Youssef Marzouk

Abstract: Markov chain Monte Carlo (MCMC) sampling of posterior distributions arising in Bayesian inverse problems is challenging when evaluations of the forward model are computationally expensive. Replacing the forward model with a low-cost, low-fidelity model often significantly reduces computational cost; however, employing a low-fidelity model alone means that the stationary distribution of the MCMC chain is the posterior distribution corresponding to the low-fidelity model, rather than the original posterior distribution corresponding to the high-fidelity model. We propose a multifidelity approach that combines, rather than replaces, the high-fidelity model with a low-fidelity model. First, the low-fidelity model is used to construct a transport map that deterministically couples a reference Gaussian distribution with an approximation of the low-fidelity posterior. Then, the high-fidelity posterior distribution is explored using a non-Gaussian proposal distribution derived from the transport map. This multifidelity "preconditioned" MCMC approach seeks efficient sampling via a proposal that is explicitly tailored to the posterior at hand and that is constructed efficiently with the low-fidelity model. By relying on the low-fidelity model only to construct the proposal distribution, our approach guarantees that the stationary distribution of the MCMC chain is the high-fidelity posterior. In our numerical examples, our multifidelity approach achieves significant speedups compared to single-fidelity MCMC sampling methods.

Submitted 28 August, 2018; originally announced August 2018.

arXiv:1806.10761 [pdf, other]

Survey of multifidelity methods in uncertainty propagation, inference, and optimization

Authors: Benjamin Peherstorfer, Karen Willcox, Max Gunzburger

Abstract: In many situations across computational science and engineering, multiple computational models are available that describe a system of interest. These different models have varying evaluation costs and varying fidelities. Typically, a computationally expensive high-fidelity model describes the system with the accuracy required by the current application at hand, while lower-fidelity models are less accurate but computationally cheaper than the high-fidelity model. Outer-loop applications, such as optimization, inference, and uncertainty quantification, require multiple model evaluations at many different inputs, which often leads to computational demands that exceed available resources if only the high-fidelity model is used. This work surveys multifidelity methods that accelerate the solution of outer-loop applications by combining high-fidelity and low-fidelity model evaluations, where the low-fidelity evaluations arise from an explicit low-fidelity model (e.g., a simplified physics approximation, a reduced model, a data-fit surrogate, etc.) that approximates the same output quantity as the high-fidelity model. The overall premise of these multifidelity methods is that low-fidelity models are leveraged for speedup while the high-fidelity model is kept in the loop to establish accuracy and/or convergence guarantees. We categorize multifidelity methods according to three classes of strategies: adaptation, fusion, and filtering. The paper reviews multifidelity methods in the outer-loop contexts of uncertainty propagation, inference, and optimization.

Submitted 28 June, 2018; originally announced June 2018.

Comments: will appear in SIAM Review

MSC Class: 65-02; 62-02; 49-02

Showing 1–37 of 37 results for author: Peherstorfer, B