Efficient, Multimodal, and Derivative-Free Bayesian Inference With Fisher-Rao Gradient Flows

Authors: Yifan Chen, Daniel Zhengyu Huang, Jiaoyang Huang, Sebastian Reich, Andrew M. Stuart

Abstract: In this paper, we study efficient approximate sampling for probability distributions known up to normalization constants. We specifically focus on a problem class arising in Bayesian inference for large-scale inverse problems in science and engineering applications. The computational challenges we address with the proposed methodology are: (i) the need for repeated evaluations of expensive forward models; (ii) the potential existence of multiple modes; and (iii) the fact that the gradient of, or an adjoint solver for, the forward model might not be feasible. While existing Bayesian inference methods meet some of these challenges individually, we propose a framework that tackles all three systematically. Our approach builds upon the Fisher-Rao gradient flow in probability space, yielding a dynamical system for probability densities that converges towards the target distribution at a uniform exponential rate. This rapid convergence is advantageous for the computational burden outlined in (i). We apply Gaussian mixture approximations with operator splitting techniques to simulate the flow numerically; the resulting approximation can capture multiple modes, thus addressing (ii). Furthermore, we employ the Kalman methodology to facilitate a derivative-free update of these Gaussian components and their respective weights, addressing the issue in (iii). The proposed methodology results in an efficient derivative-free sampler flexible enough to handle multi-modal distributions: Gaussian Mixture Kalman Inversion (GMKI).
The effectiveness of GMKI is demonstrated both theoretically and numerically in several experiments with multimodal target distributions, including proof-of-concept and two-dimensional examples, as well as a large-scale application: recovering the Navier-Stokes initial condition from solution data at positive times.

Submitted 25 June, 2024; originally announced June 2024.

Comments: 42 pages, 9 figures
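
As a hedged illustration of the Fisher-Rao gradient flow named in the abstract above: assuming the KL divergence as the energy functional, the flow satisfies $\partial_t \log \rho_t = \log \pi - \log \rho_t - c(t)$ and therefore has the closed form $\rho_t \propto \rho_0^{e^{-t}} \pi^{1-e^{-t}}$, which does not require the normalization constant of $\pi$. The sketch below evaluates that formula on a 1D grid. The grid, the bimodal target, and the initial density are hypothetical choices made for illustration; this is not the GMKI algorithm, which uses Gaussian mixtures and Kalman updates instead of a grid.

```python
import numpy as np

# 1D grid (illustrative assumption; the paper targets much larger problems)
x = np.linspace(-6.0, 6.0, 2001)
dx = x[1] - x[0]

def normalize(w):
    """Scale nonnegative grid values so they integrate to 1 (Riemann sum)."""
    return w / (w.sum() * dx)

# Unnormalized bimodal log-target; the "+ 3.0" is an arbitrary constant
# standing in for the unknown normalization.
log_target = np.logaddexp(-0.5 * ((x + 2.0) / 0.5) ** 2,
                          -0.5 * ((x - 2.0) / 0.5) ** 2) + 3.0
# Broad Gaussian initial density (unnormalized log-density).
log_rho0 = -0.5 * (x / 2.0) ** 2

def fisher_rao_density(t):
    """Closed-form Fisher-Rao flow of KL: geometric interpolation in log space."""
    a = np.exp(-t)
    log_rho = a * log_rho0 + (1.0 - a) * log_target
    return normalize(np.exp(log_rho - log_rho.max()))

def kl(p, q):
    """Grid approximation of KL(p || q) for strictly positive p, q."""
    return float(np.sum(p * np.log(p / q)) * dx)

target = normalize(np.exp(log_target - log_target.max()))
times = [0.0, 0.5, 1.0, 2.0, 4.0, 8.0]
kls = [kl(fisher_rao_density(t), target) for t in times]
for t, v in zip(times, kls):
    print(f"t = {t:4.1f}   KL(rho_t || target) = {v:.3e}")
```

Both modes are populated at every $t > 0$ because the interpolation acts pointwise on the density; no particle has to travel between modes.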

arXiv:2404.09730

Convergence Analysis of Probability Flow ODE for Score-based Generative Models

Authors: Daniel Zhengyu Huang, Jiaoyang Huang, Zhengjiang Lin

Abstract: Score-based generative models have emerged as a powerful approach for sampling high-dimensional probability distributions. Despite their effectiveness, their theoretical underpinnings remain relatively underdeveloped. In this work, we study the convergence properties of deterministic samplers based on probability flow ODEs from both theoretical and numerical perspectives. Assuming access to $L^2$-accurate estimates of the score function, we prove that the total variation between the target and the generated data distributions can be bounded above by $\mathcal{O}(d\sqrt{\delta})$ at the continuous-time level, where $d$ denotes the data dimension and $\delta$ represents the $L^2$-score matching error. For practical implementations using a $p$-th order Runge-Kutta integrator with step size $h$, we establish error bounds of $\mathcal{O}(d(\sqrt{\delta} + (dh)^p))$ at the discrete level. Finally, we present numerical studies on problems up to $128$ dimensions to verify our theory, which indicate a better score matching error and dimension dependence.

Submitted 15 April, 2024; originally announced April 2024.

Comments: 33 pages, 7 figures
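
A minimal sketch of a deterministic probability flow ODE sampler with a Runge-Kutta integrator, in the spirit of the abstract above. Everything specific here is an assumption made for illustration and is not taken from the paper: the forward process is the Ornstein-Uhlenbeck SDE $dx = -\tfrac{1}{2}x\,dt + dW$, the data distribution is a 1D Gaussian $N(M, S^2)$ so that the score is known exactly (that is, $\delta = 0$), and the integrator is classical RK4 ($p = 4$). In this Gaussian case the exact flow map is affine, which gives a reference solution to compare against.

```python
import numpy as np

M, S = 2.0, 0.5  # mean and std of the toy data distribution (assumed)

def mean_var(t):
    """Mean and variance of p_t under the forward process dx = -x/2 dt + dW."""
    m_t = M * np.exp(-0.5 * t)
    v_t = 1.0 + (S ** 2 - 1.0) * np.exp(-t)
    return m_t, v_t

def score(x, t):
    """Exact score of the Gaussian marginal p_t (stands in for a learned score)."""
    m_t, v_t = mean_var(t)
    return -(x - m_t) / v_t

def velocity(x, t):
    """Probability flow ODE right-hand side: dx/dt = -x/2 - score(x, t)/2."""
    return -0.5 * x - 0.5 * score(x, t)

def rk4_reverse(x_T, T, n_steps):
    """Integrate the ODE backward from time T to 0 with classical RK4."""
    h = T / n_steps
    x, t = x_T.copy(), T
    for _ in range(n_steps):
        k1 = velocity(x, t)
        k2 = velocity(x - 0.5 * h * k1, t - 0.5 * h)
        k3 = velocity(x - 0.5 * h * k2, t - 0.5 * h)
        k4 = velocity(x - h * k3, t - h)
        x = x - (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
        t -= h
    return x

rng = np.random.default_rng(0)
T = 5.0
m_T, v_T = mean_var(T)
x_T = m_T + np.sqrt(v_T) * rng.standard_normal(10_000)  # samples from p_T
x_0 = rk4_reverse(x_T, T, n_steps=200)

# Exact flow map for the Gaussian case: standardized coordinates are preserved.
exact = M + S * (x_T - m_T) / np.sqrt(v_T)
max_err = float(np.max(np.abs(x_0 - exact)))
print("max |RK4 - exact flow map| =", max_err)
print("sample mean, std =", x_0.mean(), x_0.std())
```

Halving the step size and watching `max_err` shrink is a quick way to check the integrator order empirically in this toy setting.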

arXiv:2402.06031

An operator learning perspective on parameter-to-observable maps

Authors: Daniel Zhengyu Huang, Nicholas H. Nelsen, Margaret Trautner

Abstract: Computationally efficient surrogates for parametrized physical models play a crucial role in science and engineering. Operator learning provides data-driven surrogates that map between function spaces. However, instead of full-field measurements, often the available data are only finite-dimensional parametrizations of model inputs or finite observables of model outputs. Building on Fourier Neural Operators, this paper introduces the Fourier Neural Mappings (FNMs) framework that is able to accommodate such finite-dimensional vector inputs or outputs. The paper develops universal approximation theorems for the method. Moreover, in many applications the underlying parameter-to-observable (PtO) map is defined implicitly through an infinite-dimensional operator, such as the solution operator of a partial differential equation. A natural question is whether it is more data-efficient to learn the PtO map end-to-end or first learn the solution operator and subsequently compute the observable from the full-field solution. A theoretical analysis of Bayesian nonparametric regression of linear functionals, which is of independent interest, suggests that the end-to-end approach can actually have worse sample complexity. Extending beyond the theory, numerical results for the FNM approximation of three nonlinear PtO maps demonstrate the benefits of the operator learning perspective that this paper adopts.

Submitted 6 June, 2024; v1 submitted 8 February, 2024; originally announced February 2024.

Comments: 63 pages, 10 figures, 1 table

MSC Class: 68T07; 62G20; 65J15

arXiv:2310.03597

Sampling via Gradient Flows in the Space of Probability Measures

Authors: Yifan Chen, Daniel Zhengyu Huang, Jiaoyang Huang, Sebastian Reich, Andrew M Stuart

Abstract: Sampling a target probability distribution with an unknown normalization constant is a fundamental challenge in computational science and engineering. Recent work shows that algorithms derived by considering gradient flows in the space of probability measures open up new avenues for algorithm development. This paper makes three contributions to this sampling approach by scrutinizing the design components of such gradient flows. Any instantiation of a gradient flow for sampling needs an energy functional and a metric to determine the flow, as well as numerical approximations of the flow to derive algorithms. Our first contribution is to show that the Kullback-Leibler divergence, as an energy functional, has the unique property (among all f-divergences) that gradient flows resulting from it do not depend on the normalization constant of the target distribution. Our second contribution is to study the choice of metric from the perspective of invariance. The Fisher-Rao metric is known as the unique choice (up to scaling) that is diffeomorphism invariant. As a computationally tractable alternative, we introduce a relaxed, affine invariance property for the metrics and gradient flows. In particular, we construct various affine invariant Wasserstein and Stein gradient flows. Affine invariant gradient flows are shown to behave more favorably than their non-affine-invariant counterparts when sampling highly anisotropic distributions, in theory and by using particle methods.
Our third contribution is to study, and develop efficient algorithms based on, Gaussian approximations of the gradient flows; this leads to an alternative to particle methods. We establish connections between various Gaussian approximate gradient flows, discuss their relation to gradient methods arising from parametric variational inference, and study their convergence properties both theoretically and numerically.

Submitted 9 March, 2024; v1 submitted 5 October, 2023; originally announced October 2023.

Comments: Related and text overlap with arXiv:2302.11024

arXiv:2210.08095

Bayesian Spline Learning for Equation Discovery of Nonlinear Dynamics with Quantified Uncertainty

Authors: Luning Sun, Daniel Zhengyu Huang, Hao Sun, Jian-Xun Wang

Abstract: Nonlinear dynamics are ubiquitous in science and engineering applications, but the physics of most complex systems is far from being fully understood. Discovering interpretable governing equations from measurement data can help us understand and predict the behavior of complex dynamic systems. Although extensive work has recently been done in this field, robustly distilling explicit model forms from very sparse data with considerable noise remains intractable. Moreover, quantifying and propagating the uncertainty of the identified system from noisy data is challenging, and relevant literature is still limited. To bridge this gap, we develop a novel Bayesian spline learning framework to identify parsimonious governing equations of nonlinear (spatio)temporal dynamics from sparse, noisy data with quantified uncertainty. The proposed method utilizes a spline basis to handle the data scarcity and measurement noise, upon which a group of derivatives can be accurately computed to form a library of candidate model terms. The equation residuals are used to inform the spline learning in a Bayesian manner, where approximate Bayesian uncertainty calibration techniques are employed to approximate posterior distributions of the trainable parameters. To promote sparsity, an iterative sequential-threshold Bayesian learning approach is developed, using the alternating direction optimization strategy to systematically approximate L0 sparsity constraints.
The proposed algorithm is evaluated on multiple nonlinear dynamical systems governed by canonical ordinary and partial differential equations, and the merit/superiority of the proposed method is demonstrated by comparison with state-of-the-art methods.

Submitted 14 October, 2022; originally announced October 2022.

Comments: 28 pages, 11 figures

arXiv:2207.05209

doi 10.5555/3648699.3649087

Fourier Neural Operator with Learned Deformations for PDEs on General Geometries

Authors: Zongyi Li, Daniel Zhengyu Huang, Burigede Liu, Anima Anandkumar

Abstract: Deep learning surrogate models have shown promise in solving partial differential equations (PDEs). Among them, the Fourier neural operator (FNO) achieves good accuracy, and is significantly faster compared to numerical solvers, on a variety of PDEs, such as fluid flows. However, the FNO uses the fast Fourier transform (FFT), which is limited to rectangular domains with uniform grids. In this work, we propose a new framework, viz., geo-FNO, to solve PDEs on arbitrary geometries. Geo-FNO learns to deform the input (physical) domain, which may be irregular, into a latent space with a uniform grid. The FNO model with the FFT is applied in the latent space. The resulting geo-FNO model has both the computational efficiency of the FFT and the flexibility of handling arbitrary geometries. Our geo-FNO is also flexible in terms of its input formats, viz., point clouds, meshes, and design parameters are all valid inputs. We consider a variety of PDEs such as the Elasticity, Plasticity, Euler's, and Navier-Stokes equations, and both forward modeling and inverse design problems. Geo-FNO is $10^5$ times faster than the standard numerical solvers and twice as accurate as direct interpolation on existing ML-based PDE solvers such as the standard FNO.

Submitted 2 May, 2024; v1 submitted 11 July, 2022; originally announced July 2022.

Journal ref: Journal of Machine Learning Research (2023) Volume 24, Issue 1, Article No. 388, pp 18593-18618

arXiv:2110.10210

Long Random Matrices and Tensor Unfolding

Authors: Gérard Ben Arous, Daniel Zhengyu Huang, Jiaoyang Huang

Abstract: In this paper, we consider the singular values and singular vectors of low rank perturbations of large rectangular random matrices, in the regime where the matrix is "long": we allow the number of rows (columns) to grow polynomially in the number of columns (rows). We prove that there exists a critical signal-to-noise ratio (depending on the dimensions of the matrix), and that the extreme singular values and singular vectors exhibit a BBP-type phase transition. As a main application, we investigate the tensor unfolding algorithm for the asymmetric rank-one spiked tensor model, and obtain an exact threshold, which is independent of the procedure of tensor unfolding. If the signal-to-noise ratio is above the threshold, tensor unfolding detects the signals; otherwise, it fails to capture the signals.

Submitted 19 October, 2021; originally announced October 2021.

Comments: 29 pages, 4 figures

arXiv:2007.05877

doi 10.1002/nme.6634

A Computationally Tractable Framework for Nonlinear Dynamic Multiscale Modeling of Membrane Fabric

Authors: Philip Avery, Daniel Z. Huang, Wanli He, Johanna Ehlers, Armen Derkevorkian, Charbel Farhat

Abstract: A general-purpose computational homogenization framework is proposed for the nonlinear dynamic analysis of membranes exhibiting complex microscale and/or mesoscale heterogeneity characterized by in-plane periodicity that cannot be effectively treated by a conventional method, such as woven fabrics. The framework is a generalization of the "finite element squared" (or FE$^2$) method in which a localized portion of the periodic subscale structure is modeled using finite elements. The numerical solution of displacement driven problems involving this model can be adapted to the context of membranes by a variant of the Klinkel-Govindjee method [1] originally proposed for using finite strain, three-dimensional material models in beam and shell elements. This approach relies on numerical enforcement of the plane stress constraint and is enabled by the principle of frame invariance. Computational tractability is achieved by introducing a regression-based surrogate model informed by a physics-inspired training regimen in which FE$^2$ is utilized to simulate a variety of numerical experiments including uniaxial, biaxial and shear straining of a material coupon. Several alternative surrogate models are evaluated including an artificial neural network. The framework is demonstrated and validated for a realistic Mars landing application involving supersonic inflation of a parachute canopy made of woven fabric.

Submitted 27 January, 2021; v1 submitted 11 July, 2020; originally announced July 2020.

Comments: 29 pages, 12 figures

Journal ref: International Journal for Numerical Methods in Engineering, 2021

arXiv:1912.01658

Modeling, simulation and validation of supersonic parachute inflation dynamics during Mars landing

Authors: Daniel Z. Huang, Philip Avery, Charbel Farhat, Jason Rabinovitch, Armen Derkevorkian, Lee D Peterson

Abstract: A high fidelity multi-physics Eulerian computational framework is presented for the simulation of supersonic parachute inflation during Mars landing. Unlike previous investigations in this area, the framework takes into account an initial folding pattern of the parachute, the flow compressibility effect on the fabric material porosity, and the interactions between supersonic fluid flows and the suspension lines. Several adaptive mesh refinement (AMR)-enabled, large eddy simulation (LES)-based simulations of a full-size disk-gap-band (DGB) parachute inflating in the low-density, low-pressure, carbon dioxide (CO2) Martian atmosphere are reported. The comparison of the drag histories and the first peak forces between the simulation results and experimental data collected during the NASA Curiosity Rover's Mars atmospheric entry shows reasonable agreement. Furthermore, a rudimentary material failure analysis is performed to provide an estimate of the safety factor for the parachute decelerator system. The proposed framework demonstrates the potential of using Computational Fluid Dynamics (CFD) and Fluid-Structure Interaction (FSI)-based simulation tools for future supersonic parachute design.

Submitted 3 December, 2019; originally announced December 2019.

Comments: 24 pages, 12 figures

arXiv:1908.08382

An Embedded Boundary Approach for Resolving the Contribution of Cable Subsystems to Fully Coupled Fluid-Structure Interaction

Authors: Daniel Z. Huang, Philip Avery, Charbel Farhat

Abstract: Cable subsystems characterized by long, slender, and flexible structural elements are featured in numerous engineering systems. In each of them, interaction between an individual cable and the surrounding fluid is inevitable. Such a Fluid-Structure Interaction (FSI) has received little attention in the literature, possibly due to the inherent complexity associated with fluid and structural semi-discretizations of disparate spatial dimensions. This paper proposes an embedded boundary approach for filling this gap, where the dynamics of the cable are captured by a standard finite element representation $\mathcal C$ of its centerline, while its geometry is represented by a discrete surface $\Sigma_h$ that is embedded in the fluid mesh. The proposed approach is built on master-slave kinematics between $\mathcal C$ and $\Sigma_h$, a simple algorithm for computing the motion/deformation of $\Sigma_h$ based on the dynamic state of $\mathcal C$, and an energy-conserving method for transferring to $\mathcal C$ the loads computed on $\Sigma_h$. Its effectiveness is demonstrated for two highly nonlinear applications featuring large deformations and/or motions of a cable subsystem and turbulent flows: an aerial refueling model problem, and a challenging supersonic parachute inflation problem. The proposed approach is verified using numerical data, and validated using real flight data.

Submitted 6 November, 2019; v1 submitted 11 August, 2019; originally announced August 2019.

Comments: 23 pages, 11 figures

Showing 1–10 of 10 results for author: Huang, D Z