Search | arXiv e-print repository

Efficient Prior Calibration From Indirect Data

Authors: O. Deniz Akyildiz, Mark Girolami, Andrew M. Stuart, Arnaud Vadeboncoeur

Abstract: Bayesian inversion is central to the quantification of uncertainty within problems arising from numerous applications in science and engineering. To formulate the approach, four ingredients are required: a forward model map** the unknown parameter to an element of a solution space, often the solution space for a differential equation; an observation operator map** an element of the solution sp… ▽ More Bayesian inversion is central to the quantification of uncertainty within problems arising from numerous applications in science and engineering. To formulate the approach, four ingredients are required: a forward model map** the unknown parameter to an element of a solution space, often the solution space for a differential equation; an observation operator map** an element of the solution space to the data space; a noise model describing how noise pollutes the observations; and a prior model describing knowledge about the unknown parameter before the data is acquired. This paper is concerned with learning the prior model from data; in particular, learning the prior from multiple realizations of indirect data obtained through the noisy observation process. The prior is represented, using a generative model, as the pushforward of a Gaussian in a latent space; the pushforward map is learned by minimizing an appropriate loss function. A metric that is well-defined under empirical approximation is used to define the loss function for the pushforward map to make an implementable methodology. Furthermore, an efficient residual-based neural operator approximation of the forward model is proposed and it is shown that this may be learned concurrently with the pushforward map, using a bilevel optimization formulation of the problem; this use of neural operator approximation has the potential to make prior learning from indirect data more computationally efficient, especially when the observation process is expensive, non-smooth or not known. The ideas are illustrated with the Darcy flow inverse problem of finding permeability from piezometric head measurements. △ Less

Submitted 28 May, 2024; originally announced May 2024.

arXiv:2311.13584 [pdf, other]

On diffusion-based generative models and their error bounds: The log-concave case with full convergence estimates

Authors: Stefano Bruno, Ying Zhang, Dong-Young Lim, Ömer Deniz Akyildiz, Sotirios Sabanis

Abstract: We provide full theoretical guarantees for the convergence behaviour of diffusion-based generative models under the assumption of strongly log-concave data distributions while our approximating class of functions used for score estimation is made of Lipschitz continuous functions. We demonstrate via a motivating example, sampling from a Gaussian distribution with unknown mean, the powerfulness of… ▽ More We provide full theoretical guarantees for the convergence behaviour of diffusion-based generative models under the assumption of strongly log-concave data distributions while our approximating class of functions used for score estimation is made of Lipschitz continuous functions. We demonstrate via a motivating example, sampling from a Gaussian distribution with unknown mean, the powerfulness of our approach. In this case, explicit estimates are provided for the associated optimization problem, i.e. score approximation, while these are combined with the corresponding sampling estimates. As a result, we obtain the best known upper bound estimates in terms of key quantities of interest, such as the dimension and rates of convergence, for the Wasserstein-2 distance between the data distribution (Gaussian with unknown mean) and our sampling algorithm. Beyond the motivating example and in order to allow for the use of a diverse range of stochastic optimizers, we present our results using an $L^2$-accurate score estimation assumption, which crucially is formed under an expectation with respect to the stochastic optimizer and our novel auxiliary process that uses only known information. This approach yields the best known convergence rate for our sampling algorithm. △ Less

Submitted 22 April, 2024; v1 submitted 22 November, 2023; originally announced November 2023.

arXiv:2301.11040 [pdf, other]

Random Grid Neural Processes for Parametric Partial Differential Equations

Authors: Arnaud Vadeboncoeur, Ieva Kazlauskaite, Yanni Papandreou, Fehmi Cirak, Mark Girolami, Ömer Deniz Akyildiz

Abstract: We introduce a new class of spatially stochastic physics and data informed deep latent models for parametric partial differential equations (PDEs) which operate through scalable variational neural processes. We achieve this by assigning probability measures to the spatial domain, which allows us to treat collocation grids probabilistically as random variables to be marginalised out. Adapting this… ▽ More We introduce a new class of spatially stochastic physics and data informed deep latent models for parametric partial differential equations (PDEs) which operate through scalable variational neural processes. We achieve this by assigning probability measures to the spatial domain, which allows us to treat collocation grids probabilistically as random variables to be marginalised out. Adapting this spatial statistics view, we solve forward and inverse problems for parametric PDEs in a way that leads to the construction of Gaussian process models of solution fields. The implementation of these random grids poses a unique set of challenges for inverse physics informed deep learning frameworks and we propose a new architecture called Grid Invariant Convolutional Networks (GICNets) to overcome these challenges. We further show how to incorporate noisy data in a principled manner into our physics informed model to improve predictions for problems where data may be available but whose measurement location does not coincide with any fixed mesh or grid. The proposed method is tested on a nonlinear Poisson problem, Burgers equation, and Navier-Stokes equations, and we provide extensive numerical comparisons. We demonstrate significant computational advantages over current physics informed neural learning methods for parametric PDEs while improving the predictive capabilities and flexibility of these models. △ Less

Submitted 7 June, 2023; v1 submitted 26 January, 2023; originally announced January 2023.

arXiv:2209.15609 [pdf, other]

$Φ$-DVAE: Physics-Informed Dynamical Variational Autoencoders for Unstructured Data Assimilation

Authors: Alex Glyn-Davies, Connor Duffin, Ö. Deniz Akyildiz, Mark Girolami

Abstract: Incorporating unstructured data into physical models is a challenging problem that is emerging in data assimilation. Traditional approaches focus on well-defined observation operators whose functional forms are typically assumed to be known. This prevents these methods from achieving a consistent model-data synthesis in configurations where the map** from data-space to model-space is unknown. To… ▽ More Incorporating unstructured data into physical models is a challenging problem that is emerging in data assimilation. Traditional approaches focus on well-defined observation operators whose functional forms are typically assumed to be known. This prevents these methods from achieving a consistent model-data synthesis in configurations where the map** from data-space to model-space is unknown. To address these shortcomings, in this paper we develop a physics-informed dynamical variational autoencoder ($Φ$-DVAE) to embed diverse data streams into time-evolving physical systems described by differential equations. Our approach combines a standard, possibly nonlinear, filter for the latent state-space model and a VAE, to assimilate the unstructured data into the latent dynamical system. Unstructured data, in our example systems, comes in the form of video data and velocity field measurements, however the methodology is suitably generic to allow for arbitrary unknown observation operators. A variational Bayesian framework is used for the joint estimation of the encoding, latent states, and unknown system parameters. To demonstrate the method, we provide case studies with the Lorenz-63 ordinary differential equation, and the advection and Korteweg-de Vries partial differential equations. Our results, with synthetic data, show that $Φ$-DVAE provides a data efficient dynamics encoding methodology which is competitive with standard approaches. Unknown parameters are recovered with uncertainty quantification, and unseen data are accurately predicted. △ Less

Submitted 14 July, 2023; v1 submitted 30 September, 2022; originally announced September 2022.

Comments: 27 pages, 8 figures, updated version

arXiv:2208.04856 [pdf, other]

doi 10.1016/j.jcp.2023.112369

Fully probabilistic deep models for forward and inverse problems in parametric PDEs

Authors: Arnaud Vadeboncoeur, Ömer Deniz Akyildiz, Ieva Kazlauskaite, Mark Girolami, Fehmi Cirak

Abstract: We introduce a physics-driven deep latent variable model (PDDLVM) to learn simultaneously parameter-to-solution (forward) and solution-to-parameter (inverse) maps of parametric partial differential equations (PDEs). Our formulation leverages conventional PDE discretization techniques, deep neural networks, probabilistic modelling, and variational inference to assemble a fully probabilistic coheren… ▽ More We introduce a physics-driven deep latent variable model (PDDLVM) to learn simultaneously parameter-to-solution (forward) and solution-to-parameter (inverse) maps of parametric partial differential equations (PDEs). Our formulation leverages conventional PDE discretization techniques, deep neural networks, probabilistic modelling, and variational inference to assemble a fully probabilistic coherent framework. In the posited probabilistic model, both the forward and inverse maps are approximated as Gaussian distributions with a mean and covariance parameterized by deep neural networks. The PDE residual is assumed to be an observed random vector of value zero, hence we model it as a random vector with a zero mean and a user-prescribed covariance. The model is trained by maximizing the probability, that is the evidence or marginal likelihood, of observing a residual of zero by maximizing the evidence lower bound (ELBO). Consequently, the proposed methodology does not require any independent PDE solves and is physics-informed at training time, allowing the real-time solution of PDE forward and inverse problems after training. The proposed framework can be easily extended to seamlessly integrate observed data to solve inverse problems and to build generative models. We demonstrate the efficiency and robustness of our method on finite element discretized parametric PDE problems such as linear and nonlinear Poisson problems, elastic shells with complex 3D geometries, and time-dependent nonlinear and inhomogeneous PDEs using a physics-informed neural network (PINN) discretization. We achieve up to three orders of magnitude speed-up after training compared to traditional finite element method (FEM), while outputting coherent uncertainty estimates. △ Less

Submitted 14 July, 2023; v1 submitted 9 August, 2022; originally announced August 2022.

arXiv:2010.10436 [pdf, other]

VarGrad: A Low-Variance Gradient Estimator for Variational Inference

Authors: Lorenz Richter, Ayman Boustati, Nikolas Nüsken, Francisco J. R. Ruiz, Ömer Deniz Akyildiz

Abstract: We analyse the properties of an unbiased gradient estimator of the ELBO for variational inference, based on the score function method with leave-one-out control variates. We show that this gradient estimator can be obtained using a new loss, defined as the variance of the log-ratio between the exact posterior and the variational approximation, which we call the $\textit{log-variance loss}$. Under… ▽ More We analyse the properties of an unbiased gradient estimator of the ELBO for variational inference, based on the score function method with leave-one-out control variates. We show that this gradient estimator can be obtained using a new loss, defined as the variance of the log-ratio between the exact posterior and the variational approximation, which we call the $\textit{log-variance loss}$. Under certain conditions, the gradient of the log-variance loss equals the gradient of the (negative) ELBO. We show theoretically that this gradient estimator, which we call $\textit{VarGrad}$ due to its connection to the log-variance loss, exhibits lower variance than the score function method in certain settings, and that the leave-one-out control variate coefficients are close to the optimal ones. We empirically demonstrate that VarGrad offers a favourable variance versus computation trade-off compared to other state-of-the-art estimators on a discrete VAE. △ Less

Submitted 29 October, 2020; v1 submitted 20 October, 2020; originally announced October 2020.

arXiv:2002.09998 [pdf, other]

Generalized Bayesian Filtering via Sequential Monte Carlo

Authors: Ayman Boustati, Ömer Deniz Akyildiz, Theodoros Damoulas, Adam M. Johansen

Abstract: We introduce a framework for inference in general state-space hidden Markov models (HMMs) under likelihood misspecification. In particular, we leverage the loss-theoretic perspective of Generalized Bayesian Inference (GBI) to define generalised filtering recursions in HMMs, that can tackle the problem of inference under model misspecification. In doing so, we arrive at principled procedures for ro… ▽ More We introduce a framework for inference in general state-space hidden Markov models (HMMs) under likelihood misspecification. In particular, we leverage the loss-theoretic perspective of Generalized Bayesian Inference (GBI) to define generalised filtering recursions in HMMs, that can tackle the problem of inference under model misspecification. In doing so, we arrive at principled procedures for robust inference against observation contamination by utilising the $β$-divergence. Operationalising the proposed framework is made possible via sequential Monte Carlo methods (SMC), where most standard particle methods, and their associated convergence results, are readily adapted to the new setting. We apply our approach to object tracking and Gaussian process regression problems, and observe improved performance over both standard filtering algorithms and other robust filters. △ Less

Submitted 21 October, 2020; v1 submitted 23 February, 2020; originally announced February 2020.

arXiv:1910.03906 [pdf, other]

Probabilistic sequential matrix factorization

Authors: Ömer Deniz Akyildiz, Gerrit J. J. van den Burg, Theodoros Damoulas, Mark F. J. Steel

Abstract: We introduce the probabilistic sequential matrix factorization (PSMF) method for factorizing time-varying and non-stationary datasets consisting of high-dimensional time-series. In particular, we consider nonlinear Gaussian state-space models where sequential approximate inference results in the factorization of a data matrix into a dictionary and time-varying coefficients with potentially nonline… ▽ More We introduce the probabilistic sequential matrix factorization (PSMF) method for factorizing time-varying and non-stationary datasets consisting of high-dimensional time-series. In particular, we consider nonlinear Gaussian state-space models where sequential approximate inference results in the factorization of a data matrix into a dictionary and time-varying coefficients with potentially nonlinear Markovian dependencies. The assumed Markovian structure on the coefficients enables us to encode temporal dependencies into a low-dimensional feature space. The proposed inference method is solely based on an approximate extended Kalman filtering scheme, which makes the resulting method particularly efficient. PSMF can account for temporal nonlinearities and, more importantly, can be used to calibrate and estimate generic differentiable nonlinear subspace models. We also introduce a robust version of PSMF, called rPSMF, which uses Student-t filters to handle model misspecification. We show that PSMF can be used in multiple contexts: modeling time series with a periodic subspace, robustifying changepoint detection methods, and imputing missing data in several high-dimensional time-series, such as measurements of pollutants across London. △ Less

Submitted 18 March, 2021; v1 submitted 9 October, 2019; originally announced October 2019.

Comments: Accepted for publication at AISTATS 2021

arXiv:1910.02008 [pdf, ps, other]

Nonasymptotic estimates for Stochastic Gradient Langevin Dynamics under local conditions in nonconvex optimization

Authors: Ying Zhang, Ömer Deniz Akyildiz, Theodoros Damoulas, Sotirios Sabanis

Abstract: In this paper, we are concerned with a non-asymptotic analysis of sampling algorithms used in nonconvex optimization. In particular, we obtain non-asymptotic estimates in Wasserstein-1 and Wasserstein-2 distances for a popular class of algorithms called Stochastic Gradient Langevin Dynamics (SGLD). In addition, the aforementioned Wasserstein-2 convergence result can be applied to establish a non-a… ▽ More In this paper, we are concerned with a non-asymptotic analysis of sampling algorithms used in nonconvex optimization. In particular, we obtain non-asymptotic estimates in Wasserstein-1 and Wasserstein-2 distances for a popular class of algorithms called Stochastic Gradient Langevin Dynamics (SGLD). In addition, the aforementioned Wasserstein-2 convergence result can be applied to establish a non-asymptotic error bound for the expected excess risk. Crucially, these results are obtained under a local Lipschitz condition and a local dissipativity condition where we remove the uniform dependence in the data stream. We illustrate the importance of this relaxation by presenting examples from variational inference and from index tracking optimization. △ Less

Submitted 14 October, 2022; v1 submitted 4 October, 2019; originally announced October 2019.

Comments: 38 pages

MSC Class: 60J20; 60J22; 65C05; 65C40; 62D05

Showing 1–9 of 9 results for author: Akyildiz, Ö D