Computation-Aware Kalman Filtering and Smoothing

Authors: Marvin Pförtner, Jonathan Wenger, Jon Cockayne, Philipp Hennig

Abstract: Kalman filtering and smoothing are the foundational mechanisms for efficient inference in Gauss-Markov models. However, their time and memory complexities scale prohibitively with the size of the state space. This is particularly problematic in spatiotemporal regression problems, where the state dimension scales with the number of spatial observations. Existing approximate frameworks leverage low-rank approximations of the covariance matrix. Since they do not model the error introduced by the computational approximation, their predictive uncertainty estimates can be overly optimistic. In this work, we propose a probabilistic numerical method for inference in high-dimensional Gauss-Markov models which mitigates these scaling issues. Our matrix-free iterative algorithm leverages GPU acceleration and crucially enables a tunable trade-off between computational cost and predictive uncertainty. Finally, we demonstrate the scalability of our method on a large-scale climate dataset.

Submitted 14 May, 2024; originally announced May 2024.

arXiv:2104.12587

doi 10.1007/s11222-021-10030-w

Bayesian Numerical Methods for Nonlinear Partial Differential Equations

Authors: Junyang Wang, Jon Cockayne, Oksana Chkrebtii, T. J. Sullivan, Chris. J. Oates

Abstract: The numerical solution of differential equations can be formulated as an inference problem to which formal statistical approaches can be applied. However, nonlinear partial differential equations (PDEs) pose substantial challenges from an inferential perspective, most notably the absence of explicit conditioning formulae. This paper extends earlier work on linear PDEs to a general class of initial value problems specified by nonlinear PDEs, motivated by problems for which evaluations of the right-hand-side, initial conditions, or boundary conditions of the PDE have a high computational cost. The proposed method can be viewed as exact Bayesian inference under an approximate likelihood, which is based on discretisation of the nonlinear differential operator. Proof-of-concept experimental results demonstrate that meaningful probabilistic uncertainty quantification for the unknown solution of the PDE can be performed, while controlling the number of times the right-hand-side, initial and boundary conditions are evaluated. A suitable prior model for the solution of the PDE is identified using novel theoretical analysis of the sample path properties of Matérn processes, which may be of independent interest.

Submitted 3 May, 2021; v1 submitted 22 April, 2021; originally announced April 2021.

Journal ref: Stat. Comput. 31(5):no. 55, 20pp., 2021

arXiv:2102.00877

A probabilistic Taylor expansion with Gaussian processes

Authors: Toni Karvonen, Jon Cockayne, Filip Tronarp, Simo Särkkä

Abstract: We study a class of Gaussian processes for which the posterior mean, for a particular choice of data, replicates a truncated Taylor expansion of any order. The data consist of derivative evaluations at the expansion point and the prior covariance kernel belongs to the class of Taylor kernels, which can be written in a certain power series form. We discuss and prove some results on maximum likelihood estimation of parameters of Taylor kernels. The proposed framework is a special case of Gaussian process regression based on data that is orthogonal in the reproducing kernel Hilbert space of the covariance kernel.

Submitted 28 August, 2023; v1 submitted 1 February, 2021; originally announced February 2021.

Comments: To appear in Transactions on Machine Learning Research

arXiv:2012.12615

Probabilistic Iterative Methods for Linear Systems

Authors: Jon Cockayne, Ilse C. F. Ipsen, Chris J. Oates, Tim W. Reid

Abstract: This paper presents a probabilistic perspective on iterative methods for approximating the solution $\mathbf{x}_* \in \mathbb{R}^d$ of a nonsingular linear system $\mathbf{A} \mathbf{x}_* = \mathbf{b}$. In the approach a standard iterative method on $\mathbb{R}^d$ is lifted to act on the space of probability distributions $\mathcal{P}(\mathbb{R}^d)$. Classically, an iterative method produces a sequence $\mathbf{x}_m$ of approximations that converge to $\mathbf{x}_*$. The output of the iterative methods proposed in this paper is, instead, a sequence of probability distributions $\mu_m \in \mathcal{P}(\mathbb{R}^d)$. The distributional output both provides a "best guess" for $\mathbf{x}_*$, for example as the mean of $\mu_m$, and also probabilistic uncertainty quantification for the value of $\mathbf{x}_*$ when it has not been exactly determined. Theoretical analysis is provided in the prototypical case of a stationary linear iterative method. In this setting we characterise both the rate of contraction of $\mu_m$ to an atomic measure on $\mathbf{x}_*$ and the nature of the uncertainty quantification being provided. We conclude with an empirical illustration that highlights the insight into solution uncertainty that can be provided by probabilistic iterative methods.

Submitted 11 January, 2021; v1 submitted 23 December, 2020; originally announced December 2020.

arXiv:2009.04239

Probabilistic Gradients for Fast Calibration of Differential Equation Models

Authors: Jon Cockayne, Andrew B. Duncan

Abstract: Calibration of large-scale differential equation models to observational or experimental data is a widespread challenge throughout applied sciences and engineering. A crucial bottleneck in state-of-the-art calibration methods is the calculation of local sensitivities, i.e. derivatives of the loss function with respect to the estimated parameters, which often necessitates several numerical solves of the underlying system of partial or ordinary differential equations. In this paper we present a new probabilistic approach to computing local sensitivities. The proposed method has several advantages over classical methods. Firstly, it operates within a constrained computational budget and provides a probabilistic quantification of uncertainty incurred in the sensitivities from this constraint. Secondly, information from previous sensitivity estimates can be recycled in subsequent computations, reducing the overall computational effort for iterative gradient-based calibration methods. The methodology presented is applied to two challenging test problems and compared against classical methods.

Submitted 22 February, 2021; v1 submitted 3 September, 2020; originally announced September 2020.

arXiv:2005.03952

Optimal Thinning of MCMC Output

Authors: Marina Riabiz, Wilson Chen, Jon Cockayne, Pawel Swietach, Steven A. Niederer, Lester Mackey, Chris. J. Oates

Abstract: The use of heuristics to assess the convergence and compress the output of Markov chain Monte Carlo can be sub-optimal in terms of the empirical approximations that are produced. Typically a number of the initial states are attributed to "burn in" and removed, whilst the remainder of the chain is "thinned" if compression is also required. In this paper we consider the problem of retrospectively selecting a subset of states, of fixed cardinality, from the sample path such that the approximation provided by their empirical distribution is close to optimal. A novel method is proposed, based on greedy minimisation of a kernel Stein discrepancy, that is suitable for problems where heavy compression is required. Theoretical results guarantee consistency of the method and its effectiveness is demonstrated in the challenging context of parameter inference for ordinary differential equations. Software is available in the Stein Thinning package in Python, R and MATLAB.

Submitted 11 January, 2022; v1 submitted 8 May, 2020; originally announced May 2020.

Comments: To appear in the Journal of the Royal Statistical Society, Series B, 2022+

arXiv:1906.10564

A Role for Symmetry in the Bayesian Solution of Differential Equations

Authors: Junyang Wang, Jon Cockayne, Chris J. Oates

Abstract: The interpretation of numerical methods, such as finite difference methods for differential equations, as point estimators suggests that formal uncertainty quantification can also be performed in this context. Competing statistical paradigms can be considered and Bayesian probabilistic numerical methods (PNMs) are obtained when Bayesian statistical principles are deployed. Bayesian PNMs have the appealing property of being closed under composition, such that uncertainty due to different sources of discretisation in a numerical method can be jointly modelled and rigorously propagated. Despite recent attention, no exact Bayesian PNM for the numerical solution of ordinary differential equations (ODEs) has been proposed. This raises the fundamental question of whether exact Bayesian methods for (in general nonlinear) ODEs even exist. The purpose of this paper is to provide a positive answer for a limited class of ODEs. To this end, we work at a foundational level, where a novel Bayesian PNM is proposed as a proof-of-concept. Our proposal is a synthesis of classical Lie group methods, to exploit underlying symmetries in the gradient field, and non-parametric regression in a transformed solution space for the ODE. The procedure is presented in detail for first and second order ODEs and relies on a certain strong technical condition -- existence of a solvable Lie algebra -- being satisfied. Numerical illustrations are provided.

Submitted 23 September, 2019; v1 submitted 24 June, 2019; originally announced June 2019.

Comments: A summary version of this manuscript appeared in the proceedings of MaxEnt 2018 in London, UK; see arXiv:1805.07109

arXiv:1901.04326

doi 10.1515/9783110635461-005

Optimality Criteria for Probabilistic Numerical Methods

Authors: Chris. J. Oates, Jon Cockayne, Dennis Prangle, T. J. Sullivan, Mark Girolami

Abstract: It is well understood that Bayesian decision theory and average case analysis are essentially identical. However, if one is interested in performing uncertainty quantification for a numerical task, it can be argued that standard approaches from the decision-theoretic framework are neither appropriate nor sufficient. Instead, we consider a particular optimality criterion from Bayesian experimental design and study its implied optimal information in the numerical context. This information is demonstrated to differ, in general, from the information that would be used in an average-case-optimal numerical method. The explicit connection to Bayesian experimental design suggests several distinct regimes in which optimal probabilistic numerical methods can be developed.

Submitted 10 May, 2019; v1 submitted 14 January, 2019; originally announced January 2019.

Comments: Prepared for the proceedings of the RICAM workshop on Multivariate Algorithms and Information-Based Complexity, November 2018

Journal ref: Multivariate Algorithms and Information-Based Complexity, Radon Series on Computational and Applied Mathematics 27:65--88, 2020

arXiv:1810.03398

Probabilistic Linear Solvers: A Unifying View

Authors: Simon Bartels, Jon Cockayne, Ilse C. F. Ipsen, Philipp Hennig

Abstract: Several recent works have developed a new, probabilistic interpretation for numerical algorithms solving linear systems in which the solution is inferred in a Bayesian framework, either directly or by inferring the unknown action of the matrix inverse. These approaches have typically focused on replicating the behavior of the conjugate gradient method as a prototypical iterative method. In this work surprisingly general conditions for equivalence of these disparate methods are presented. We also describe connections between probabilistic linear solvers and projection methods for linear systems, providing a probabilistic interpretation of a far more general class of iterative methods. In particular, this provides such an interpretation of the generalised minimum residual method. A probabilistic view of preconditioning is also introduced. These developments unify the literature on probabilistic linear solvers, and provide foundational connections to the literature on iterative solvers for linear systems.

Submitted 17 October, 2018; v1 submitted 8 October, 2018; originally announced October 2018.

arXiv:1805.07109

On the Bayesian Solution of Differential Equations

Authors: Junyang Wang, Jon Cockayne, Chris Oates

Abstract: The interpretation of numerical methods, such as finite difference methods for differential equations, as point estimators allows for formal statistical quantification of the error due to discretisation in the numerical context. Competing statistical paradigms can be considered and Bayesian probabilistic numerical methods (PNMs) are obtained when Bayesian statistical principles are deployed. Bayesian PNMs are closed under composition, such that uncertainty due to different sources of discretisation can be jointly modelled and rigorously propagated. However, we argue that no strictly Bayesian PNMs for the numerical solution of ordinary differential equations (ODEs) have yet been developed. To address this gap, we work at a foundational level, where a novel Bayesian PNM is proposed as a proof-of-concept. Our proposal is a synthesis of classical Lie group methods, to exploit the underlying structure of the gradient field, and non-parametric regression in a transformed solution space for the ODE. The procedure is presented in detail for first order ODEs and relies on a certain technical condition -- existence of a solvable Lie algebra -- being satisfied. Numerical illustrations are provided.

Submitted 22 May, 2018; v1 submitted 18 May, 2018; originally announced May 2018.

arXiv:1801.05242

A Bayesian Conjugate Gradient Method

Authors: Jon Cockayne, Chris Oates, Ilse Ipsen, Mark Girolami

Abstract: A fundamental task in numerical computation is the solution of large linear systems. The conjugate gradient method is an iterative method which offers rapid convergence to the solution, particularly when an effective preconditioner is employed. However, for more challenging systems a substantial error can be present even after many iterations have been performed. The estimates obtained in this case are of little value unless further information can be provided about the numerical error. In this paper we propose a novel statistical model for this numerical error set in a Bayesian framework. Our approach is a strict generalisation of the conjugate gradient method, which is recovered as the posterior mean for a particular choice of prior. The estimates obtained are analysed with Krylov subspace methods and a contraction result for the posterior is presented. The method is then analysed in a simulation study as well as being applied to a challenging problem in medical imaging.

Submitted 17 December, 2018; v1 submitted 16 January, 2018; originally announced January 2018.

arXiv:1707.06107

Bayesian Probabilistic Numerical Methods in Time-Dependent State Estimation for Industrial Hydrocyclone Equipment

Authors: Chris. J. Oates, Jon Cockayne, Robert G. Aykroyd, Mark Girolami

Abstract: The use of high-power industrial equipment, such as large-scale mixing equipment or a hydrocyclone for separation of particles in liquid suspension, demands careful monitoring to ensure correct operation. The fundamental task of state-estimation for the liquid suspension can be posed as a time-evolving inverse problem and solved with Bayesian statistical methods. In this paper, we extend Bayesian methods to incorporate statistical models for the error that is incurred in the numerical solution of the physical governing equations. This enables full uncertainty quantification within a principled computation-precision trade-off, in contrast to the over-confident inferences that are obtained when all sources of numerical error are ignored. The method is cast within a sequential Monte Carlo framework and an optimised implementation is provided in Python.

Submitted 19 December, 2018; v1 submitted 19 July, 2017; originally announced July 2017.

arXiv:1706.03369

On the Sampling Problem for Kernel Quadrature

Authors: Francois-Xavier Briol, Chris J. Oates, Jon Cockayne, Wilson Ye Chen, Mark Girolami

Abstract: The standard Kernel Quadrature method for numerical integration with random point sets (also called Bayesian Monte Carlo) is known to converge in root mean square error at a rate determined by the ratio $s/d$, where $s$ and $d$ encode the smoothness and dimension of the integrand. However, an empirical investigation reveals that the rate constant $C$ is highly sensitive to the distribution of the random points. In contrast to standard Monte Carlo integration, for which optimal importance sampling is well-understood, the sampling distribution that minimises $C$ for Kernel Quadrature does not admit a closed form. This paper argues that the practical choice of sampling distribution is an important open problem. One solution is considered: a novel automatic approach based on adaptive tempering and sequential Monte Carlo. Empirical results demonstrate that a dramatic reduction in integration error of up to 4 orders of magnitude can be achieved with the proposed method.

Submitted 11 June, 2017; originally announced June 2017.

Comments: To appear at Thirty-fourth International Conference on Machine Learning (ICML 2017)

Journal ref: Proceedings of the 34th International Conference on Machine Learning, PMLR 70:586-595, 2017

arXiv:1702.03673

doi 10.1137/17M1139357

Bayesian Probabilistic Numerical Methods

Authors: Jon Cockayne, Chris Oates, Tim Sullivan, Mark Girolami

Abstract: The emergent field of probabilistic numerics has thus far lacked clear statistical principles. This paper establishes Bayesian probabilistic numerical methods as those which can be cast as solutions to certain inverse problems within the Bayesian framework. This allows us to establish general conditions under which Bayesian probabilistic numerical methods are well-defined, encompassing both non-linear and non-Gaussian models. For general computation, a numerical approximation scheme is proposed and its asymptotic convergence established. The theoretical development is then extended to pipelines of computation, wherein probabilistic numerical methods are composed to solve more challenging numerical tasks. The contribution highlights an important research frontier at the interface of numerical analysis and uncertainty quantification, with a challenging industrial application presented.

Submitted 7 July, 2017; v1 submitted 13 February, 2017; originally announced February 2017.

Journal ref: SIAM Review 61(4):756--789, 2019

arXiv:1701.04006

doi 10.1063/1.4985359

Probabilistic Numerical Methods for PDE-constrained Bayesian Inverse Problems

Authors: Jon Cockayne, Chris Oates, Tim Sullivan, Mark Girolami

Abstract: This paper develops meshless methods for probabilistically describing discretisation error in the numerical solution of partial differential equations. This construction enables the solution of Bayesian inverse problems while accounting for the impact of the discretisation of the forward problem. In particular, this drives statistical inferences to be more conservative in the presence of significant solver error. Theoretical results are presented describing rates of convergence for the posteriors in both the forward and inverse problems. This method is tested on a challenging inverse problem with a nonlinear forward model.

Submitted 15 January, 2017; originally announced January 2017.

arXiv:1610.08363

Comments on "Bayesian Solution Uncertainty Quantification for Differential Equations" by Chkrebtii, Campbell, Calderhead & Girolami

Authors: Jon Cockayne

Abstract: I would like to thank the authors for their interesting and very clearly presented paper discussing probabilistic solvers for ODEs and PDEs.

Submitted 25 October, 2016; originally announced October 2016.

arXiv:1610.06752

Comments on "Bayesian Solution Uncertainty Quantification for Differential Equations" by Chkrebtii, Campbell, Calderhead & Girolami

Authors: Francois-Xavier Briol, Jon Cockayne, Onur Teymur

Abstract: We commend the authors for an exciting paper which provides a strong contribution to the emerging field of probabilistic numerics (PN). Below, we discuss aspects of prior modelling which need to be considered thoroughly in future work.

Submitted 21 October, 2016; originally announced October 2016.

Journal ref: Bayesian Analysis, Vol 11, Num 4, pp1285-1293, 2016

arXiv:1605.07811

Probabilistic Numerical Methods for Partial Differential Equations and Bayesian Inverse Problems

Authors: Jon Cockayne, Chris Oates, Tim Sullivan, Mark Girolami

Abstract: This paper develops a probabilistic numerical method for solution of partial differential equations (PDEs) and studies application of that method to PDE-constrained inverse problems. This approach enables the solution of challenging inverse problems whilst accounting, in a statistically principled way, for the impact of discretisation error due to numerical solution of the PDE. In particular, the approach confers robustness to failure of the numerical PDE solver, with statistical inferences driven to be more conservative in the presence of substantial discretisation error. Going further, the problem of choosing a PDE solver is cast as a problem in the Bayesian design of experiments, where the aim is to minimise the impact of solver error on statistical inferences; here the challenge of non-linear PDEs is also considered. The method is applied to parameter inference problems in which discretisation error is non-negligible and must be accounted for in order to reach conclusions that are statistically valid.

Submitted 11 July, 2017; v1 submitted 25 May, 2016; originally announced May 2016.

Showing 1–18 of 18 results for author: Cockayne, J