Search | arXiv e-print repository

Deep Learning as Ricci Flow

Authors: Anthony Baptista, Alessandro Barp, Tapabrata Chakraborti, Chris Harbron, Ben D. MacArthur, Christopher R. S. Banerji

Abstract: Deep neural networks (DNNs) are powerful tools for approximating the distribution of complex data. It is known that data passing through a trained DNN classifier undergoes a series of geometric and topological simplifications. While some progress has been made toward understanding these transformations in neural networks with smooth activation functions, an understanding in the more general setting of non-smooth activation functions, such as the rectified linear unit (ReLU), which tend to perform better, is required. Here we propose that the geometric transformations performed by DNNs during classification tasks have parallels to those expected under Hamilton's Ricci flow - a tool from differential geometry that evolves a manifold by smoothing its curvature, in order to identify its topology. To illustrate this idea, we present a computational framework to quantify the geometric changes that occur as data passes through successive layers of a DNN, and use this framework to motivate a notion of `global Ricci network flow' that can be used to assess a DNN's ability to disentangle complex data geometries to solve classification problems. By training more than $1,500$ DNN classifiers of different widths and depths on synthetic and real-world data, we show that the strength of global Ricci network flow-like behaviour correlates with accuracy for well-trained DNNs, independently of depth, width and data set. Our findings motivate the use of tools from differential and discrete geometry to the problem of explainability in deep learning.

Submitted 22 April, 2024; originally announced April 2024.

arXiv:2311.17598 [pdf, other]

Improving embedding of graphs with missing data by soft manifolds

Authors: Andrea Marinoni, Pietro Lio', Alessandro Barp, Christian Jutten, Mark Girolami

Abstract: Embedding graphs in continuous spaces is a key factor in designing and developing algorithms for automatic information extraction to be applied in diverse tasks (e.g., learning, inferring, predicting). The reliability of graph embeddings directly depends on how much the geometry of the continuous space matches the graph structure. Manifolds are mathematical structures that can incorporate the graph characteristics, and in particular node distances, in their topological spaces. State-of-the-art manifold-based graph embedding algorithms take advantage of the assumption that the projection on a tangential space of each point in the manifold (corresponding to a node in the graph) would locally resemble a Euclidean space. Although this condition helps in achieving efficient analytical solutions to the embedding problem, it does not represent an adequate set-up to work with modern real-life graphs, which are characterized by weighted connections across nodes often computed over sparse datasets with missing records. In this work, we introduce a new class of manifold, named soft manifold, that can solve this situation. In particular, soft manifolds are mathematical structures with spherical symmetry where the tangent spaces to each point are hypocycloids whose shape is defined according to the velocity of information propagation across the data points. Using soft manifolds for graph embedding, we can provide continuous spaces to pursue any task in data analysis over complex datasets. Experimental results on reconstruction tasks on synthetic and real datasets show how the proposed approach enables more accurate and reliable characterization of graphs in continuous spaces with respect to the state-of-the-art.

Submitted 29 November, 2023; originally announced November 2023.

arXiv:2308.08305 [pdf, other]

Warped geometric information on the optimisation of Euclidean functions

Authors: Marcelo Hartmann, Bernardo Williams, Hanlin Yu, Mark Girolami, Alessandro Barp, Arto Klami

Abstract: We consider the fundamental task of optimising a real-valued function defined in a potentially high-dimensional Euclidean space, such as the loss function in many machine-learning tasks or the logarithm of the probability distribution in statistical inference. We use Riemannian geometry notions to redefine the optimisation problem of a function on the Euclidean space to a Riemannian manifold with a warped metric, and then find the function's optimum along this manifold. The warped metric chosen for the search domain induces a computationally friendly metric tensor for which optimal search directions associated with geodesic curves on the manifold become easier to compute. Performing optimisation along geodesics is known to be generally infeasible, yet we show that in this specific manifold we can analytically derive Taylor approximations up to third order. In general these approximations to the geodesic curve will not lie on the manifold, however we construct suitable retraction maps to pull them back onto the manifold. Therefore, we can efficiently optimise along the approximate geodesic curves. We cover the related theory, describe a practical optimisation algorithm and empirically evaluate it on a collection of challenging optimisation benchmarks. Our proposed algorithm, using a third-order approximation of geodesics, tends to outperform standard Euclidean gradient-based counterparts in terms of the number of iterations until convergence.

Submitted 18 March, 2024; v1 submitted 16 August, 2023; originally announced August 2023.

arXiv:2211.05408 [pdf, other]

Controlling Moments with Kernel Stein Discrepancies

Authors: Heishiro Kanagawa, Alessandro Barp, Arthur Gretton, Lester Mackey

Abstract: Kernel Stein discrepancies (KSDs) measure the quality of a distributional approximation and can be computed even when the target density has an intractable normalizing constant. Notable applications include the diagnosis of approximate MCMC samplers and goodness-of-fit tests for unnormalized statistical models. The present work analyzes the convergence control properties of KSDs. We first show that standard KSDs used for weak convergence control fail to control moment convergence. To address this limitation, we next provide sufficient conditions under which alternative diffusion KSDs control both moment and weak convergence. As an immediate consequence we develop, for each $q > 0$, the first KSDs known to exactly characterize $q$-Wasserstein convergence.

Submitted 25 June, 2024; v1 submitted 10 November, 2022; originally announced November 2022.

Comments: 103 pages, 10 figures

arXiv:2209.12835 [pdf, ps, other]

Targeted Separation and Convergence with Kernel Discrepancies

Authors: Alessandro Barp, Carl-Johann Simon-Gabriel, Mark Girolami, Lester Mackey

Abstract: Maximum mean discrepancies (MMDs) like the kernel Stein discrepancy (KSD) have grown central to a wide range of applications, including hypothesis testing, sampler selection, distribution approximation, and variational inference. In each setting, these kernel-based discrepancy measures are required to (i) separate a target P from other probability measures or even (ii) control weak convergence to P. In this article we derive new sufficient and necessary conditions to ensure (i) and (ii). For MMDs on separable metric spaces, we characterize those kernels that separate Bochner embeddable measures and introduce simple conditions for separating all measures with unbounded kernels and for controlling convergence with bounded kernels. We use these results on $\mathbb{R}^d$ to substantially broaden the known conditions for KSD separation and convergence control and to develop the first KSDs known to exactly metrize weak convergence to P. Along the way, we highlight the implications of our results for hypothesis testing, measuring and improving sample quality, and sampling with Stein variational gradient descent.

Submitted 6 December, 2023; v1 submitted 26 September, 2022; originally announced September 2022.

arXiv:2203.10592 [pdf, other]

doi 10.1016/bs.host.2022.03.005

Geometric Methods for Sampling, Optimisation, Inference and Adaptive Agents

Authors: Alessandro Barp, Lancelot Da Costa, Guilherme França, Karl Friston, Mark Girolami, Michael I. Jordan, Grigorios A. Pavliotis

Abstract: In this chapter, we identify fundamental geometric structures that underlie the problems of sampling, optimisation, inference and adaptive decision-making. Based on this identification, we derive algorithms that exploit these geometric structures to solve these problems efficiently. We show that a wide range of geometric theories emerge naturally in these fields, ranging from measure-preserving processes, information divergences, Poisson geometry, and geometric integration. Specifically, we explain how (i) leveraging the symplectic geometry of Hamiltonian systems enables us to construct (accelerated) sampling and optimisation methods, (ii) the theory of Hilbertian subspaces and Stein operators provides a general methodology to obtain robust estimators, (iii) preserving the information geometry of decision-making yields adaptive agents that perform active inference. Throughout, we emphasise the rich connections between these fields; e.g., inference draws on sampling and optimisation, and adaptive decision-making assesses decisions by inferring their counterfactual consequences. Our exposition provides a conceptual overview of underlying ideas, rather than a technical discussion, which can be found in the references herein.

Submitted 25 July, 2022; v1 submitted 20 March, 2022; originally announced March 2022.

Comments: 30 pages, 4 figures; 42 pages including table of contents and references

Journal ref: Handbook of Statistics, vol. 46, pp. 21--78 (2022)

arXiv:2109.08944 [pdf, other]

Vector-Valued Control Variates

Authors: Zhuo Sun, Alessandro Barp, François-Xavier Briol

Abstract: Control variates are variance reduction tools for Monte Carlo estimators. They can provide significant variance reduction, but usually require a large number of samples, which can be prohibitive when sampling or evaluating the integrand is computationally expensive. Furthermore, there are many scenarios where we need to compute multiple related integrals simultaneously or sequentially, which can further exacerbate computational costs. In this paper, we propose vector-valued control variates, an extension of control variates which can be used to reduce the variance of multiple Monte Carlo estimators jointly. This allows for the transfer of information across integration tasks, and hence reduces the need for a large number of samples. We focus on control variates based on kernel interpolants and our novel construction is obtained through a generalised Stein identity and the development of novel matrix-valued Stein reproducing kernels. We demonstrate our methodology on a range of problems including multifidelity modelling, Bayesian inference for dynamical systems, and model evidence computation through thermodynamic integration.

Submitted 7 June, 2023; v1 submitted 18 September, 2021; originally announced September 2021.

Comments: Accepted for publication at ICML 2023

arXiv:2107.11231 [pdf, other]

Optimization on manifolds: A symplectic approach

Authors: Guilherme França, Alessandro Barp, Mark Girolami, Michael I. Jordan

Abstract: Optimization tasks are crucial in statistical machine learning. Recently, there has been great interest in leveraging tools from dynamical systems to derive accelerated and robust optimization methods via suitable discretizations of continuous-time systems. However, these ideas have mostly been limited to Euclidean spaces and unconstrained settings, or to Riemannian gradient flows. In this work, we propose a dissipative extension of Dirac's theory of constrained Hamiltonian systems as a general framework for solving optimization problems over smooth manifolds, including problems with nonlinear constraints. We develop geometric/symplectic numerical integrators on manifolds that are "rate-matching," i.e., preserve the continuous-time rates of convergence. In particular, we introduce a dissipative RATTLE integrator able to achieve optimal convergence rate locally. Our class of (accelerated) algorithms are not only simple and efficient but also applicable to a broad range of contexts.

Submitted 4 July, 2023; v1 submitted 23 July, 2021; originally announced July 2021.

Comments: additional results, including rates for constrained optimization on manifolds

arXiv:2105.03481 [pdf, other]

Stein's Method Meets Computational Statistics: A Review of Some Recent Developments

Authors: Andreas Anastasiou, Alessandro Barp, François-Xavier Briol, Bruno Ebner, Robert E. Gaunt, Fatemeh Ghaderinezhad, Jackson Gorham, Arthur Gretton, Christophe Ley, Qiang Liu, Lester Mackey, Chris. J. Oates, Gesine Reinert, Yvik Swan

Abstract: Stein's method compares probability distributions through the study of a class of linear operators called Stein operators. While mainly studied in probability and used to underpin theoretical statistics, Stein's method has led to significant advances in computational statistics in recent years. The goal of this survey is to bring together some of these recent developments and, in doing so, to stimulate further research into the successful field of Stein's method and statistics. The topics we discuss include tools to benchmark and compare sampling methods such as approximate Markov chain Monte Carlo, deterministic alternatives to sampling methods, control variate techniques, parameter estimation and goodness-of-fit testing.

Submitted 22 June, 2022; v1 submitted 7 May, 2021; originally announced May 2021.

Comments: Accepted for publication by "Statistical Science"

arXiv:2105.02845 [pdf, ps, other]

A Unifying and Canonical Description of Measure-Preserving Diffusions

Authors: Alessandro Barp, So Takao, Michael Betancourt, Alexis Arnaudon, Mark Girolami

Abstract: A complete recipe of measure-preserving diffusions in Euclidean space was recently derived unifying several MCMC algorithms into a single framework. In this paper, we develop a geometric theory that improves and generalises this construction to any manifold. We thereby demonstrate that the completeness result is a direct consequence of the topology of the underlying manifold and the geometry induced by the target measure $P$; there is no need to introduce other structures such as a Riemannian metric, local coordinates, or a reference measure. Instead, our framework relies on the intrinsic geometry of $P$ and in particular its canonical derivative, the deRham rotationnel, which allows us to parametrise the Fokker--Planck currents of measure-preserving diffusions using potentials. The geometric formalism can easily incorporate constraints and symmetries, and deliver new important insights, for example, a new complete recipe of Langevin-like diffusions that are suited to the construction of samplers. We also analyse the reversibility and dissipative properties of the diffusions, the associated deterministic flow on the space of measures, and the geometry of Langevin processes. Our article connects ideas from various literature and frames the theory of measure-preserving diffusions in its appropriate mathematical context.

Submitted 6 May, 2021; originally announced May 2021.

arXiv:2006.09268 [pdf, ps, other]

Metrizing Weak Convergence with Maximum Mean Discrepancies

Authors: Carl-Johann Simon-Gabriel, Alessandro Barp, Bernhard Schölkopf, Lester Mackey

Abstract: This paper characterizes the maximum mean discrepancies (MMD) that metrize the weak convergence of probability measures for a wide class of kernels. More precisely, we prove that, on a locally compact, non-compact, Hausdorff space, the MMD of a bounded continuous Borel measurable kernel k, whose reproducing kernel Hilbert space (RKHS) functions vanish at infinity, metrizes the weak convergence of probability measures if and only if k is continuous and integrally strictly positive definite (i.s.p.d.) over all signed, finite, regular Borel measures. We also correct a prior result of Simon-Gabriel & Schölkopf (JMLR, 2018, Thm.12) by showing that there exist both bounded continuous i.s.p.d. kernels that do not metrize weak convergence and bounded continuous non-i.s.p.d. kernels that do metrize it.

Submitted 3 September, 2021; v1 submitted 16 June, 2020; originally announced June 2020.

Comments: 14 pages. Corrects in particular Thm.12 of Simon-Gabriel and Schölkopf, JMLR, 19(44):1-29, 2018. See http://jmlr.org/papers/v19/16-291.html

MSC Class: 60B10 (Primary) 60F05; 60-08; 28-08 (Secondary) ACM Class: G.3; I.2.6; I.5.0

arXiv:1906.08283 [pdf, other]

Minimum Stein Discrepancy Estimators

Authors: Alessandro Barp, Francois-Xavier Briol, Andrew B. Duncan, Mark Girolami, Lester Mackey

Abstract: When maximum likelihood estimation is infeasible, one often turns to score matching, contrastive divergence, or minimum probability flow to obtain tractable parameter estimates. We provide a unifying perspective of these techniques as minimum Stein discrepancy estimators, and use this lens to design new diffusion kernel Stein discrepancy (DKSD) and diffusion score matching (DSM) estimators with complementary strengths. We establish the consistency, asymptotic normality, and robustness of DKSD and DSM estimators, then derive stochastic Riemannian gradient descent algorithms for their efficient optimisation. The main strength of our methodology is its flexibility, which allows us to design estimators with desirable properties for specific models at hand by carefully selecting a Stein discrepancy. We illustrate this advantage for several challenging problems for score matching, such as non-smooth, heavy-tailed or light-tailed densities.

Submitted 5 October, 2022; v1 submitted 19 June, 2019; originally announced June 2019.

Comments: Accepted for publication at NeurIPS 2019

arXiv:1906.05944 [pdf, other]

Statistical Inference for Generative Models with Maximum Mean Discrepancy

Authors: Francois-Xavier Briol, Alessandro Barp, Andrew B. Duncan, Mark Girolami

Abstract: While likelihood-based inference and its variants provide a statistically efficient and widely applicable approach to parametric inference, their application to models involving intractable likelihoods poses challenges. In this work, we study a class of minimum distance estimators for intractable generative models, that is, statistical models for which the likelihood is intractable, but simulation is cheap. The distance considered, maximum mean discrepancy (MMD), is defined through the embedding of probability measures into a reproducing kernel Hilbert space. We study the theoretical properties of these estimators, showing that they are consistent, asymptotically normal and robust to model misspecification. A main advantage of these estimators is the flexibility offered by the choice of kernel, which can be used to trade-off statistical efficiency and robustness. On the algorithmic side, we study the geometry induced by MMD on the parameter space and use this to introduce a novel natural gradient descent-like algorithm for efficient implementation of these estimators. We illustrate the relevance of our theoretical results on several classes of models including a discrete-time latent Markov process and two multivariate stochastic differential equation models.

Submitted 13 June, 2019; originally announced June 2019.

arXiv:1905.03673 [pdf, other]

Stein Point Markov Chain Monte Carlo

Authors: Wilson Ye Chen, Alessandro Barp, François-Xavier Briol, Jackson Gorham, Mark Girolami, Lester Mackey, Chris. J. Oates

Abstract: An important task in machine learning and statistics is the approximation of a probability measure by an empirical measure supported on a discrete point set. Stein Points are a class of algorithms for this task, which proceed by sequentially minimising a Stein discrepancy between the empirical measure and the target and, hence, require the solution of a non-convex optimisation problem to obtain each new point. This paper removes the need to solve this optimisation problem by, instead, selecting each new point based on a Markov chain sample path. This significantly reduces the computational cost of Stein Points and leads to a suite of algorithms that are straightforward to implement. The new algorithms are illustrated on a set of challenging Bayesian inference problems, and rigorous theoretical guarantees of consistency are established.

Submitted 14 September, 2020; v1 submitted 9 May, 2019; originally announced May 2019.

Comments: Minor bug fixed in Theorem 4 (result unchanged)

Journal ref: ICML 2019

arXiv:1903.08939 [pdf, other]

Irreversible Langevin MCMC on Lie Groups

Authors: Alexis Arnaudon, Alessandro Barp, So Takao

Abstract: It is well-known that irreversible MCMC algorithms converge faster to their stationary distributions than reversible ones. Using the special geometric structure of Lie groups $\mathcal G$ and dissipation fields compatible with the symplectic structure, we construct an irreversible HMC-like MCMC algorithm on $\mathcal G$, where we first update the momentum by solving an OU process on the corresponding Lie algebra $\mathfrak g$, and then approximate the Hamiltonian system on $\mathcal G \times \mathfrak g$ with a reversible symplectic integrator followed by a Metropolis-Hastings correction step. In particular, when the OU process is simulated over sufficiently long times, we recover HMC as a special case. We illustrate this algorithm numerically using the example $\mathcal G = SO(3)$.

Submitted 21 March, 2019; originally announced March 2019.

arXiv:1903.04662 [pdf, ps, other]

Hamiltonian Monte Carlo On Lie Groups and Constrained Mechanics on Homogeneous Manifolds

Authors: Alessandro Barp

Abstract: In this paper we show that the Hamiltonian Monte Carlo method for compact Lie groups constructed in \cite{kennedy88b} using a symplectic structure can be recovered from canonical geometric mechanics with a bi-invariant metric. Hence we obtain the correspondence between the various formulations of Hamiltonian mechanics on Lie groups, and their induced HMC algorithms. Working on $\G\times \g$ we recover the Euler-Arnold formulation of geodesic motion, and construct explicit HMC schemes that extend \cite{kennedy88b,Kennedy:2012} to non-compact Lie groups by choosing metrics with appropriate invariances. Finally we explain how mechanics on homogeneous spaces can be formulated as a constrained system over their associated Lie groups, and how in some important cases the constraints can be naturally handled by the symmetries of the Hamiltonian.

Submitted 14 March, 2019; v1 submitted 11 March, 2019; originally announced March 2019.

arXiv:1903.02699 [pdf, other]

Hamiltonian Monte Carlo on Symmetric and Homogeneous Spaces via Symplectic Reduction

Authors: Alessandro Barp, Anthony Kennedy, Mark Girolami

Abstract: The Hamiltonian Monte Carlo method generates samples by introducing a mechanical system that explores the target density. For distributions on manifolds it is not always simple to perform the mechanics as a result of the lack of global coordinates, the constraints of the manifold, and the requirement to compute the geodesic flow. In this paper we explain how to construct the Hamiltonian system on naturally reductive homogeneous spaces using symplectic reduction, which lifts the HMC scheme to a matrix Lie group with global coordinates and constant metric. This provides a general framework that is applicable to many manifolds that arise in applications, such as hyperspheres, hyperbolic spaces, symmetric positive-definite matrices, Grassmannian, and Stiefel manifolds.

Submitted 18 April, 2019; v1 submitted 6 March, 2019; originally announced March 2019.

arXiv:1810.04946 [pdf, other]

A Riemann-Stein Kernel Method

Authors: Alessandro Barp, Chris. J. Oates, Emilio Porcu, Mark Girolami

Abstract: This paper proposes and studies a numerical method for approximation of posterior expectations based on interpolation with a Stein reproducing kernel. Finite-sample-size bounds on the approximation error are established for posterior distributions supported on a compact Riemannian manifold, and we relate these to a kernel Stein discrepancy (KSD). Moreover, we prove in our setting that the KSD is equivalent to Sobolev discrepancy and, in doing so, we completely characterise the convergence-determining properties of KSD. Our contribution is rooted in a novel combination of Stein's method, the theory of reproducing kernels, and existence and regularity results for partial differential equations on a Riemannian manifold.

Submitted 11 January, 2022; v1 submitted 11 October, 2018; originally announced October 2018.

arXiv:1712.01793

Posterior Integration on a Riemannian Manifold

Authors: Chris. J. Oates, Alessandro Barp, Mark Girolami

Abstract: The geodesic Markov chain Monte Carlo method and its variants enable computation of integrals with respect to a posterior supported on a manifold. However, for regular integrals, the convergence rate of the ergodic average will be sub-optimal. To fill this gap, this paper extends the efficient posterior integration method of Oates et al. (2017) to the case of a Riemannian manifold. In contrast to the original Euclidean case, no non-trivial boundary conditions are needed for a closed manifold. The method is assessed through simulation and deployed to compute posterior integrals for an Australian Mesozoic paleomagnetic pole model, whose parameters are constrained to lie on the manifold $M = \mathbb{S}^2 \times \mathbb{R}_+$.

Submitted 14 October, 2018; v1 submitted 5 December, 2017; originally announced December 2017.

Comments: This paper was superseded by arXiv:1810.04946

arXiv:1705.02891

Geometry and Dynamics for Markov Chain Monte Carlo

Authors: Alessandro Barp, Francois-Xavier Briol, Anthony D. Kennedy, Mark Girolami

Abstract: Markov Chain Monte Carlo methods have revolutionised mathematical computation and enabled statistical inference within many previously intractable models. In this context, Hamiltonian dynamics have been proposed as an efficient way of building chains which can explore probability densities efficiently. The method emerges from physics and geometry and these links have been extensively studied by a series of authors through the last thirty years. However, there is currently a gap between the intuitions and knowledge of users of the methodology and our deep understanding of these theoretical foundations. The aim of this review is to provide a comprehensive introduction to the geometric tools used in Hamiltonian Monte Carlo at a level accessible to statisticians, machine learners and other users of the methodology with only a basic understanding of Monte Carlo methods. This will be complemented with some discussion of the most recent advances in the field which we believe will become increasingly relevant to applied scientists.

Submitted 8 May, 2017; originally announced May 2017.

Comments: Submitted to "Annual Review of Statistics and Its Applications"

arXiv:1505.00983

doi: 10.1088/1751-8113/48/34/345002

A numerical study of the 3D random interchange and random loop models

Authors: Alessandro Barp, Edoardo Gabriele Barp, Francois-Xavier Briol, Daniel Ueltschi

Abstract: We have studied numerically the random interchange model and related loop models on the three-dimensional cubic lattice. We have determined the transition time for the occurrence of long loops. The joint distribution of the lengths of long loops is Poisson-Dirichlet with parameter 1 or 1/2.

Submitted 14 July, 2015; v1 submitted 5 May, 2015; originally announced May 2015.

Comments: 11 pages

MSC Class: 60K35; 82B20; 82B26

Journal ref: J. Phys. A: Math. Theor. 48 (2015) 345002

Showing 1–21 of 21 results for author: Barp, A