-
Minimizing $f$-Divergences by Interpolating Velocity Fields
Authors:
Song Liu,
Jiahao Yu,
Jack Simons,
Mingxuan Yi,
Mark Beaumont
Abstract:
Many machine learning problems can be seen as approximating a \textit{target} distribution using a \textit{particle} distribution by minimizing their statistical discrepancy. Wasserstein Gradient Flow can move particles along a path that minimizes the $f$-divergence between the target and particle distributions. To move particles, we need to calculate the corresponding velocity fields derived from a density ratio function between these two distributions. Previous works estimated such density ratio functions and then differentiated the estimated ratios. These approaches may suffer from overfitting, leading to a less accurate estimate of the velocity fields. Inspired by non-parametric curve fitting, we directly estimate these velocity fields using interpolation techniques. We prove that our estimators are consistent under mild conditions. We validate their effectiveness using novel applications on domain adaptation and missing data imputation.
Submitted 6 June, 2024; v1 submitted 24 May, 2023;
originally announced May 2023.
-
Inference on eigenvectors of non-symmetric matrices
Authors:
Jerome R. Simons
Abstract:
This paper argues that the symmetrisability condition in Tyler (1981) is not necessary to establish asymptotic inference procedures for eigenvectors. We establish distribution theory for a Wald and t-test for full-vector and individual coefficient hypotheses, respectively. Our test statistics originate from eigenprojections of non-symmetric matrices. Representing projections as a mapping from the underlying matrix to its spectral data, we find derivatives through analytic perturbation theory. These results demonstrate how the analytic perturbation theory of Sun (1991) is a useful tool in multivariate statistics and are of independent interest. As an application, we define confidence sets for Bonacich centralities estimated from adjacency matrices induced by directed graphs.
Submitted 4 April, 2023; v1 submitted 31 March, 2023;
originally announced March 2023.
-
Sequential Neural Score Estimation: Likelihood-Free Inference with Conditional Score Based Diffusion Models
Authors:
Louis Sharrock,
Jack Simons,
Song Liu,
Mark Beaumont
Abstract:
We introduce Sequential Neural Posterior Score Estimation (SNPSE), a score-based method for Bayesian inference in simulator-based models. Our method, inspired by the remarkable success of score-based methods in generative modelling, leverages conditional score-based diffusion models to generate samples from the posterior distribution of interest. The model is trained using an objective function which directly estimates the score of the posterior. We embed the model into a sequential training procedure, which guides simulations using the current approximation of the posterior at the observation of interest, thereby reducing the simulation cost. We also introduce several alternative sequential approaches, and discuss their relative merits. We then validate our method, as well as its amortised, non-sequential, variant on several numerical examples, demonstrating comparable or superior performance to existing state-of-the-art methods such as Sequential Neural Posterior Estimation (SNPE).
Submitted 3 June, 2024; v1 submitted 10 October, 2022;
originally announced October 2022.
-
The Debiased Spatial Whittle Likelihood
Authors:
Arthur P. Guillaumin,
Adam M. Sykulski,
Sofia C. Olhede,
Frederik J. Simons
Abstract:
We provide a computationally and statistically efficient method for estimating the parameters of a stochastic covariance model observed on a regular spatial grid in any number of dimensions. Our proposed method, which we call the Debiased Spatial Whittle likelihood, makes important corrections to the well-known Whittle likelihood to account for large sources of bias caused by boundary effects and aliasing. We generalise the approach to flexibly allow for significant volumes of missing data including those with lower-dimensional substructure, and for irregular sampling boundaries. We build a theoretical framework under relatively weak assumptions which ensures consistency and asymptotic normality in numerous practical settings including missing data and non-Gaussian processes. We also extend our consistency results to multivariate processes. We provide detailed implementation guidelines which ensure the estimation procedure can be conducted in $O(n \log n)$ operations, where $n$ is the number of points of the encapsulating rectangular grid, thus keeping the computational scalability of Fourier and Whittle-based methods for large data sets. We validate our procedure over a range of simulated and real-world settings, and compare with state-of-the-art alternatives, demonstrating the enduring practical appeal of Fourier-based methods, provided they are corrected by the procedures developed in this paper.
Submitted 26 April, 2022; v1 submitted 4 July, 2019;
originally announced July 2019.