-
Improving Gradient-guided Nested Sampling for Posterior Inference
Authors:
Pablo Lemos,
Nikolay Malkin,
Will Handley,
Yoshua Bengio,
Yashar Hezaveh,
Laurence Perreault-Levasseur
Abstract:
We present a performant, general-purpose gradient-guided nested sampling algorithm, ${\tt GGNS}$, combining the state of the art in differentiable programming, Hamiltonian slice sampling, clustering, mode separation, dynamic nested sampling, and parallelization. This unique combination allows ${\tt GGNS}$ to scale well with dimensionality and perform competitively on a variety of synthetic and real-world problems. We also show the potential of combining nested sampling with generative flow networks to obtain large amounts of high-quality samples from the posterior distribution. This combination leads to faster mode discovery and more accurate estimates of the partition function.
Submitted 6 December, 2023;
originally announced December 2023.
-
aeons: approximating the end of nested sampling
Authors:
Zixiao Hu,
Artem Baryshnikov,
Will Handley
Abstract:
This paper presents analytic results on the anatomy of nested sampling, from which a technique is developed to estimate the run-time of the algorithm that works for any nested sampling implementation. We test these methods on both toy models and true cosmological nested sampling runs. The method gives an order-of-magnitude prediction of the end point at all times, forecasting the true endpoint within standard error around the halfway point.
Submitted 30 November, 2023;
originally announced December 2023.
-
Kernel-, mean- and noise-marginalised Gaussian processes for exoplanet transits and $H_0$ inference
Authors:
Namu Kroupa,
David Yallup,
Will Handley,
Michael Hobson
Abstract:
Using a fully Bayesian approach, Gaussian Process regression is extended to include marginalisation over the kernel choice and kernel hyperparameters. In addition, Bayesian model comparison via the evidence enables direct kernel comparison. The calculation of the joint posterior was implemented with a transdimensional sampler which simultaneously samples over the discrete kernel choice and their hyperparameters by embedding these in a higher-dimensional space, from which samples are taken using nested sampling. Kernel recovery and mean function inference were explored on synthetic data from exoplanet transit light curve simulations. Subsequently, the method was extended to marginalisation over mean functions and noise models and applied to the inference of the present-day Hubble parameter, $H_0$, from real measurements of the Hubble parameter as a function of redshift, derived from the cosmologically model-independent cosmic chronometer and $Λ$CDM-dependent baryon acoustic oscillation observations. The inferred $H_0$ values from the cosmic chronometers, baryon acoustic oscillations and combined datasets are $H_0= 66 \pm 6\, \mathrm{km}\,\mathrm{s}^{-1}\,\mathrm{Mpc}^{-1}$, $H_0= 67 \pm 10\, \mathrm{km}\,\mathrm{s}^{-1}\,\mathrm{Mpc}^{-1}$ and $H_0= 69 \pm 6\, \mathrm{km}\,\mathrm{s}^{-1}\,\mathrm{Mpc}^{-1}$, respectively. The kernel posterior of the cosmic chronometers dataset prefers a non-stationary linear kernel. Finally, the datasets are shown to be not in tension with $\ln R=12.17\pm 0.02$.
Submitted 12 February, 2024; v1 submitted 7 November, 2023;
originally announced November 2023.
-
Piecewise Normalizing Flows
Authors:
Harry Bevins,
Will Handley,
Thomas Gessey-Jones
Abstract:
Normalizing flows are an established approach for modelling complex probability densities through invertible transformations from a base distribution. However, the accuracy with which the target distribution can be captured by the normalizing flow is strongly influenced by the topology of the base distribution. A mismatch between the topology of the target and the base can result in poor performance, as is typically the case for multi-modal problems. Several works have attempted to modify the topology of the base distribution to better match the target, either through the use of Gaussian Mixture Models (Izmailov et al., 2020; Ardizzone et al., 2020; Hagemann & Neumayer, 2021) or learned accept/reject sampling (Stimper et al., 2022). We introduce piecewise normalizing flows which divide the target distribution into clusters, with topologies that better match the standard normal base distribution, and train a series of flows to model complex multi-modal targets. We demonstrate the performance of the piecewise flows using some standard benchmarks and compare the accuracy of the flows to the approach taken in Stimper et al. (2022) for modelling multi-modal distributions. We find that our approach consistently outperforms the approach in Stimper et al. (2022) with a higher emulation accuracy on the standard benchmarks.
Submitted 1 February, 2024; v1 submitted 4 May, 2023;
originally announced May 2023.
-
Nested sampling for physical scientists
Authors:
Greg Ashton,
Noam Bernstein,
Johannes Buchner,
Xi Chen,
Gábor Csányi,
Andrew Fowlie,
Farhan Feroz,
Matthew Griffiths,
Will Handley,
Michael Habeck,
Edward Higson,
Michael Hobson,
Anthony Lasenby,
David Parkinson,
Livia B. Pártay,
Matthew Pitkin,
Doris Schneider,
Joshua S. Speagle,
Leah South,
John Veitch,
Philipp Wacker,
David J. Wales,
David Yallup
Abstract:
We review Skilling's nested sampling (NS) algorithm for Bayesian inference and more broadly multi-dimensional integration. After recapitulating the principles of NS, we survey developments in implementing efficient NS algorithms in practice in high dimensions, including methods for sampling from the so-called constrained prior. We outline the ways in which NS may be applied and describe the application of NS in three scientific fields in which the algorithm has proved to be useful: cosmology, gravitational-wave astronomy, and materials science. We close by making recommendations for best practice when using NS and by summarizing potential limitations and optimizations of NS.
Submitted 31 May, 2022;
originally announced May 2022.
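To make the terms "nested sampling" and "constrained prior" in the abstract above concrete, here is a minimal sketch of the basic algorithm on a one-dimensional toy problem. Everything in it is an illustrative assumption rather than material from the review: the Gaussian likelihood centred at 0.5, the uniform prior on (0, 1), the function names, and the use of the deterministic approximation $X_i = e^{-i/n}$ for the prior volume. Because the toy likelihood is symmetric and unimodal, the constrained prior is an interval and can be sampled exactly, which is not possible in realistic problems.

```python
import math
import random


def log_like(x, s=0.05):
    """Toy Gaussian log-likelihood centred at 0.5 (illustrative assumption)."""
    return -0.5 * ((x - 0.5) / s) ** 2


def nested_sampling(n_live=500, n_iter=6000, seed=0):
    """Estimate Z = integral of L(x) over a Uniform(0, 1) prior."""
    rng = random.Random(seed)
    live_x = [rng.random() for _ in range(n_live)]
    live_logl = [log_like(x) for x in live_x]
    z = 0.0
    x_prev = 1.0  # prior volume enclosed before the first deletion
    for i in range(1, n_iter + 1):
        # Find the live point with the lowest likelihood.
        worst = min(range(n_live), key=live_logl.__getitem__)
        # Deterministic estimate of the prior volume after i deletions.
        x_i = math.exp(-i / n_live)
        z += math.exp(live_logl[worst]) * (x_prev - x_i)
        x_prev = x_i
        # Constrained prior: uniform over points with L > L_worst, which
        # for this toy likelihood is the interval |x - 0.5| < r.
        r = abs(live_x[worst] - 0.5)
        live_x[worst] = 0.5 + r * (2.0 * rng.random() - 1.0)
        live_logl[worst] = log_like(live_x[worst])
    # Add the contribution of the remaining live points.
    z += x_prev * sum(math.exp(l) for l in live_logl) / n_live
    return z


if __name__ == "__main__":
    z_true = 0.05 * math.sqrt(2.0 * math.pi)  # tails outside (0, 1) are negligible
    print("estimate:", nested_sampling(), "analytic:", z_true)
```

The estimate scatters around the analytic value from run to run; the scatter in $\ln Z$ shrinks roughly as the inverse square root of the number of live points.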
-
Split personalities in Bayesian Neural Networks: the case for full marginalisation
Authors:
David Yallup,
Will Handley,
Mike Hobson,
Anthony Lasenby,
Pablo Lemos
Abstract:
The true posterior distribution of a Bayesian neural network is massively multimodal. Whilst most of these modes are functionally equivalent, we demonstrate that there remains a level of real multimodality that manifests in even the simplest neural network setups. It is only by fully marginalising over all posterior modes, using appropriate Bayesian sampling tools, that we can capture the split personalities of the network. The ability of a network trained in this manner to reason between multiple candidate solutions dramatically improves the generalisability of the model, a feature we contend is not consistently captured by alternative approaches to the training of Bayesian neural networks. We provide a concise minimal example of this, which can provide lessons and a future path forward for correctly utilising the explainability and interpretability of Bayesian neural networks.
Submitted 23 May, 2022;
originally announced May 2022.
-
Nested sampling for frequentist computation: fast estimation of small $p$-values
Authors:
Andrew Fowlie,
Sebastian Hoof,
Will Handley
Abstract:
We propose a novel method for computing $p$-values based on nested sampling (NS) applied to the sampling space rather than the parameter space of the problem, in contrast to its usage in Bayesian computation. The computational cost of NS scales as $\log^2(1/p)$, which compares favorably to the $1/p$ scaling for Monte Carlo (MC) simulations. For significances greater than about $4σ$ in both a toy problem and a simplified resonance search, we show that NS requires orders of magnitude fewer simulations than ordinary MC estimates. This is particularly relevant for high-energy physics, which adopts a $5σ$ gold standard for discovery. We conclude with remarks on new connections between Bayesian and frequentist computation and possibilities for tuning NS implementations for still better performance in this setting.
Submitted 13 January, 2022; v1 submitted 27 May, 2021;
originally announced May 2021.
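As a rough illustration of the two scalings quoted in the abstract above, the sketch below evaluates $1/p$ and $\log^2(1/p)$ at a few significance levels. It compares the scaling terms only: prefactors are omitted, so the numbers are not simulation counts. The one-sided Gaussian convention for converting a significance into a $p$-value, the natural logarithm, and the function names are assumptions made here, not details taken from the paper.

```python
import math


def one_sided_p(z):
    """One-sided Gaussian tail probability for a significance of z sigma."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def cost_scalings(p):
    """Return the (1/p, log^2(1/p)) scaling terms, with prefactors omitted."""
    mc = 1.0 / p
    ns = math.log(1.0 / p) ** 2
    return mc, ns


if __name__ == "__main__":
    for z in (3, 4, 5):
        p = one_sided_p(z)
        mc, ns = cost_scalings(p)
        print(f"{z} sigma: p = {p:.3g}, 1/p = {mc:.3g}, log^2(1/p) = {ns:.3g}")
```

At $5σ$ the two terms differ by roughly four orders of magnitude, which is in line with the abstract's statement that the gain becomes large at high significance.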
-
Nested sampling with any prior you like
Authors:
Justin Alsing,
Will Handley
Abstract:
Nested sampling is an important tool for conducting Bayesian analysis in astronomy and other fields, both for sampling complicated posterior distributions for parameter inference, and for computing marginal likelihoods for model comparison. One technical obstacle to using nested sampling in practice is the requirement (for most common implementations) that prior distributions be provided in the form of transformations from the unit hyper-cube to the target prior density. For many applications, particularly when using the posterior from one experiment as the prior for another, such a transformation is not readily available. In this letter we show that parametric bijectors trained on samples from a desired prior density provide a general-purpose method for constructing transformations from the uniform base density to a target prior, enabling the practical use of nested sampling under arbitrary priors. We demonstrate the use of trained bijectors in conjunction with nested sampling on a number of examples from cosmology.
Submitted 28 June, 2021; v1 submitted 24 February, 2021;
originally announced February 2021.
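For readers unfamiliar with the unit hyper-cube requirement mentioned in the abstract above, the following sketch shows the simple case where the transformation is available analytically: each coordinate is pushed through the inverse CDF of its prior. The two-parameter prior (one uniform, one Gaussian) and the name `prior_transform` are illustrative assumptions. The trained bijectors proposed in the letter address the case where no such closed form exists, and are not shown here.

```python
from statistics import NormalDist

# Hypothetical independent priors: x ~ Uniform(-5, 5), y ~ Normal(0, 2).
_Y_PRIOR = NormalDist(mu=0.0, sigma=2.0)


def prior_transform(u):
    """Map a point u in the unit square to a draw (x, y) from the prior.

    Each coordinate of u must lie strictly between 0 and 1.
    """
    x = -5.0 + 10.0 * u[0]       # inverse CDF of Uniform(-5, 5)
    y = _Y_PRIOR.inv_cdf(u[1])   # inverse CDF of Normal(0, 2)
    return [x, y]


if __name__ == "__main__":
    print(prior_transform([0.5, 0.5]))
    print(prior_transform([0.25, 0.975]))
```

Feeding uniform random numbers through such a function yields samples from the prior, which is the role the transformation plays in the implementations the abstract refers to.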
-
Nested sampling with plateaus
Authors:
Andrew Fowlie,
Will Handley,
Liangliang Su
Abstract:
It was recently emphasised by Riley (2019) and Schittenhelm & Wacker (2020) that, in the presence of plateaus in the likelihood function, nested sampling (NS) produces faulty estimates of the evidence and posterior densities. After informally explaining the cause of the problem, we present a modified version of NS that handles plateaus and can be applied retrospectively to NS runs from popular NS software using anesthetic. In the modified NS, live points in a plateau are evicted one by one without replacement, with ordinary NS compression of the prior volume after each eviction but taking into account the dynamic number of live points. The live points are replenished once all points in the plateau are removed. We demonstrate it on a number of examples. Since the modification is simple, we propose that it becomes the canonical version of Skilling's NS algorithm.
Submitted 24 February, 2021; v1 submitted 26 October, 2020;
originally announced October 2020.
-
Nested sampling cross-checks using order statistics
Authors:
Andrew Fowlie,
Will Handley,
Liangliang Su
Abstract:
Nested sampling (NS) is an invaluable tool in data analysis in modern astrophysics, cosmology, gravitational wave astronomy and particle physics. We identify a previously unused property of NS related to order statistics: the insertion indexes of new live points into the existing live points should be uniformly distributed. This observation enabled us to create a novel cross-check of single NS runs. The tests can detect both failures to sample new live points from the constrained prior and plateaus in the likelihood function, either of which breaks an assumption of NS and thus leads to unreliable results. We applied our cross-check to NS runs on toy functions with known analytic results in 2 to 50 dimensions, showing that our approach can detect problematic runs on a variety of likelihoods, settings and dimensions. As an example of a realistic application, we cross-checked NS runs performed in the context of cosmological model selection. Since the cross-check is simple, we recommend that it become a mandatory test for every applicable NS run.
Submitted 23 August, 2020; v1 submitted 5 June, 2020;
originally announced June 2020.
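The uniformity property stated in the abstract above can be illustrated with a toy run carried out directly in prior-volume space, where a correct sampler draws each new point uniformly below the volume of the point it replaces. The sketch below is an assumption-laden illustration, not the paper's procedure: the `shrink` parameter is a hypothetical way to mimic a faulty sampler that only explores an inner fraction of the constrained prior, and the distance computed is a plain Kolmogorov-Smirnov-style statistic against the discrete uniform distribution, without a calibrated $p$-value.

```python
import bisect
import random


def insertion_indexes(n_live, n_iter, rng, shrink=1.0):
    """Toy NS run in prior-volume space; returns the insertion indexes.

    shrink=1.0 samples the constrained prior correctly; shrink<1.0 is a
    hypothetical fault that only samples an inner fraction of it.
    """
    live = sorted(rng.random() for _ in range(n_live))  # ascending volume X
    indexes = []
    for _ in range(n_iter):
        x_worst = live.pop()  # largest X corresponds to the lowest likelihood
        x_new = rng.random() * x_worst * shrink
        idx = bisect.bisect_left(live, x_new)  # rank among remaining points
        indexes.append(idx)
        live.insert(idx, x_new)
    return indexes


def ks_distance_to_uniform(indexes, n_live):
    """Largest gap between the empirical CDF of the indexes and the CDF of
    the discrete uniform distribution on 0 .. n_live - 1."""
    counts = [0] * n_live
    for i in indexes:
        counts[i] += 1
    total = len(indexes)
    cumulative = 0
    distance = 0.0
    for k, c in enumerate(counts):
        cumulative += c
        distance = max(distance, abs(cumulative / total - (k + 1) / n_live))
    return distance


if __name__ == "__main__":
    rng = random.Random(1)
    good = insertion_indexes(50, 5000, rng)
    bad = insertion_indexes(50, 5000, rng, shrink=0.5)
    print("correct sampler:", ks_distance_to_uniform(good, 50))
    print("faulty sampler: ", ks_distance_to_uniform(bad, 50))
```

With a correct sampler the distance stays small and decreases as more iterations are accumulated, whereas the faulty sampler piles its insertion indexes up at low ranks and the distance remains large.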
-
Compromise-free Bayesian neural networks
Authors:
Kamran Javid,
Will Handley,
Mike Hobson,
Anthony Lasenby
Abstract:
We conduct a thorough analysis of the relationship between the out-of-sample performance and the Bayesian evidence (marginal likelihood) of Bayesian neural networks (BNNs), as well as looking at the performance of ensembles of BNNs, both using the Boston housing dataset. Using the state-of-the-art in nested sampling, we numerically sample the full (non-Gaussian and multimodal) network posterior and obtain numerical estimates of the Bayesian evidence, considering network models with up to 156 trainable parameters. The networks have between zero and four hidden layers, either $\tanh$ or $ReLU$ activation functions, and with and without hierarchical priors. The ensembles of BNNs are obtained by determining the posterior distribution over networks, from the posterior samples of individual BNNs re-weighted by the associated Bayesian evidence values. There is good correlation between out-of-sample performance and evidence, as well as a remarkable symmetry between the evidence versus model size and out-of-sample performance versus model size planes. Networks with $ReLU$ activation functions have consistently higher evidences than those with $\tanh$ functions, and this is reflected in their out-of-sample performance. Ensembling over architectures acts to further improve performance relative to the individual BNNs.
Submitted 13 June, 2020; v1 submitted 25 April, 2020;
originally announced April 2020.
-
Bayesian sparse reconstruction: a brute-force approach to astronomical imaging and machine learning
Authors:
Edward Higson,
Will Handley,
Michael Hobson,
Anthony Lasenby
Abstract:
We present a principled Bayesian framework for signal reconstruction, in which the signal is modelled by basis functions whose number (and form, if required) is determined by the data themselves. This approach is based on a Bayesian interpretation of conventional sparse reconstruction and regularisation techniques, in which sparsity is imposed through priors via Bayesian model selection. We demonstrate our method for noisy 1- and 2-dimensional signals, including astronomical images. Furthermore, by using a product-space approach, the number and type of basis functions can be treated as integer parameters and their posterior distributions sampled directly. We show that order-of-magnitude increases in computational efficiency are possible from this technique compared to calculating the Bayesian evidences separately, and that further computational gains are possible using it in combination with dynamic nested sampling. Our approach can also be readily applied to neural networks, where it allows the network architecture to be determined by the data in a principled Bayesian manner by treating the number of nodes and hidden layers as parameters.
Submitted 25 November, 2018; v1 submitted 12 September, 2018;
originally announced September 2018.
-
nestcheck: diagnostic tests for nested sampling calculations
Authors:
Edward Higson,
Will Handley,
Mike Hobson,
Anthony Lasenby
Abstract:
Nested sampling is an increasingly popular technique for Bayesian computation, in particular for multimodal, degenerate problems of moderate to high dimensionality. Without appropriate settings, however, nested sampling software may fail to explore such posteriors correctly; for example producing correlated samples or missing important modes. This paper introduces new diagnostic tests to assess the reliability both of parameter estimation and evidence calculations using nested sampling software, and demonstrates them empirically. We present two new diagnostic plots for nested sampling, and give practical advice for nested sampling software users in astronomy and beyond. Our diagnostic tests and diagrams are implemented in nestcheck: a publicly available Python package for analysing nested sampling calculations, which is compatible with output from MultiNest, PolyChord and dyPolyChord.
Submitted 6 October, 2018; v1 submitted 16 April, 2018;
originally announced April 2018.
-
Dynamic nested sampling: an improved algorithm for parameter estimation and evidence calculation
Authors:
Edward Higson,
Will Handley,
Mike Hobson,
Anthony Lasenby
Abstract:
We introduce dynamic nested sampling: a generalisation of the nested sampling algorithm in which the number of "live points" varies to allocate samples more efficiently. In empirical tests the new method significantly improves calculation accuracy compared to standard nested sampling with the same number of samples; this increase in accuracy is equivalent to speeding up the computation by factors of up to ~72 for parameter estimation and ~7 for evidence calculations. We also show that the accuracy of both parameter estimation and evidence calculations can be improved simultaneously. In addition, unlike in standard nested sampling, more accurate results can be obtained by continuing the calculation for longer. Popular standard nested sampling implementations can be easily adapted to perform dynamic nested sampling, and several dynamic nested sampling software packages are now publicly available.
Submitted 7 October, 2018; v1 submitted 11 April, 2017;
originally announced April 2017.
-
Sampling Errors in Nested Sampling Parameter Estimation
Authors:
Edward Higson,
Will Handley,
Mike Hobson,
Anthony Lasenby
Abstract:
Sampling errors in nested sampling parameter estimation differ from those in Bayesian evidence calculation, but have been little studied in the literature. This paper provides the first explanation of the two main sources of sampling errors in nested sampling parameter estimation, and presents a new diagrammatic representation for the process. We find no current method can accurately measure the parameter estimation errors of a single nested sampling run, and propose a method for doing so using a new algorithm for dividing nested sampling runs. We empirically verify our conclusions and the accuracy of our new method.
Submitted 11 January, 2018; v1 submitted 28 March, 2017;
originally announced March 2017.