-
Deep learning insights into non-universality in the halo mass function
Authors:
Ningyuan Guo,
Luisa Lucie-Smith,
Hiranya V. Peiris,
Andrew Pontzen,
Davide Piras
Abstract:
The abundance of dark matter haloes is a key cosmological probe in forthcoming galaxy surveys. The theoretical understanding of the halo mass function (HMF) is limited by our incomplete knowledge of the origin of non-universality and its cosmological parameter dependence. We present a deep learning model which compresses the linear matter power spectrum into three independent factors which are necessary and sufficient to describe the $z=0$ HMF from the state-of-the-art AEMULUS emulator to sub-per cent accuracy in a $w$CDM$+N_\mathrm{eff}$ parameter space. Additional information about growth history does not improve the accuracy of HMF predictions if the matter power spectrum is already provided as input, because required aspects of the former can be inferred from the latter. The three factors carry information about the universal and non-universal aspects of the HMF, which we interrogate via the information-theoretic measure of mutual information. We find that non-universality is captured by recent growth history after matter-dark-energy equality and $N_\mathrm{eff}$ for $M\sim 10^{13} \, \mathrm{M_\odot}\, h^{-1}$ haloes, and by $Ω_{\rm m}$ for $M\sim 10^{15} \, \mathrm{M_\odot}\, h^{-1}$. The compact representation learnt by our model can inform the design of emulator training sets to achieve high emulator accuracy with fewer simulations.
Submitted 24 May, 2024;
originally announced May 2024.
-
The future of cosmological likelihood-based inference: accelerated high-dimensional parameter estimation and model comparison
Authors:
Davide Piras,
Alicja Polanska,
Alessio Spurio Mancini,
Matthew A. Price,
Jason D. McEwen
Abstract:
We advocate for a new paradigm of cosmological likelihood-based inference, leveraging recent developments in machine learning and its underlying technology, to accelerate Bayesian inference in high-dimensional settings. Specifically, we combine (i) emulation, where a machine learning model is trained to mimic cosmological observables, e.g. CosmoPower-JAX; (ii) differentiable and probabilistic programming, e.g. JAX and NumPyro, respectively; (iii) scalable Markov chain Monte Carlo (MCMC) sampling techniques that exploit gradients, e.g. Hamiltonian Monte Carlo; and (iv) decoupled and scalable Bayesian model selection techniques that compute the Bayesian evidence purely from posterior samples, e.g. the learned harmonic mean implemented in harmonic. This paradigm allows us to carry out a complete Bayesian analysis, including both parameter estimation and model selection, in a fraction of the time of traditional approaches. First, we demonstrate the application of this paradigm on a simulated cosmic shear analysis for a Stage IV survey in 37- and 39-dimensional parameter spaces, comparing $Λ$CDM and a dynamical dark energy model ($w_0w_a$CDM). We recover posterior contours and evidence estimates that are in excellent agreement with those computed by the traditional nested sampling approach while reducing the computational cost from 8 months on 48 CPU cores to 2 days on 12 GPUs. Second, we consider a joint analysis between three simulated next-generation surveys, each performing a 3x2pt analysis, resulting in 157- and 159-dimensional parameter spaces. Standard nested sampling techniques are simply not feasible in this high-dimensional setting, requiring a projected 12 years of compute time on 48 CPU cores; on the other hand, the proposed approach only requires 8 days of compute time on 24 GPUs. All packages used in our analyses are publicly available.
Submitted 21 May, 2024;
originally announced May 2024.
-
Learned harmonic mean estimation of the Bayesian evidence with normalizing flows
Authors:
Alicja Polanska,
Matthew A. Price,
Davide Piras,
Alessio Spurio Mancini,
Jason D. McEwen
Abstract:
We present the learned harmonic mean estimator with normalizing flows - a robust, scalable and flexible estimator of the Bayesian evidence for model comparison. Since the estimator is agnostic to sampling strategy and simply requires posterior samples, it can be applied to compute the evidence using any Markov chain Monte Carlo (MCMC) sampling technique, including saved down MCMC chains, or any variational inference approach. The learned harmonic mean estimator was recently introduced, where machine learning techniques were developed to learn a suitable internal importance sampling target distribution to solve the issue of exploding variance of the original harmonic mean estimator. In this article we present the use of normalizing flows as the internal machine learning technique within the learned harmonic mean estimator. Normalizing flows can be elegantly coupled with the learned harmonic mean to provide an approach that is more robust, flexible and scalable than the machine learning models considered previously. We perform a series of numerical experiments, applying our method to benchmark problems and to a cosmological example in up to 21 dimensions. We find the learned harmonic mean estimator is in agreement with ground truth values and nested sampling estimates. The open-source harmonic Python package implementing the learned harmonic mean, now with normalizing flows included, is publicly available.
Submitted 9 May, 2024;
originally announced May 2024.
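The idea of computing the evidence purely from posterior samples can be illustrated with a minimal sketch. The toy problem, the function names and the fixed Gaussian target below are assumptions made for illustration only: the abstract describes learning the target with normalizing flows, whereas this sketch swaps in a hand-picked Gaussian that is narrower than the posterior, and it does not use the harmonic package.

```python
import math
import random

PRIOR_HALF_WIDTH = 10.0
LOG_PRIOR = -math.log(2.0 * PRIOR_HALF_WIDTH)  # uniform prior on [-10, 10]


def log_likelihood(theta):
    # Toy unnormalised Gaussian likelihood (an assumption for illustration).
    return -0.5 * theta * theta


def log_target(theta, sigma=0.5):
    # Normalised Gaussian importance target, deliberately narrower than the
    # posterior so that target / (likelihood * prior) stays bounded.
    return -0.5 * (theta / sigma) ** 2 - math.log(sigma * math.sqrt(2.0 * math.pi))


def harmonic_mean_evidence(samples):
    # Re-targeted harmonic mean: 1/z is estimated by the posterior average of
    # target(theta) / (likelihood(theta) * prior(theta)).
    ratios = [
        math.exp(log_target(t) - log_likelihood(t) - LOG_PRIOR) for t in samples
    ]
    return len(ratios) / sum(ratios)


rng = random.Random(0)
# The toy posterior is, to excellent approximation, a unit Gaussian, so we
# draw from it directly in place of running an MCMC sampler.
posterior_samples = [rng.gauss(0.0, 1.0) for _ in range(200_000)]

z_estimate = harmonic_mean_evidence(posterior_samples)
z_exact = math.sqrt(2.0 * math.pi) / (2.0 * PRIOR_HALF_WIDTH)
print(f"estimated evidence: {z_estimate:.5f}, exact: {z_exact:.5f}")
```

Choosing a target with lighter tails than the posterior is what keeps the variance of the ratios finite in this toy case; with the prior itself as the target, the estimator reduces to the original harmonic mean.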
-
A representation learning approach to probe for dynamical dark energy in matter power spectra
Authors:
Davide Piras,
Lucas Lombriser
Abstract:
We present DE-VAE, a variational autoencoder (VAE) architecture to search for a compressed representation of dynamical dark energy (DE) models in observational studies of the cosmic large-scale structure. DE-VAE is trained on matter power spectra boosts generated at wavenumbers $k\in(0.01-2.5) \ h/\rm{Mpc}$ and at four redshift values $z\in(0.1,0.48,0.78,1.5)$ for the most typical dynamical DE parametrization with two extra parameters describing an evolving DE equation of state. The boosts are compressed to a lower-dimensional representation, which is concatenated with standard cold dark matter (CDM) parameters and then mapped back to reconstructed boosts; both the compression and the reconstruction components are parametrized as neural networks. Remarkably, we find that a single latent parameter is sufficient to predict 95% (99%) of DE power spectra generated over a broad range of cosmological parameters within $1σ$ ($2σ$) of a Gaussian error which includes cosmic variance, shot noise and systematic effects for a Stage IV-like survey. This single parameter shows a high mutual information with the two DE parameters, and these three variables can be linked together with an explicit equation through symbolic regression. Considering a model with two latent variables only marginally improves the accuracy of the predictions, and adding a third latent variable has no significant impact on the model's performance. We discuss how the DE-VAE architecture can be extended from a proof of concept to a general framework to be employed in the search for a common lower-dimensional parametrization of a wide range of beyond-$Λ$CDM models and for different cosmological datasets. Such a framework could then both inform the development of cosmological surveys by targeting optimal probes, and provide theoretical insight into the common phenomenological aspects of beyond-$Λ$CDM models.
Submitted 16 October, 2023;
originally announced October 2023.
-
CosmoPower-JAX: high-dimensional Bayesian inference with differentiable cosmological emulators
Authors:
D. Piras,
A. Spurio Mancini
Abstract:
We present CosmoPower-JAX, a JAX-based implementation of the CosmoPower framework, which accelerates cosmological inference by building neural emulators of cosmological power spectra. We show how, using the automatic differentiation, batch evaluation and just-in-time compilation features of JAX, and running the inference pipeline on graphics processing units (GPUs), parameter estimation can be accelerated by orders of magnitude with advanced gradient-based sampling techniques. These can be used to efficiently explore high-dimensional parameter spaces, such as those needed for the analysis of next-generation cosmological surveys. We showcase the accuracy and computational efficiency of CosmoPower-JAX on two simulated Stage IV configurations. We first consider a single survey performing a cosmic shear analysis totalling 37 model parameters. We validate the contours derived with CosmoPower-JAX and a Hamiltonian Monte Carlo sampler against those derived with a nested sampler and without emulators, obtaining a speed-up factor of $\mathcal{O}(10^3)$. We then consider a combination of three Stage IV surveys, each performing a joint cosmic shear and galaxy clustering (3x2pt) analysis, for a total of 157 model parameters. Even with such a high-dimensional parameter space, CosmoPower-JAX provides converged posterior contours in 3 days, as opposed to the estimated 6 years required by standard methods. CosmoPower-JAX is fully written in Python, and we make it publicly available to help the cosmological community meet the accuracy requirements set by next-generation surveys.
Submitted 22 June, 2023; v1 submitted 10 May, 2023;
originally announced May 2023.
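As a rough illustration of gradient-based sampling of the kind mentioned in the abstract, the sketch below runs a basic Hamiltonian Monte Carlo sampler on a one-dimensional standard normal. Everything here is an assumption for illustration: the target, the step size and the function names are hypothetical, the gradient is written by hand rather than obtained by automatic differentiation, and no CosmoPower-JAX or JAX calls are used.

```python
import math
import random


def neg_log_post(q):
    # Toy target: standard normal, standing in for a real posterior.
    return 0.5 * q * q


def grad_neg_log_post(q):
    # Analytic gradient; an autodiff framework would supply this automatically.
    return q


def hmc_step(q, rng, step_size=0.2, n_leapfrog=10):
    # Draw a fresh momentum and record the starting energy.
    p = rng.gauss(0.0, 1.0)
    h_start = neg_log_post(q) + 0.5 * p * p

    # Leapfrog integration: half momentum step, alternating full steps,
    # then a final half momentum step.
    q_new, p_new = q, p
    p_new -= 0.5 * step_size * grad_neg_log_post(q_new)
    for i in range(n_leapfrog):
        q_new += step_size * p_new
        if i < n_leapfrog - 1:
            p_new -= step_size * grad_neg_log_post(q_new)
    p_new -= 0.5 * step_size * grad_neg_log_post(q_new)

    # Metropolis correction for the integration error.
    h_end = neg_log_post(q_new) + 0.5 * p_new * p_new
    if rng.random() < math.exp(min(0.0, h_start - h_end)):
        return q_new
    return q


def run_chain(n_steps, seed):
    rng = random.Random(seed)
    q = 0.0
    samples = []
    for _ in range(n_steps):
        q = hmc_step(q, rng)
        samples.append(q)
    return samples


chain = run_chain(20_000, seed=1)
chain_mean = sum(chain) / len(chain)
chain_var = sum((s - chain_mean) ** 2 for s in chain) / len(chain)
print(f"mean: {chain_mean:.3f}, variance: {chain_var:.3f}")
```

In a high-dimensional setting the same structure applies with vector-valued positions and momenta; the cost is dominated by gradient evaluations, which is where a differentiable emulator would help.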
-
A robust estimator of mutual information for deep learning interpretability
Authors:
Davide Piras,
Hiranya V. Peiris,
Andrew Pontzen,
Luisa Lucie-Smith,
Ningyuan Guo,
Brian Nord
Abstract:
We develop the use of mutual information (MI), a well-established metric in information theory, to interpret the inner workings of deep learning models. To accurately estimate MI from a finite number of samples, we present GMM-MI (pronounced "Jimmie"), an algorithm based on Gaussian mixture models that can be applied to both discrete and continuous settings. GMM-MI is computationally efficient, robust to the choice of hyperparameters and provides the uncertainty on the MI estimate due to the finite sample size. We extensively validate GMM-MI on toy data for which the ground truth MI is known, comparing its performance against established mutual information estimators. We then demonstrate the use of our MI estimator in the context of representation learning, working with synthetic data and physical datasets describing highly non-linear processes. We train deep learning models to encode high-dimensional data within a meaningful compressed (latent) representation, and use GMM-MI to quantify both the level of disentanglement between the latent variables, and their association with relevant physical quantities, thus unlocking the interpretability of the latent representation. We make GMM-MI publicly available.
Submitted 23 March, 2023; v1 submitted 31 October, 2022;
originally announced November 2022.
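The following minimal sketch shows what validating an MI estimate against a known ground truth can look like. It is not the GMM-MI package or its API: it assumes the simplest possible case, a single bivariate Gaussian (a one-component mixture), for which the MI is $-\frac{1}{2}\ln(1-\rho^2)$ nats, and it uses bootstrap resampling as one common way to attach a finite-sample uncertainty. All names are hypothetical.

```python
import math
import random


def sample_correlated_pairs(n, rho, rng):
    # Draw n pairs from a bivariate Gaussian with unit variances and correlation rho.
    pairs = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        y = rho * x + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        pairs.append((x, y))
    return pairs


def gaussian_mi(pairs):
    # Fit a single bivariate Gaussian and return its MI in nats:
    # MI = -0.5 * ln(1 - r^2), with r the sample correlation coefficient.
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    r_squared = sxy * sxy / (sxx * syy)
    return -0.5 * math.log(1.0 - r_squared)


def bootstrap_mi(pairs, n_boot, rng):
    # Resample the pairs with replacement to gauge the finite-sample scatter.
    estimates = []
    for _ in range(n_boot):
        resample = [pairs[rng.randrange(len(pairs))] for _ in pairs]
        estimates.append(gaussian_mi(resample))
    mean = sum(estimates) / n_boot
    std = math.sqrt(sum((e - mean) ** 2 for e in estimates) / (n_boot - 1))
    return mean, std


rng = random.Random(7)
RHO = 0.8
pairs = sample_correlated_pairs(5_000, RHO, rng)
mi_mean, mi_std = bootstrap_mi(pairs, n_boot=50, rng=rng)
mi_true = -0.5 * math.log(1.0 - RHO * RHO)
print(f"MI = {mi_mean:.3f} +/- {mi_std:.3f} nats (truth: {mi_true:.3f})")
```

A single Gaussian cannot capture non-linear dependence, which is the motivation for using mixtures with several components in practice.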
-
Fast and realistic large-scale structure from machine-learning-augmented random field simulations
Authors:
Davide Piras,
Benjamin Joachimi,
Francisco Villaescusa-Navarro
Abstract:
Producing thousands of simulations of the dark matter distribution in the Universe with increasing precision is a challenging but critical task to facilitate the exploitation of current and forthcoming cosmological surveys. Many inexpensive substitutes to full $N$-body simulations have been proposed, even though they often fail to reproduce the statistics of the smaller, non-linear scales. Among these alternatives, a common approximation is represented by the lognormal distribution, which comes with its own limitations as well, while being extremely fast to compute even for high-resolution density fields. In this work, we train a generative deep learning model, mainly made of convolutional layers, to transform projected lognormal dark matter density fields to more realistic dark matter maps, as obtained from full $N$-body simulations. We detail the procedure that we follow to generate highly correlated pairs of lognormal and simulated maps, which we use as our training data, exploiting the information of the Fourier phases. We demonstrate the performance of our model comparing various statistical tests with different field resolutions, redshifts and cosmological parameters, proving its robustness and explaining its current limitations. When evaluated on 100 test maps, the augmented lognormal random fields reproduce the power spectrum up to wavenumbers of $1 \ h \ \rm{Mpc}^{-1}$, and the bispectrum within 10%, and always within the error bars, of the fiducial target simulations. Finally, we describe how we plan to integrate our proposed model with existing tools to yield more accurate spherical random fields for weak lensing analysis.
Submitted 1 February, 2023; v1 submitted 16 May, 2022;
originally announced May 2022.
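For readers unfamiliar with the lognormal approximation mentioned in the abstract, the sketch below shows the standard pointwise transform from a Gaussian field $g$ with variance $\sigma^2$ to a density contrast $\delta = \exp(g - \sigma^2/2) - 1$, which has zero mean and satisfies $\delta > -1$. The pixel values here are uncorrelated draws chosen for brevity, which is an assumption: a realistic field would be generated with a prescribed power spectrum, and this is not the paper's pipeline.

```python
import math
import random


def lognormal_contrast(gaussian_values, sigma):
    # delta = exp(g - sigma^2 / 2) - 1 has zero mean and is bounded below by -1.
    shift = 0.5 * sigma * sigma
    return [math.exp(g - shift) - 1.0 for g in gaussian_values]


rng = random.Random(42)
SIGMA = 0.5  # standard deviation of the underlying Gaussian field (assumed)

# Uncorrelated Gaussian "pixels", standing in for a correlated random field.
gaussian_pixels = [rng.gauss(0.0, SIGMA) for _ in range(100_000)]
delta = lognormal_contrast(gaussian_pixels, SIGMA)

mean_delta = sum(delta) / len(delta)
min_delta = min(delta)
print(f"mean contrast: {mean_delta:.4f}, minimum contrast: {min_delta:.4f}")
```

The transform is cheap because it acts on each pixel independently, which is consistent with the abstract's remark that lognormal fields are fast to compute even at high resolution.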
-
Discovering the building blocks of dark matter halo density profiles with neural networks
Authors:
Luisa Lucie-Smith,
Hiranya V. Peiris,
Andrew Pontzen,
Brian Nord,
Jeyan Thiyagalingam,
Davide Piras
Abstract:
The density profiles of dark matter halos are typically modeled using empirical formulae fitted to the density profiles of relaxed halo populations. We present a neural network model that is trained to learn the mapping from the raw density field containing each halo to the dark matter density profile. We show that the model recovers the widely-used Navarro-Frenk-White (NFW) profile out to the virial radius, and can additionally describe the variability in the outer profile of the halos. The neural network architecture consists of a supervised encoder-decoder framework, which first compresses the density inputs into a low-dimensional latent representation, and then outputs $ρ(r)$ for any desired value of radius $r$. The latent representation contains all the information used by the model to predict the density profiles. This allows us to interpret the latent representation by quantifying the mutual information between the representation and the halos' ground-truth density profiles. A two-dimensional representation is sufficient to accurately model the density profiles up to the virial radius; however, a three-dimensional representation is required to describe the outer profiles beyond the virial radius. The additional dimension in the representation contains information about the infalling material in the outer profiles of dark matter halos, thus discovering the splashback boundary of halos without prior knowledge of the halos' dynamical history.
Submitted 13 May, 2022; v1 submitted 16 March, 2022;
originally announced March 2022.
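As a reference point for the NFW profile named in the abstract, the sketch below evaluates the standard two-parameter form $\rho(r) = \rho_s / [(r/r_s)(1 + r/r_s)^2]$ and its logarithmic slope. The function names and the unit choices are assumptions for illustration; the neural network model itself is not reproduced here.

```python
def nfw_density(r, rho_s, r_s):
    # Navarro-Frenk-White profile: rho_s / (x * (1 + x)^2), with x = r / r_s.
    x = r / r_s
    return rho_s / (x * (1.0 + x) ** 2)


def nfw_log_slope(r, r_s):
    # d ln(rho) / d ln(r) = -(1 + 3x) / (1 + x): tends to -1 at small radii
    # and to -3 at large radii.
    x = r / r_s
    return -(1.0 + 3.0 * x) / (1.0 + x)


# Example with arbitrary units: characteristic density 4, scale radius 1.
for radius in (0.01, 1.0, 100.0):
    print(radius, nfw_density(radius, rho_s=4.0, r_s=1.0), nfw_log_slope(radius, r_s=1.0))
```

The two free parameters of this form match the abstract's finding that a two-dimensional representation suffices inside the virial radius, although the latent variables learnt by the model need not coincide with $\rho_s$ and $r_s$.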
-
Towards Machine Learning-Based Meta-Studies: Applications to Cosmological Parameters
Authors:
Tom Crossland,
Pontus Stenetorp,
Daisuke Kawata,
Sebastian Riedel,
Thomas D. Kitching,
Anurag Deshpande,
Tom Kimpson,
Choong Ling Liew-Cain,
Christian Pedersen,
Davide Piras,
Monu Sharma
Abstract:
We develop a new model for automatic extraction of reported measurement values from the astrophysical literature, utilising modern Natural Language Processing techniques. We use this model to extract measurements present in the abstracts of the approximately 248,000 astrophysics articles from the arXiv repository, yielding a database containing over 231,000 astrophysical numerical measurements. Furthermore, we present an online interface (Numerical Atlas) to allow users to query and explore this database, based on parameter names and symbolic representations, and download the resulting datasets for their own research uses. To illustrate potential use cases we then collect values for nine different cosmological parameters using this tool. From these results we can clearly observe the historical trends in the reported values of these quantities over the past two decades, and see the impacts of landmark publications on our understanding of cosmology.
Submitted 1 July, 2021;
originally announced July 2021.
-
COSMOPOWER: emulating cosmological power spectra for accelerated Bayesian inference from next-generation surveys
Authors:
A. Spurio Mancini,
D. Piras,
J. Alsing,
B. Joachimi,
M. P. Hobson
Abstract:
We present $\it{CosmoPower}$, a suite of neural cosmological power spectrum emulators providing orders-of-magnitude acceleration for parameter estimation from two-point statistics analyses of Large-Scale Structure (LSS) and Cosmic Microwave Background (CMB) surveys. The emulators replace the computation of matter and CMB power spectra from Boltzmann codes; thus, they do not need to be re-trained for different choices of astrophysical nuisance parameters or redshift distributions. The matter power spectrum emulation error is less than $0.4\%$ in the wavenumber range $k \in [10^{-5}, 10] \, \mathrm{Mpc}^{-1}$, for redshift $z \in [0, 5]$. $\it{CosmoPower}$ emulates CMB temperature, polarisation and lensing potential power spectra in the $5σ$ region of parameter space around the $\it{Planck}$ best fit values with an error $\lesssim 10\%$ of the expected shot noise for the forthcoming Simons Observatory. $\it{CosmoPower}$ is showcased on a joint cosmic shear and galaxy clustering analysis from the Kilo-Degree Survey, as well as on a Stage IV $\it{Euclid}$-like simulated cosmic shear analysis. For the CMB case, $\it{CosmoPower}$ is tested on a $\it{Planck}$ 2018 CMB temperature and polarisation analysis. The emulators always recover the fiducial cosmological constraints with differences in the posteriors smaller than sampling noise, while providing a speed-up factor up to $O(10^4)$ to the complete inference pipeline. This acceleration allows posterior distributions to be recovered in just a few seconds, as we demonstrate in the $\it{Planck}$ likelihood case. $\it{CosmoPower}$ is written entirely in Python, can be interfaced with all commonly used cosmological samplers and is publicly available at https://github.com/alessiospuriomancini/cosmopower .
Submitted 31 January, 2022; v1 submitted 7 June, 2021;
originally announced June 2021.
-
The mass dependence of dark matter halo alignments with large-scale structure
Authors:
Davide Piras,
Benjamin Joachimi,
Björn Malte Schäfer,
Mario Bonamigo,
Stefan Hilbert,
Edo van Uitert
Abstract:
Tidal gravitational forces can modify the shape of galaxies and clusters of galaxies, thus correlating their orientation with the surrounding matter density field. We study the dependence of this phenomenon, known as intrinsic alignment (IA), on the mass of the dark matter haloes that host these bright structures, analysing the Millennium and Millennium-XXL $N$-body simulations. We closely follow the observational approach, measuring the halo position-halo shape alignment and subsequently dividing out the dependence on halo bias. We derive a theoretical scaling of the IA amplitude with mass in a dark matter universe, and predict a power-law with slope $β_{\mathrm{M}}$ in the range $1/3$ to $1/2$, depending on mass scale. We find that the simulation data agree with each other and with the theoretical prediction remarkably well over three orders of magnitude in mass, with the joint analysis yielding an estimate of $β_{\mathrm{M}} = 0.36^{+0.01}_{-0.01}$. This result does not depend on redshift or on the details of the halo shape measurement. The analysis is repeated on observational data, obtaining a significantly higher value, $β_{\mathrm{M}} = 0.56^{+0.05}_{-0.05}$. There are also small but significant deviations from our simple model in the simulation signals at both the high- and low-mass end. We discuss possible reasons for these discrepancies, and argue that they can be attributed to physical processes not captured in the model or in the dark matter-only simulations.
Submitted 31 October, 2017; v1 submitted 20 July, 2017;
originally announced July 2017.
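A power-law scaling of an amplitude with mass, $A \propto M^{β_{\mathrm{M}}}$, becomes a straight line in log-log space, so its slope can be recovered with an ordinary least-squares fit. The sketch below does this on noiseless synthetic data generated with a slope of 0.36; the masses, the normalisation and the fitting method are assumptions for illustration and are not the data or the analysis of the paper.

```python
import math


def fit_power_law_slope(masses, amplitudes):
    # Least-squares straight-line fit of log10(amplitude) against log10(mass).
    xs = [math.log10(m) for m in masses]
    ys = [math.log10(a) for a in amplitudes]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Synthetic, noiseless amplitudes following A = 2 * (M / 1e13)^0.36.
TRUE_SLOPE = 0.36
masses = [10.0 ** exponent for exponent in (11, 12, 13, 14, 15)]
amplitudes = [2.0 * (m / 1e13) ** TRUE_SLOPE for m in masses]

slope, intercept = fit_power_law_slope(masses, amplitudes)
print(f"recovered slope: {slope:.3f}")
```

With real measurements each point would carry an uncertainty, and a weighted fit or a full likelihood analysis would be more appropriate than this unweighted version.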