Search | arXiv e-print repository

Batch and match: black-box variational inference with a score-based divergence

Authors: Diana Cai, Chirag Modi, Loucas Pillaud-Vivien, Charles C. Margossian, Robert M. Gower, David M. Blei, Lawrence K. Saul

Abstract: Most leading implementations of black-box variational inference (BBVI) are based on optimizing a stochastic evidence lower bound (ELBO). But such approaches to BBVI often converge slowly due to the high variance of their gradient estimates and their sensitivity to hyperparameters. In this work, we propose batch and match (BaM), an alternative approach to BBVI based on a score-based divergence. Not… ▽ More Most leading implementations of black-box variational inference (BBVI) are based on optimizing a stochastic evidence lower bound (ELBO). But such approaches to BBVI often converge slowly due to the high variance of their gradient estimates and their sensitivity to hyperparameters. In this work, we propose batch and match (BaM), an alternative approach to BBVI based on a score-based divergence. Notably, this score-based divergence can be optimized by a closed-form proximal update for Gaussian variational families with full covariance matrices. We analyze the convergence of BaM when the target distribution is Gaussian, and we prove that in the limit of infinite batch size the variational parameter updates converge exponentially quickly to the target mean and covariance. We also evaluate the performance of BaM on Gaussian and non-Gaussian target distributions that arise from posterior inference in hierarchical and deep generative models. In these experiments, we find that BaM typically converges in fewer (and sometimes significantly fewer) gradient evaluations than leading implementations of BBVI based on ELBO maximization. △ Less

Submitted 12 June, 2024; v1 submitted 22 February, 2024; originally announced February 2024.

Comments: 49 pages, 14 figures. To appear in the Proceedings of the 41st International Conference on Machine Learning (ICML), 2024

arXiv:2402.05137 [pdf, other]

doi 10.33232/001c.120559

LtU-ILI: An All-in-One Framework for Implicit Inference in Astrophysics and Cosmology

Authors: Matthew Ho, Deaglan J. Bartlett, Nicolas Chartier, Carolina Cuesta-Lazaro, Simon Ding, Axel Lapel, Pablo Lemos, Christopher C. Lovell, T. Lucas Makinen, Chirag Modi, Viraj Pandya, Shivam Pandey, Lucia A. Perez, Benjamin Wandelt, Greg L. Bryan

Abstract: This paper presents the Learning the Universe Implicit Likelihood Inference (LtU-ILI) pipeline, a codebase for rapid, user-friendly, and cutting-edge machine learning (ML) inference in astrophysics and cosmology. The pipeline includes software for implementing various neural architectures, training schemata, priors, and density estimators in a manner easily adaptable to any research workflow. It i… ▽ More This paper presents the Learning the Universe Implicit Likelihood Inference (LtU-ILI) pipeline, a codebase for rapid, user-friendly, and cutting-edge machine learning (ML) inference in astrophysics and cosmology. The pipeline includes software for implementing various neural architectures, training schemata, priors, and density estimators in a manner easily adaptable to any research workflow. It includes comprehensive validation metrics to assess posterior estimate coverage, enhancing the reliability of inferred results. Additionally, the pipeline is easily parallelizable and is designed for efficient exploration of modeling hyperparameters. To demonstrate its capabilities, we present real applications across a range of astrophysics and cosmology problems, such as: estimating galaxy cluster masses from X-ray photometry; inferring cosmology from matter power spectra and halo point clouds; characterizing progenitors in gravitational wave signals; capturing physical dust parameters from galaxy colors and luminosities; and establishing properties of semi-analytic models of galaxy formation. We also include exhaustive benchmarking and comparisons of all implemented methods as well as discussions about the challenges and pitfalls of ML inference in astronomical sciences. All code and examples are made publicly available at https://github.com/maho3/ltu-ili. △ Less

Submitted 2 July, 2024; v1 submitted 6 February, 2024; originally announced February 2024.

Comments: 22 pages, 10 figures, accepted in the Open Journal of Astrophysics. Code available at https://github.com/maho3/ltu-ili

Journal ref: 2024 OJA, Vol. 7

arXiv:2310.15256 [pdf, other]

SimBIG: Field-level Simulation-Based Inference of Galaxy Clustering

Authors: Pablo Lemos, Liam Parker, ChangHoon Hahn, Shirley Ho, Michael Eickenberg, Jiamin Hou, Elena Massara, Chirag Modi, Azadeh Moradinezhad Dizgah, Bruno Regaldo-Saint Blancard, David Spergel

Abstract: We present the first simulation-based inference (SBI) of cosmological parameters from field-level analysis of galaxy clustering. Standard galaxy clustering analyses rely on analyzing summary statistics, such as the power spectrum, $P_\ell$, with analytic models based on perturbation theory. Consequently, they do not fully exploit the non-linear and non-Gaussian features of the galaxy distribution.… ▽ More We present the first simulation-based inference (SBI) of cosmological parameters from field-level analysis of galaxy clustering. Standard galaxy clustering analyses rely on analyzing summary statistics, such as the power spectrum, $P_\ell$, with analytic models based on perturbation theory. Consequently, they do not fully exploit the non-linear and non-Gaussian features of the galaxy distribution. To address these limitations, we use the {\sc SimBIG} forward modelling framework to perform SBI using normalizing flows. We apply SimBIG to a subset of the BOSS CMASS galaxy sample using a convolutional neural network with stochastic weight averaging to perform massive data compression of the galaxy field. We infer constraints on $Ω_m = 0.267^{+0.033}_{-0.029}$ and $σ_8=0.762^{+0.036}_{-0.035}$. While our constraints on $Ω_m$ are in-line with standard $P_\ell$ analyses, those on $σ_8$ are $2.65\times$ tighter. Our analysis also provides constraints on the Hubble constant $H_0=64.5 \pm 3.8 \ {\rm km / s / Mpc}$ from galaxy clustering alone. This higher constraining power comes from additional non-Gaussian cosmological information, inaccessible with $P_\ell$. We demonstrate the robustness of our analysis by showcasing our ability to infer unbiased cosmological constraints from a series of test simulations that are constructed using different forward models than the one used in our training dataset. This work not only presents competitive cosmological constraints but also introduces novel methods for leveraging additional cosmological information in upcoming galaxy surveys like DESI, PFS, and Euclid. △ Less

Submitted 23 October, 2023; originally announced October 2023.

Comments: 14 pages, 4 figures. A previous version of the paper was published in the ICML 2023 Workshop on Machine Learning for Astrophysics

arXiv:2307.07849 [pdf, other]

Variational Inference with Gaussian Score Matching

Authors: Chirag Modi, Charles Margossian, Yuling Yao, Robert Gower, David Blei, Lawrence Saul

Abstract: Variational inference (VI) is a method to approximate the computationally intractable posterior distributions that arise in Bayesian statistics. Typically, VI fits a simple parametric distribution to the target posterior by minimizing an appropriate objective such as the evidence lower bound (ELBO). In this work, we present a new approach to VI based on the principle of score matching, that if two… ▽ More Variational inference (VI) is a method to approximate the computationally intractable posterior distributions that arise in Bayesian statistics. Typically, VI fits a simple parametric distribution to the target posterior by minimizing an appropriate objective such as the evidence lower bound (ELBO). In this work, we present a new approach to VI based on the principle of score matching, that if two distributions are equal then their score functions (i.e., gradients of the log density) are equal at every point on their support. With this, we develop score matching VI, an iterative algorithm that seeks to match the scores between the variational approximation and the exact posterior. At each iteration, score matching VI solves an inner optimization, one that minimally adjusts the current variational estimate to match the scores at a newly sampled value of the latent variables. We show that when the variational family is a Gaussian, this inner optimization enjoys a closed form solution, which we call Gaussian score matching VI (GSM-VI). GSM-VI is also a ``black box'' variational algorithm in that it only requires a differentiable joint distribution, and as such it can be applied to a wide class of models. We compare GSM-VI to black box variational inference (BBVI), which has similar requirements but instead optimizes the ELBO. We study how GSM-VI behaves as a function of the problem dimensionality, the condition number of the target covariance matrix (when the target is Gaussian), and the degree of mismatch between the approximating and exact posterior distribution. We also study GSM-VI on a collection of real-world Bayesian inference problems from the posteriorDB database of datasets and models. In all of our studies we find that GSM-VI is faster than BBVI, but without sacrificing accuracy. It requires 10-100x fewer gradient evaluations to obtain a comparable quality of approximation. △ Less

Submitted 15 July, 2023; originally announced July 2023.

Comments: A Python code for GSM-VI algorithm is at https://github.com/modichirag/GSM-VI

arXiv:2110.00610 [pdf, other]

doi 10.1214/23-BA1360

Delayed rejection Hamiltonian Monte Carlo for sampling multiscale distributions

Authors: Chirag Modi, Alex Barnett, Bob Carpenter

Abstract: The efficiency of Hamiltonian Monte Carlo (HMC) can suffer when sampling a distribution with a wide range of length scales, because the small step sizes needed for stability in high-curvature regions are inefficient elsewhere. To address this we present a delayed rejection variant: if an initial HMC trajectory is rejected, we make one or more subsequent proposals each using a step size geometrical… ▽ More The efficiency of Hamiltonian Monte Carlo (HMC) can suffer when sampling a distribution with a wide range of length scales, because the small step sizes needed for stability in high-curvature regions are inefficient elsewhere. To address this we present a delayed rejection variant: if an initial HMC trajectory is rejected, we make one or more subsequent proposals each using a step size geometrically smaller than the last. We extend the standard delayed rejection framework by allowing the probability of a retry to depend on the probability of accepting the previous proposal. We test the scheme in several sampling tasks, including multiscale model distributions such as Neal's funnel, and statistical applications. Delayed rejection enables up to five-fold performance gains over optimally-tuned HMC, as measured by effective sample size per gradient evaluation. Even for simpler distributions, delayed rejection provides increased robustness to step size misspecification. Along the way, we provide an accessible but rigorous review of detailed balance for HMC. △ Less

Submitted 1 October, 2021; originally announced October 2021.

Comments: 30 pages, 10 figures

arXiv:1910.07178 [pdf, other]

Generative Learning of Counterfactual for Synthetic Control Applications in Econometrics

Authors: Chirag Modi, Uros Seljak

Abstract: A common statistical problem in econometrics is to estimate the impact of a treatment on a treated unit given a control sample with untreated outcomes. Here we develop a generative learning approach to this problem, learning the probability distribution of the data, which can be used for downstream tasks such as post-treatment counterfactual prediction and hypothesis testing. We use control sample… ▽ More A common statistical problem in econometrics is to estimate the impact of a treatment on a treated unit given a control sample with untreated outcomes. Here we develop a generative learning approach to this problem, learning the probability distribution of the data, which can be used for downstream tasks such as post-treatment counterfactual prediction and hypothesis testing. We use control samples to transform the data to a Gaussian and homoschedastic form and then perform Gaussian process analysis in Fourier space, evaluating the optimal Gaussian kernel via non-parametric power spectrum estimation. We combine this Gaussian prior with the data likelihood given by the pre-treatment data of the single unit, to obtain the synthetic prediction of the unit post-treatment, which minimizes the error variance of synthetic prediction. Given the generative model the minimum variance counterfactual is unique, and comes with an associated error covariance matrix. We extend this basic formalism to include correlations of primary variable with other covariates of interest. Given the probabilistic description of generative model we can compare synthetic data prediction with real data to address the question of whether the treatment had a statistically significant impact. For this purpose we develop a hypothesis testing approach and evaluate the Bayes factor. We apply the method to the well studied example of California (CA) tobacco sales tax of 1988. We also perform a placebo analysis using control states to validate our methodology. Our hypothesis testing method suggests 5.8:1 odds in favor of CA tobacco sales tax having an impact on the tobacco sales, a value that is at least three times higher than any of the 38 control states. △ Less

Submitted 16 October, 2019; originally announced October 2019.

Comments: 6 pages, 3 figures. Accepted at NeurIPS 2019 Workshop on Causal Machine Learning

arXiv:1607.03224 [pdf, other]

A fast algorithm for identifying Friends-of-Friends halos

Authors: Yu Feng, Chirag Modi

Abstract: We describe a simple and fast algorithm for identifying friends-of-friends features and prove its correctness. The algorithm avoids unnecessary expensive neighbor queries, uses minimal memory overhead, and rejects slowdown in high over-density regions. We define our algorithm formally based on pair enumeration, a problem that has been heavily studied in fast 2-point correlation codes and our refer… ▽ More We describe a simple and fast algorithm for identifying friends-of-friends features and prove its correctness. The algorithm avoids unnecessary expensive neighbor queries, uses minimal memory overhead, and rejects slowdown in high over-density regions. We define our algorithm formally based on pair enumeration, a problem that has been heavily studied in fast 2-point correlation codes and our reference implementation employs a dual KD-tree correlation function code. We construct features in a hierarchical tree structure, and use a splay operation to reduce the average cost of identifying the root of a feature from $O[\log L]$ to $O[1]$ ($L$ is the size of a feature) without additional memory costs. This reduces the overall time complexity of merging trees from $O[L\log L]$ to $O[L]$, reducing the number of operations per splay by orders of magnitude. We next introduce a pruning operation that skips merge operations between two fully self-connected KD-tree nodes. This improves the robustness of the algorithm, reducing the number of merge operations in high density peaks from $O[δ^2]$ to $O[δ]$. We show that for cosmological data set the algorithm eliminates more than half of merge operations for typically used linking lengths $b \sim 0.2$ (relative to mean separation). Furthermore, our algorithm is extremely simple and easy to implement on top of an existing pair enumeration code, reusing the optimization effort that has been invested in fast correlation function codes. △ Less

Submitted 31 May, 2017; v1 submitted 11 July, 2016; originally announced July 2016.

Comments: 11 pages, 6 figures. Published in Astronomy and Computing

arXiv:1208.3859 [pdf]

A novel approach for e-payment using virtual password system

Authors: Vishal Vadgama, Bhavin Tanti, Chirag Modi, Nishant Doshi

Abstract: In today's world of E-Commerce everything comes online like Music, E-Books, Shop** all most everything is online. If you are using some service or buying things online then you have to pay for that. For that you have to do Net Banking or you have to use Credit card which will do online payment for you. In today's environment when everything is online, the service you are using for E-Payment must… ▽ More In today's world of E-Commerce everything comes online like Music, E-Books, Shop** all most everything is online. If you are using some service or buying things online then you have to pay for that. For that you have to do Net Banking or you have to use Credit card which will do online payment for you. In today's environment when everything is online, the service you are using for E-Payment must be secure and you must protect your banking information like debit card or credit card information from possible threat of hacking. There were lots way to threat like Key logger, Forgery Detection, Phishing, Shoulder surfing. Therefore, we reveal our actual information of Bank and Credit Card then there will be a chance to lose data and same credit card and hackers can use banking information for malicious purpose. In this paper we discuss available E-Payment protocols, examine its advantages and delimitation's and shows that there are steel needs to design a more secure E-Payment protocol. The suggested protocol is based on using hash function and using dynamic or virtual password, which protects your banking or credit card information from possible threat of hacking when doing online transactions. △ Less

Submitted 19 August, 2012; originally announced August 2012.

Comments: http://airccse.org/journal/ijcis/papers/1111ijcis01.pdf, 2012

Showing 1–8 of 8 results for author: Modi, C