-
Recursive Monte Carlo and Variational Inference with Auxiliary Variables
Authors:
Alexander K. Lew,
Marco Cusumano-Towner,
Vikash K. Mansinghka
Abstract:
A key design constraint when implementing Monte Carlo and variational inference algorithms is that it must be possible to cheaply and exactly evaluate the marginal densities of proposal distributions and variational families. This takes many interesting proposals off the table, such as those based on involved simulations or stochastic optimization. This paper broadens the design space, by presenti…
▽ More
A key design constraint when implementing Monte Carlo and variational inference algorithms is that it must be possible to cheaply and exactly evaluate the marginal densities of proposal distributions and variational families. This takes many interesting proposals off the table, such as those based on involved simulations or stochastic optimization. This paper broadens the design space, by presenting a framework for applying Monte Carlo and variational inference algorithms when proposal densities cannot be exactly evaluated. Our framework, recursive auxiliary-variable inference (RAVI), instead approximates the necessary densities using meta-inference: an additional layer of Monte Carlo or variational inference, that targets the proposal, rather than the model. RAVI generalizes and unifies several existing methods for inference with expressive approximating families, which we show correspond to specific choices of meta-inference algorithm, and provides new theory for analyzing their bias and variance. We illustrate RAVI's design framework and theorems by using them to analyze and improve upon Salimans et al.'s Markov Chain Variational Inference, and to design a novel sampler for Dirichlet process mixtures, achieving state-of-the-art results on a standard benchmark dataset from astronomy and on a challenging datacleaning task with Medicare hospital data.
△ Less
Submitted 18 November, 2022; v1 submitted 5 March, 2022;
originally announced March 2022.
-
Estimators of Entropy and Information via Inference in Probabilistic Models
Authors:
Feras A. Saad,
Marco Cusumano-Towner,
Vikash K. Mansinghka
Abstract:
Estimating information-theoretic quantities such as entropy and mutual information is central to many problems in statistics and machine learning, but challenging in high dimensions. This paper presents estimators of entropy via inference (EEVI), which deliver upper and lower bounds on many information quantities for arbitrary variables in a probabilistic generative model. These estimators use imp…
▽ More
Estimating information-theoretic quantities such as entropy and mutual information is central to many problems in statistics and machine learning, but challenging in high dimensions. This paper presents estimators of entropy via inference (EEVI), which deliver upper and lower bounds on many information quantities for arbitrary variables in a probabilistic generative model. These estimators use importance sampling with proposal distribution families that include amortized variational inference and sequential Monte Carlo, which can be tailored to the target model and used to squeeze true information values with high accuracy. We present several theoretical properties of EEVI and demonstrate scalability and efficacy on two problems from the medical domain: (i) in an expert system for diagnosing liver disorders, we rank medical tests according to how informative they are about latent diseases, given a pattern of observed symptoms and patient attributes; and (ii) in a differential equation model of carbohydrate metabolism, we find optimal times to take blood glucose measurements that maximize information about a diabetic patient's insulin sensitivity, given their meal and medication schedule.
△ Less
Submitted 12 December, 2022; v1 submitted 24 February, 2022;
originally announced February 2022.
-
3DP3: 3D Scene Perception via Probabilistic Programming
Authors:
Nishad Gothoskar,
Marco Cusumano-Towner,
Ben Zinberg,
Matin Ghavamizadeh,
Falk Pollok,
Austin Garrett,
Joshua B. Tenenbaum,
Dan Gutfreund,
Vikash K. Mansinghka
Abstract:
We present 3DP3, a framework for inverse graphics that uses inference in a structured generative model of objects, scenes, and images. 3DP3 uses (i) voxel models to represent the 3D shape of objects, (ii) hierarchical scene graphs to decompose scenes into objects and the contacts between them, and (iii) depth image likelihoods based on real-time graphics. Given an observed RGB-D image, 3DP3's infe…
▽ More
We present 3DP3, a framework for inverse graphics that uses inference in a structured generative model of objects, scenes, and images. 3DP3 uses (i) voxel models to represent the 3D shape of objects, (ii) hierarchical scene graphs to decompose scenes into objects and the contacts between them, and (iii) depth image likelihoods based on real-time graphics. Given an observed RGB-D image, 3DP3's inference algorithm infers the underlying latent 3D scene, including the object poses and a parsimonious joint parametrization of these poses, using fast bottom-up pose proposals, novel involutive MCMC updates of the scene graph structure, and, optionally, neural object detectors and pose estimators. We show that 3DP3 enables scene understanding that is aware of 3D shape, occlusion, and contact structure. Our results demonstrate that 3DP3 is more accurate at 6DoF object pose estimation from real images than deep learning baselines and shows better generalization to challenging scenes with novel viewpoints, contact, and partial observability.
△ Less
Submitted 30 October, 2021;
originally announced November 2021.
-
Automating Involutive MCMC using Probabilistic and Differentiable Programming
Authors:
Marco Cusumano-Towner,
Alexander K. Lew,
Vikash K. Mansinghka
Abstract:
Involutive MCMC is a unifying mathematical construction for MCMC kernels that generalizes many classic and state-of-the-art MCMC algorithms, from reversible jump MCMC to kernels based on deep neural networks. But as with MCMC samplers more generally, implementing involutive MCMC kernels is often tedious and error-prone, especially when sampling on complex state spaces. This paper describes a techn…
▽ More
Involutive MCMC is a unifying mathematical construction for MCMC kernels that generalizes many classic and state-of-the-art MCMC algorithms, from reversible jump MCMC to kernels based on deep neural networks. But as with MCMC samplers more generally, implementing involutive MCMC kernels is often tedious and error-prone, especially when sampling on complex state spaces. This paper describes a technique for automating the implementation of involutive MCMC kernels given (i) a pair of probabilistic programs defining the target distribution and an auxiliary distribution respectively and (ii) a differentiable program that transforms the execution traces of these probabilistic programs. The technique, which is implemented as part of the Gen probabilistic programming system, also automatically detects user errors in the specification of involutive MCMC kernels and exploits sparsity in the kernels for improved efficiency. The paper shows example Gen code for a split-merge reversible jump move in an infinite Gaussian mixture model and a state-dependent mixture of proposals on a combinatorial space of covariance functions for a Gaussian process.
△ Less
Submitted 20 July, 2020; v1 submitted 19 July, 2020;
originally announced July 2020.
-
Bayesian Synthesis of Probabilistic Programs for Automatic Data Modeling
Authors:
Feras A. Saad,
Marco F. Cusumano-Towner,
Ulrich Schaechtle,
Martin C. Rinard,
Vikash K. Mansinghka
Abstract:
We present new techniques for automatically constructing probabilistic programs for data analysis, interpretation, and prediction. These techniques work with probabilistic domain-specific data modeling languages that capture key properties of a broad class of data generating processes, using Bayesian inference to synthesize probabilistic programs in these modeling languages given observed data. We…
▽ More
We present new techniques for automatically constructing probabilistic programs for data analysis, interpretation, and prediction. These techniques work with probabilistic domain-specific data modeling languages that capture key properties of a broad class of data generating processes, using Bayesian inference to synthesize probabilistic programs in these modeling languages given observed data. We provide a precise formulation of Bayesian synthesis for automatic data modeling that identifies sufficient conditions for the resulting synthesis procedure to be sound. We also derive a general class of synthesis algorithms for domain-specific languages specified by probabilistic context-free grammars and establish the soundness of our approach for these languages. We apply the techniques to automatically synthesize probabilistic programs for time series data and multivariate tabular data. We show how to analyze the structure of the synthesized programs to compute, for key qualitative properties of interest, the probability that the underlying data generating process exhibits each of these properties. Second, we translate probabilistic programs in the domain-specific language into probabilistic programs in Venture, a general-purpose probabilistic programming system. The translated Venture programs are then executed to obtain predictions of new time series data and new multivariate data records. Experimental results show that our techniques can accurately infer qualitative structure in multiple real-world data sets and outperform standard data analysis methods in forecasting and predicting new data.
△ Less
Submitted 14 July, 2019;
originally announced July 2019.
-
Using probabilistic programs as proposals
Authors:
Marco F. Cusumano-Towner,
Vikash K. Mansinghka
Abstract:
Monte Carlo inference has asymptotic guarantees, but can be slow when using generic proposals. Handcrafted proposals that rely on user knowledge about the posterior distribution can be efficient, but are difficult to derive and implement. This paper proposes to let users express their posterior knowledge in the form of proposal programs, which are samplers written in probabilistic programming lang…
▽ More
Monte Carlo inference has asymptotic guarantees, but can be slow when using generic proposals. Handcrafted proposals that rely on user knowledge about the posterior distribution can be efficient, but are difficult to derive and implement. This paper proposes to let users express their posterior knowledge in the form of proposal programs, which are samplers written in probabilistic programming languages. One strategy for writing good proposal programs is to combine domain-specific heuristic algorithms with neural network models. The heuristics identify high probability regions, and the neural networks model the posterior uncertainty around the outputs of the algorithm. Proposal programs can be used as proposal distributions in importance sampling and Metropolis-Hastings samplers without sacrificing asymptotic consistency, and can be optimized offline using inference compilation. Support for optimizing and using proposal programs is easily implemented in a sampling-based probabilistic programming runtime. The paper illustrates the proposed technique with a proposal program that combines RANSAC and neural networks to accelerate inference in a Bayesian linear regression with outliers model.
△ Less
Submitted 13 January, 2018; v1 submitted 10 January, 2018;
originally announced January 2018.
-
AIDE: An algorithm for measuring the accuracy of probabilistic inference algorithms
Authors:
Marco F. Cusumano-Towner,
Vikash K. Mansinghka
Abstract:
Approximate probabilistic inference algorithms are central to many fields. Examples include sequential Monte Carlo inference in robotics, variational inference in machine learning, and Markov chain Monte Carlo inference in statistics. A key problem faced by practitioners is measuring the accuracy of an approximate inference algorithm on a specific data set. This paper introduces the auxiliary infe…
▽ More
Approximate probabilistic inference algorithms are central to many fields. Examples include sequential Monte Carlo inference in robotics, variational inference in machine learning, and Markov chain Monte Carlo inference in statistics. A key problem faced by practitioners is measuring the accuracy of an approximate inference algorithm on a specific data set. This paper introduces the auxiliary inference divergence estimator (AIDE), an algorithm for measuring the accuracy of approximate inference algorithms. AIDE is based on the observation that inference algorithms can be treated as probabilistic models and the random variables used within the inference algorithm can be viewed as auxiliary variables. This view leads to a new estimator for the symmetric KL divergence between the approximating distributions of two inference algorithms. The paper illustrates application of AIDE to algorithms for inference in regression, hidden Markov, and Dirichlet process mixture models. The experiments show that AIDE captures the qualitative behavior of a broad class of inference algorithms and can detect failure modes of inference algorithms that are missed by standard heuristics.
△ Less
Submitted 4 November, 2017; v1 submitted 19 May, 2017;
originally announced May 2017.
-
Probabilistic programs for inferring the goals of autonomous agents
Authors:
Marco F. Cusumano-Towner,
Alexey Radul,
David Wingate,
Vikash K. Mansinghka
Abstract:
Intelligent systems sometimes need to infer the probable goals of people, cars, and robots, based on partial observations of their motion. This paper introduces a class of probabilistic programs for formulating and solving these problems. The formulation uses randomized path planning algorithms as the basis for probabilistic models of the process by which autonomous agents plan to achieve their go…
▽ More
Intelligent systems sometimes need to infer the probable goals of people, cars, and robots, based on partial observations of their motion. This paper introduces a class of probabilistic programs for formulating and solving these problems. The formulation uses randomized path planning algorithms as the basis for probabilistic models of the process by which autonomous agents plan to achieve their goals. Because these path planning algorithms do not have tractable likelihood functions, new inference algorithms are needed. This paper proposes two Monte Carlo techniques for these "likelihood-free" models, one of which can use likelihood estimates from neural networks to accelerate inference. The paper demonstrates efficacy on three simple examples, each using under 50 lines of probabilistic code.
△ Less
Submitted 18 April, 2017; v1 submitted 17 April, 2017;
originally announced April 2017.
-
Encapsulating models and approximate inference programs in probabilistic modules
Authors:
Marco F. Cusumano-Towner,
Vikash K. Mansinghka
Abstract:
This paper introduces the probabilistic module interface, which allows encapsulation of complex probabilistic models with latent variables alongside custom stochastic approximate inference machinery, and provides a platform-agnostic abstraction barrier separating the model internals from the host probabilistic inference system. The interface can be seen as a stochastic generalization of a standard…
▽ More
This paper introduces the probabilistic module interface, which allows encapsulation of complex probabilistic models with latent variables alongside custom stochastic approximate inference machinery, and provides a platform-agnostic abstraction barrier separating the model internals from the host probabilistic inference system. The interface can be seen as a stochastic generalization of a standard simulation and density interface for probabilistic primitives. We show that sound approximate inference algorithms can be constructed for networks of probabilistic modules, and we demonstrate that the interface can be implemented using learned stochastic inference networks and MCMC and SMC approximate inference programs.
△ Less
Submitted 6 May, 2017; v1 submitted 14 December, 2016;
originally announced December 2016.
-
Measuring the non-asymptotic convergence of sequential Monte Carlo samplers using probabilistic programming
Authors:
Marco F. Cusumano-Towner,
Vikash K. Mansinghka
Abstract:
A key limitation of sampling algorithms for approximate inference is that it is difficult to quantify their approximation error. Widely used sampling schemes, such as sequential importance sampling with resampling and Metropolis-Hastings, produce output samples drawn from a distribution that may be far from the target posterior distribution. This paper shows how to upper-bound the symmetric KL div…
▽ More
A key limitation of sampling algorithms for approximate inference is that it is difficult to quantify their approximation error. Widely used sampling schemes, such as sequential importance sampling with resampling and Metropolis-Hastings, produce output samples drawn from a distribution that may be far from the target posterior distribution. This paper shows how to upper-bound the symmetric KL divergence between the output distribution of a broad class of sequential Monte Carlo (SMC) samplers and their target posterior distributions, subject to assumptions about the accuracy of a separate gold-standard sampler. The proposed method applies to samplers that combine multiple particles, multinomial resampling, and rejuvenation kernels. The experiments show the technique being used to estimate bounds on the divergence of SMC samplers for posterior inference in a Bayesian linear regression model and a Dirichlet process mixture model.
△ Less
Submitted 6 May, 2017; v1 submitted 7 December, 2016;
originally announced December 2016.
-
Quantifying the probable approximation error of probabilistic inference programs
Authors:
Marco F Cusumano-Towner,
Vikash K Mansinghka
Abstract:
This paper introduces a new technique for quantifying the approximation error of a broad class of probabilistic inference programs, including ones based on both variational and Monte Carlo approaches. The key idea is to derive a subjective bound on the symmetrized KL divergence between the distribution achieved by an approximate inference program and its true target distribution. The bound's valid…
▽ More
This paper introduces a new technique for quantifying the approximation error of a broad class of probabilistic inference programs, including ones based on both variational and Monte Carlo approaches. The key idea is to derive a subjective bound on the symmetrized KL divergence between the distribution achieved by an approximate inference program and its true target distribution. The bound's validity (and subjectivity) rests on the accuracy of two auxiliary probabilistic programs: (i) a "reference" inference program that defines a gold standard of accuracy and (ii) a "meta-inference" program that answers the question "what internal random choices did the original approximate inference program probably make given that it produced a particular result?" The paper includes empirical results on inference problems drawn from linear regression, Dirichlet process mixture modeling, HMMs, and Bayesian networks. The experiments show that the technique is robust to the quality of the reference inference program and that it can detect implementation bugs that are not apparent from predictive performance.
△ Less
Submitted 31 May, 2016;
originally announced June 2016.