Search | arXiv e-print repository

Interpolating between sampling and variational inference with infinite stochastic mixtures

Authors: Richard D. Lange, Ari Benjamin, Ralf M. Haefner, Xaq Pitkow

Abstract: Sampling and Variational Inference (VI) are two large families of methods for approximate inference that have complementary strengths. Sampling methods excel at approximating arbitrary probability distributions, but can be inefficient. VI methods are efficient, but may misrepresent the true distribution. Here, we develop a general framework where approximations are stochastic mixtures of simple component distributions. Both sampling and VI can be seen as special cases: in sampling, each mixture component is a delta-function and is chosen stochastically, while in standard VI a single component is chosen to minimize divergence. We derive a practical method that interpolates between sampling and VI by solving an optimization problem over a mixing distribution. Intermediate inference methods then arise by varying a single parameter. Our method provably improves on sampling (reducing variance) and on VI (reducing bias+variance despite increasing variance). We demonstrate our method's bias/variance trade-off in practice on reference problems, and we compare outcomes to commonly used sampling and VI methods. This work takes a step towards a highly flexible yet simple family of inference methods that combines the complementary strengths of sampling and VI.

Submitted 4 March, 2022; v1 submitted 18 October, 2021; originally announced October 2021.

Comments: 9 pages, 4 figures. Submitted to UAI 2022; under double-blind review. Code available at https://github.com/wrongu/sampling-variational-demos

arXiv:1811.09739 [pdf, other]

A probabilistic population code based on neural samples

Authors: Sabyasachi Shivkumar, Richard D. Lange, Ankani Chattoraj, Ralf M. Haefner

Abstract: Sensory processing is often characterized as implementing probabilistic inference: networks of neurons compute posterior beliefs over unobserved causes given the sensory inputs. How these beliefs are computed and represented by neural responses is much debated (Fiser et al. 2010, Pouget et al. 2013). A central debate concerns the question of whether neural responses represent samples of latent variables (Hoyer & Hyvärinen 2003) or parameters of their distributions (Ma et al. 2006), with efforts being made to distinguish between them (Grabska-Barwinska et al. 2013). A separate debate addresses the question of whether neural responses are proportionally related to the encoded probabilities (Barlow 1969), or proportional to the logarithm of those probabilities (Jazayeri & Movshon 2006, Ma et al. 2006, Beck et al. 2012). Here, we show that these alternatives - contrary to common assumptions - are not mutually exclusive and that the very same system can be compatible with all of them. As a central analytical result, we show that modeling neural responses in area V1 as samples from a posterior distribution over latents in a linear Gaussian model of the image implies that those neural responses form a linear Probabilistic Population Code (PPC, Ma et al. 2006). In particular, the posterior distribution over some experimenter-defined variable like "orientation" is part of the exponential family with sufficient statistics that are linear in the neural sampling-based firing rates.

Submitted 23 November, 2018; originally announced November 2018.

Comments: First three contributed equally to the work

arXiv:1501.03173 [pdf, other]

A note on choice and detect probabilities in the presence of choice bias

Authors: Ralf M. Haefner

Abstract: Recently we have presented the analytical relationship between choice probabilities, noise correlations and read-out weights in the classical feedforward decision-making framework (Haefner et al. 2013). The derivation assumed that behavioral reports are distributed evenly between the two possible choices. This assumption is often violated in empirical data - especially when computing so-called grand CPs combining data across stimulus conditions. Here, we extend our analytical results to situations when subjects show clear biases towards one choice over the other, e.g. in non-zero signal conditions. Importantly, this also extends our results from discrimination tasks to detection tasks and detect probabilities for which much empirical data is available. We find that CPs and DPs depend monotonically on the fraction, p, of choices assigned to the more likely option: CPs and DPs are smallest for p equal to 0.5 and increase as p increases, i.e. as the data deviates from the ideal, zero-signal, unbiased scenario. While this deviation is small, our results suggest a) an empirical test for the feedforward framework and b) a way in which to correct choice probability and detect probability measurements before combining different stimulus conditions to increase signal/noise.

Submitted 13 January, 2015; originally announced January 2015.

Comments: 6 pages, 1 figure

arXiv:1409.0257 [pdf, other]

The implications of perception as probabilistic inference for correlated neural variability during behavior

Authors: Ralf M. Haefner, Pietro Berkes, József Fiser

Abstract: This paper addresses two main challenges facing systems neuroscience today: understanding the nature and function of a) cortical feedback between sensory areas and b) correlated variability. Starting from the old idea of perception as probabilistic inference, we show how to use knowledge of the psychophysical task to make easily testable predictions for the impact that feedback signals have on early sensory representations. Applying our framework to the well-studied two-alternative forced choice task paradigm, we can explain multiple empirical findings that have been hard to account for by the traditional feedforward model of sensory processing, including the task-dependence of neural response correlations, and the diverging time courses of choice probabilities and psychophysical kernels. Our model makes a number of new predictions and, importantly, characterizes a component of correlated variability that represents task-related information rather than performance-degrading noise. It also demonstrates a normative way to integrate sensory and cognitive components into physiologically testable mathematical models of perceptual decision-making.

Submitted 19 November, 2015; v1 submitted 31 August, 2014; originally announced September 2014.

Comments: 25 pages, 9 figures; improved readability, added figure

arXiv:astro-ph/9905086 [pdf, ps, other]

doi 10.1046/j.1365-8711.2000.03242.x

A Dynamical Model of the Inner Galaxy

Authors: Ralf M. Hafner, N. Wyn Evans, Walter Dehnen, James Binney

Abstract: An extension of Schwarzschild's galaxy-building technique is presented that, for the first time, enables one to build Schwarzschild models with known distribution functions (DFs). The new extension makes it possible to combine a DF that depends only on classical integrals with orbits that respect non-classical integrals. With such a combination, Schwarzschild's orbits are used only to represent the difference between the true galaxy DF and an approximating classical DF. The new method is used to construct a dynamical model of the inner Galaxy. The model is based on an orbit library that contains 22168 regular orbits. The model aims to reproduce the three-dimensional mass density of Binney, Gerhard & Spergel (1997), which was obtained through deprojection of the COBE surface photometry, and to reproduce the observed kinematics in three windows - namely Baade's Window and two off-axis fields. The model fits essentially all the available data within the innermost 3 kpc. The axis ratio and the morphology of the projected density contours of the COBE bar are recovered to good accuracy within corotation. The kinematic quantities - the line-of-sight streaming velocity and velocity dispersion, as well as the proper motions when available - are recovered, not merely for the fitted fields, but also for three new fields. The dynamical model deviates most from the input density close to the Galactic plane just outside corotation, where the deprojection of the surface photometry is suspect. The dynamical model does not reproduce the kinematics at the most distant window, where disk contamination may be severe.

Submitted 7 May, 1999; originally announced May 1999.

Comments: 20 pages, 5 gif figures, 11 postscript figures, submitted to MNRAS. Zipped postscript available at http://www-thphys.physics.ox.ac.uk/users/RalfHafner/paper.ps.gz

arXiv:astro-ph/9611162 [pdf, ps, other]

doi 10.1093/mnras/286.2.315

Simple Three-Integral Scale-Free Galaxy Models

Authors: N. W. Evans, R. M. Häfner, P. T. de Zeeuw

Abstract: The Jeans equations give the second moments or stresses required to support a stellar population against the gravity field. A general solution of the Jeans equations for arbitrary axisymmetric scale-free densities in flattened scale-free potentials is given. A two-parameter subset of the solution for the second moments for the self-consistent density of the power-law models, which have exactly spheroidal equipotentials, is examined in detail. In the spherical limit, the potential of these models reduces to that of the singular power-law spheres. We build the physical three-integral distribution functions that correspond to the flattened stellar components. Next, we attack the problem of finding distribution functions associated with the Jeans solutions in flattened scale-free potentials. The third or partial integral introduced by de Zeeuw, Evans and Schwarzschild for Binney's model is generalised to thin and near-thin orbits moving in arbitrary axisymmetric scale-free potentials. The partial integral is a modification of the total angular momentum. For the self-consistent power-law models, we show how this enables the construction of simple three-integral distribution functions. The connexion between these approximate distribution functions and the Jeans solutions is discussed in some detail.

Submitted 20 November, 1996; originally announced November 1996.

Comments: 14 pages, 7 postscript figures, to appear in Monthly Notices

Showing 1–6 of 6 results for author: Haefner, R M