Showing 1–36 of 36 results for author: Neal, R M

  1. arXiv:2403.18054  [pdf, other]

    stat.CO physics.comp-ph

    Modifying Gibbs sampling to avoid self transitions

    Authors: Radford M. Neal

    Abstract: Gibbs sampling repeatedly samples from the conditional distribution of one variable, x_i, given other variables, either choosing i randomly, or updating sequentially using some systematic or random order. When x_i is discrete, a Gibbs sampling update may choose a new value that is the same as the old value. A theorem of Peskun indicates that, when i is chosen randomly, a reversible method that red…

    Submitted 26 March, 2024; originally announced March 2024.
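
    Sketch: as a rough illustration of the plain Gibbs sampling update that the abstract above starts from, the following Python code resamples one randomly chosen discrete variable from its conditional distribution and counts how often the new value equals the old one. The joint table `JOINT` and all function names are hypothetical, chosen only for illustration; this is not the modified method proposed in the paper.

```python
import random

# Hypothetical unnormalized joint table for two variables, each taking
# values 0, 1, 2 (illustrative numbers, not taken from the paper).
JOINT = [
    [4.0, 1.0, 1.0],
    [1.0, 4.0, 1.0],
    [1.0, 1.0, 4.0],
]


def conditional_weights(x, i):
    """Unnormalized conditional distribution of x[i] given the other variable."""
    if i == 0:
        return [JOINT[v][x[1]] for v in range(3)]
    return [JOINT[x[0]][v] for v in range(3)]


def gibbs_update(x, i, rng):
    """Resample x[i] from its conditional; return True if the value is unchanged."""
    old = x[i]
    x[i] = rng.choices(range(3), weights=conditional_weights(x, i))[0]
    return x[i] == old


def self_transition_fraction(n_updates, seed=0):
    """Run random-scan Gibbs sampling and report the fraction of self transitions."""
    rng = random.Random(seed)
    x = [0, 0]
    unchanged = 0
    for _ in range(n_updates):
        i = rng.randrange(2)  # choose the variable to update at random
        unchanged += gibbs_update(x, i, rng)
    return unchanged / n_updates


if __name__ == "__main__":
    print(f"fraction of self transitions: {self_transition_fraction(10_000):.3f}")
```

    For this particular table every conditional distribution has probabilities 4/6, 1/6, 1/6, so in equilibrium about half of the updates leave the value unchanged.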

  2. arXiv:2305.18268  [pdf, ps, other]

    math.PR

    Efficiency of reversible MCMC methods: elementary derivations and applications to composite methods

    Authors: Radford M. Neal, Jeffrey S. Rosenthal

    Abstract: We review criteria for comparing the efficiency of Markov chain Monte Carlo (MCMC) methods with respect to the asymptotic variance of estimates of expectations of functions of state, and show how such criteria can justify ways of combining improvements to MCMC methods. We say that a chain on a finite state space with transition matrix $P$ efficiency-dominates one with transition matrix $Q$ if for…

    Submitted 27 March, 2024; v1 submitted 29 May, 2023; originally announced May 2023.

    Comments: 24 pages

  3. arXiv:2001.11950  [pdf, ps, other]

    stat.CO cs.LG

    Non-reversibly updating a uniform [0,1] value for Metropolis accept/reject decisions

    Authors: Radford M. Neal

    Abstract: I show how it can be beneficial to express Metropolis accept/reject decisions in terms of comparison with a uniform [0,1] value, u, and to then update u non-reversibly, as part of the Markov chain state, rather than sampling it independently each iteration. This provides a small improvement for random walk Metropolis and Langevin updates in high dimensions. It produces a larger improvement when us…

    Submitted 31 January, 2020; originally announced January 2020.
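
    Sketch: the abstract above contrasts its method with the conventional scheme in which u is sampled independently each iteration. A minimal Python illustration of that conventional baseline follows, with the accept/reject decision written explicitly as a comparison against u. The standard normal target, the step size, and the function names are assumptions made for illustration; the non-reversible update of u proposed in the paper is not implemented here.

```python
import math
import random


def log_density(x):
    """Unnormalized log density of a standard normal (illustrative target)."""
    return -0.5 * x * x


def metropolis(n_iters, step=1.0, seed=0):
    """Random-walk Metropolis with a fresh uniform value u on every iteration."""
    rng = random.Random(seed)
    x = 0.0
    samples = []
    for _ in range(n_iters):
        proposal = x + rng.gauss(0.0, step)
        u = rng.random()  # sampled independently each iteration
        # Accept when u < pi(proposal) / pi(x); the ratio is capped at 1
        # on the log scale so that exp() cannot overflow.
        log_ratio = log_density(proposal) - log_density(x)
        if u < math.exp(min(0.0, log_ratio)):
            x = proposal
        samples.append(x)
    return samples


if __name__ == "__main__":
    draws = metropolis(20_000)
    mean = sum(draws) / len(draws)
    print(f"sample mean: {mean:.3f}")
```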

  4. arXiv:1711.04399  [pdf, ps, other]

    stat.CO

    Circularly-Coupled Markov Chain Sampling

    Authors: Radford M. Neal

    Abstract: I show how to run an N-time-step Markov chain simulation in a circular fashion, so that the state at time 0 follows the state at time N-1 in the same way as states at times t follow those at times t-1 for 0<t<N. This wrap-around of the chain is achieved using a coupling procedure, and produces states that all have close to the equilibrium distribution of the Markov chain, under the assumption that…

    Submitted 12 November, 2017; originally announced November 2017.

  5. arXiv:1602.06030  [pdf, other]

    stat.CO

    Sampling latent states for high-dimensional non-linear state space models with the embedded HMM method

    Authors: Alexander Y. Shestopaloff, Radford M. Neal

    Abstract: We propose a new scheme for selecting pool states for the embedded Hidden Markov Model (HMM) Markov Chain Monte Carlo (MCMC) method. This new scheme allows the embedded HMM method to be used for efficient sampling in state space models where the state can be high-dimensional. Previously, embedded HMM methods were only applied to models with a one-dimensional state space. We demonstrate that using…

    Submitted 11 July, 2016; v1 submitted 18 February, 2016; originally announced February 2016.

    Comments: Revision has some changes to the paper, and now includes the program used as ancillary information

  6. arXiv:1505.05571  [pdf, other]

    math.NA cs.DC stat.CO

    Fast exact summation using small and large superaccumulators

    Authors: Radford M. Neal

    Abstract: I present two new methods for exactly summing a set of floating-point numbers, and then correctly rounding to the nearest floating-point number. Higher accuracy than simple summation (rounding after each addition) is important in many applications, such as finding the sample mean of data. Exact summation also guarantees identical results with parallel and serial implementations, since the exact su…

    Submitted 20 May, 2015; originally announced May 2015.

    ACM Class: G.1.0
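
    Sketch: to make the difference between simple summation and exact summation in the abstract above concrete, the following Python code compares a loop that rounds after each addition with the standard library's `math.fsum`, which returns the correctly rounded sum. Note the assumption: `math.fsum` is used only as a readily available exact summation routine, and it is not one of the superaccumulator methods presented in the paper. The helper name `simple_sum` is hypothetical.

```python
import math


def simple_sum(values):
    """Sum left to right, rounding to a 64-bit float after each addition."""
    total = 0.0
    for v in values:
        total += v
    return total


tenths = [0.1] * 10
order_a = [1e16, 1.0, -1e16]
order_b = [1e16, -1e16, 1.0]  # same numbers, different order

if __name__ == "__main__":
    # Simple summation accumulates rounding error; exact summation does not.
    print(simple_sum(tenths), math.fsum(tenths))
    # Simple summation depends on the order of the terms, which is why
    # parallel and serial runs can disagree; the exact sum is order-independent.
    print(simple_sum(order_a), simple_sum(order_b))
    print(math.fsum(order_a), math.fsum(order_b))
```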

  7. arXiv:1504.02914  [pdf, other]

    stat.CO cs.MS math.NA

    Representing numeric data in 32 bits while preserving 64-bit precision

    Authors: Radford M. Neal

    Abstract: Data files often consist of numbers having only a few significant decimal digits, whose information content would allow storage in only 32 bits. However, we may require that arithmetic operations involving these numbers be done with 64-bit floating-point precision, which precludes simply representing the data as 32-bit floating-point values. Decimal floating point gives a compact and exact represe…

    Submitted 11 April, 2015; originally announced April 2015.
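
    Sketch: the abstract above notes that simply storing such data as 32-bit floating-point values is not enough when arithmetic must match 64-bit precision. The following Python code illustrates only that problem, by round-tripping a value through a 32-bit float with the standard `struct` module. The helper name `through_float32` is hypothetical, and the representation proposed in the paper is not shown.

```python
import struct


def through_float32(x):
    """Round x to the nearest 32-bit float, then widen it back to 64 bits."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


if __name__ == "__main__":
    x = 0.1                    # one significant decimal digit
    y = through_float32(x)
    # The widened 32-bit value is not the 64-bit float nearest to 0.1,
    # so later 64-bit arithmetic on y differs from arithmetic on x.
    print(x == y, abs(x - y))
    # Values exactly representable in binary survive the round trip.
    print(through_float32(0.5) == 0.5)
```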

  8. arXiv:1412.3013  [pdf, other]

    stat.CO

    Efficient Bayesian inference for stochastic volatility models with ensemble MCMC methods

    Authors: Alexander Y. Shestopaloff, Radford M. Neal

    Abstract: In this paper, we introduce efficient ensemble Markov Chain Monte Carlo (MCMC) sampling methods for Bayesian computations in the univariate stochastic volatility model. We compare the performance of our ensemble MCMC methods with an improved version of a recent sampler of Kastner and Fruwirth-Schnatter (2014). We show that ensemble samplers are more efficient than this state of the art sampler by…

    Submitted 9 December, 2014; originally announced December 2014.

  9. arXiv:1401.5548  [pdf, ps, other]

    stat.CO

    On Bayesian inference for the M/G/1 queue with efficient MCMC sampling

    Authors: Alexander Y. Shestopaloff, Radford M. Neal

    Abstract: We introduce an efficient MCMC sampling scheme to perform Bayesian inference in the M/G/1 queueing model given only observations of interdeparture times. Our MCMC scheme uses a combination of Gibbs sampling and simple Metropolis updates together with three novel "shift" and "scale" updates. We show that our novel updates improve the speed of sampling considerably, by factors of about 60 to about 1…

    Submitted 21 January, 2014; originally announced January 2014.

  10. arXiv:1305.2235  [pdf, other]

    stat.CO

    MCMC methods for Gaussian process models using fast approximations for the likelihood

    Authors: Chunyi Wang, Radford M. Neal

    Abstract: Gaussian Process (GP) models are a powerful and flexible tool for non-parametric regression and classification. Computation for GP models is intensive, since computing the posterior density, $π$, for covariance function parameters requires computation of the covariance matrix, C, a $pn^2$ operation, where p is the number of covariates and n is the number of training cases, and then inversion of C,…

    Submitted 9 May, 2013; originally announced May 2013.

  11. arXiv:1305.0320  [pdf, other]

    stat.CO

    MCMC for non-linear state space models using ensembles of latent sequences

    Authors: Alexander Y. Shestopaloff, Radford M. Neal

    Abstract: Non-linear state space models are a widely-used class of models for biological, economic, and physical processes. Fitting these models to observed data is a difficult inference problem that has no straightforward solution. We take a Bayesian approach to the inference of unknown parameters of a non-linear state model; this, in turn, requires the availability of efficient Markov Chain Monte Carlo (M…

    Submitted 1 May, 2013; originally announced May 2013.

  12. arXiv:1301.3861  [pdf]

    cs.AI cs.LG

    Inference for Belief Networks Using Coupling From the Past

    Authors: Michael Harvey, Radford M. Neal

    Abstract: Inference for belief networks using Gibbs sampling produces a distribution for unobserved variables that differs from the correct distribution by a (usually) unknown error, since convergence to the right distribution occurs only asymptotically. The method of "coupling from the past" samples from exactly the correct distribution by (conceptually) running dependent Gibbs sampling simulations from…

    Submitted 16 January, 2013; originally announced January 2013.

    Comments: Appears in Proceedings of the Sixteenth Conference on Uncertainty in Artificial Intelligence (UAI2000)

    Report number: UAI-P-2000-PG-256-263

  13. arXiv:1212.6246  [pdf, other]

    stat.ML cs.LG

    Gaussian Process Regression with Heteroscedastic or Non-Gaussian Residuals

    Authors: Chunyi Wang, Radford M. Neal

    Abstract: Gaussian Process (GP) regression models typically assume that residuals are Gaussian and have the same variance for all observations. However, applications with input-dependent noise (heteroscedastic residuals) frequently arise in practice, as do applications in which the residuals do not have a Gaussian distribution. In this paper, we propose a GP Regression model with a latent variable that serv…

    Submitted 26 December, 2012; originally announced December 2012.

  14. arXiv:1206.1901  [pdf, ps, other]

    stat.CO physics.comp-ph

    MCMC using Hamiltonian dynamics

    Authors: Radford M. Neal

    Abstract: Hamiltonian dynamics can be used to produce distant proposals for the Metropolis algorithm, thereby avoiding the slow exploration of the state space that results from the diffusive behaviour of simple random-walk proposals. Though originating in physics, Hamiltonian dynamics can be applied to most problems with continuous state spaces by simply introducing fictitious "momentum" variables. A key to…

    Submitted 8 June, 2012; originally announced June 2012.
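
    Sketch: the idea in the abstract above, proposing distant states by simulating Hamiltonian dynamics with a fictitious momentum variable and then applying a Metropolis accept/reject step, can be illustrated in one dimension as follows. The standard normal target, the leapfrog discretization, the step size, and the trajectory length are all assumptions chosen for illustration, and the function names are hypothetical.

```python
import math
import random


def potential(q):
    """Potential energy U(q) = -log density, here for a standard normal."""
    return 0.5 * q * q


def grad_potential(q):
    return q


def leapfrog(q, p, step, n_steps):
    """Simulate Hamiltonian dynamics with the leapfrog discretization."""
    p -= 0.5 * step * grad_potential(q)      # initial half step for momentum
    for i in range(n_steps):
        q += step * p                        # full step for position
        if i < n_steps - 1:
            p -= step * grad_potential(q)    # full step for momentum
    p -= 0.5 * step * grad_potential(q)      # final half step for momentum
    return q, p


def hmc(n_iters, step=0.2, n_steps=10, seed=0):
    rng = random.Random(seed)
    q = 0.0
    samples = []
    for _ in range(n_iters):
        p = rng.gauss(0.0, 1.0)              # fresh fictitious momentum
        q_new, p_new = leapfrog(q, p, step, n_steps)
        h_old = potential(q) + 0.5 * p * p
        h_new = potential(q_new) + 0.5 * p_new * p_new
        # Metropolis decision based on the change in total energy.
        if rng.random() < math.exp(min(0.0, h_old - h_new)):
            q = q_new
        samples.append(q)
    return samples


if __name__ == "__main__":
    draws = hmc(5_000)
    print(f"sample mean: {sum(draws) / len(draws):.3f}")
```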

  15. arXiv:1205.0070  [pdf, ps, other]

    stat.CO physics.comp-ph

    How to view an MCMC simulation as a permutation, with applications to parallel simulation and improved importance sampling

    Authors: Radford M. Neal

    Abstract: Consider a Markov chain defined on a finite state space, X, that leaves invariant the uniform distribution on X, and whose transition probabilities are integer multiples of 1/Q, for some integer Q. I show how a simulation of n transitions of this chain starting at x_0 can be viewed as applying a random permutation on the space XxU, where U={0,1,...,Q-1}, to the start state (x_0,u_0), with u_0 draw…

    Submitted 30 April, 2012; originally announced May 2012.

    Report number: Tech. Rep. No. 1201, Dept. of Statistics, University of Toronto

  16. arXiv:1106.5941  [pdf, other]

    stat.CO

    Split Hamiltonian Monte Carlo

    Authors: Babak Shahbaba, Shiwei Lan, Wesley O. Johnson, Radford M. Neal

    Abstract: We show how the Hamiltonian Monte Carlo algorithm can sometimes be speeded up by "splitting" the Hamiltonian in a way that allows much of the movement around the state space to be done at low computational cost. One context where this is possible is when the log density of the distribution of interest (the potential energy function) can be written as the log of a Gaussian density, which is a quadr…

    Submitted 14 July, 2012; v1 submitted 29 June, 2011; originally announced June 2011.

  17. On Deducing Conditional Independence from d-Separation in Causal Graphs with Feedback (Research Note)

    Authors: R. M. Neal

    Abstract: Pearl and Dechter (1996) claimed that the d-separation criterion for conditional independence in acyclic causal networks also applies to networks of discrete variables that have feedback cycles, provided that the variables of the system are uniquely determined by the random disturbances. I show by example that this is not true in general. Some condition stronger than uniqueness is…

    Submitted 1 June, 2011; originally announced June 2011.

    Journal ref: Journal Of Artificial Intelligence Research, Volume 12, pages 87-91, 2000

  18. arXiv:1101.0387  [pdf, ps, other]

    stat.CO

    MCMC Using Ensembles of States for Problems with Fast and Slow Variables such as Gaussian Process Regression

    Authors: Radford M. Neal

    Abstract: I introduce a Markov chain Monte Carlo (MCMC) scheme in which sampling from a distribution with density pi(x) is done using updates operating on an "ensemble" of states. The current state x is first stochastically mapped to an ensemble, x^{(1)},...,x^{(K)}. This ensemble is then updated using MCMC updates that leave invariant a suitable ensemble density, rho(x^{(1)},...,x^{(K)}), defined in terms…

    Submitted 2 January, 2011; originally announced January 2011.

  19. arXiv:1011.4722  [pdf, other]

    stat.CO

    Slice Sampling with Adaptive Multivariate Steps: The Shrinking-Rank Method

    Authors: Madeleine B. Thompson, Radford M. Neal

    Abstract: The shrinking rank method is a variation of slice sampling that is efficient at sampling from multivariate distributions with highly correlated parameters. It requires that the gradient of the log-density be computable. At each individual step, it approximates the current slice with a Gaussian occupying a shrinking-dimension subspace. The dimension of the approximation is shrunk orthogonally to th…

    Submitted 21 November, 2010; originally announced November 2010.

    ACM Class: G.3

    Journal ref: Proceedings of the 2010 Joint Statistical Meetings, Section on Statistical Computing, pages 3890-3896

  20. arXiv:1003.3201  [pdf, other]

    stat.CO

    Covariance-Adaptive Slice Sampling

    Authors: Madeleine Thompson, Radford M. Neal

    Abstract: We describe two slice sampling methods for taking multivariate steps using the crumb framework. These methods use the gradients at rejected proposals to adapt to the local curvature of the log-density surface, a technique that can produce much better proposals when parameters are highly correlated. We evaluate our methods on four distributions and compare their performance to that of a non-adapt…

    Submitted 16 March, 2010; originally announced March 2010.

    Report number: Tech. Rep. 1002, Dept. of Statistics, Univ. of Toronto

    MSC Class: 65C05

  21. arXiv:0711.4983  [pdf, ps, other]

    stat.ML stat.ME

    A Method for Compressing Parameters in Bayesian Models with Application to Logistic Sequence Prediction Models

    Authors: Longhai Li, Radford M. Neal

    Abstract: Bayesian classification and regression with high order interactions is largely infeasible because Markov chain Monte Carlo (MCMC) would need to be applied with a great many parameters, whose number increases rapidly with the order. In this paper we show how to make it feasible by effectively reducing the number of parameters, exploiting the fact that many interactions have the same values for al…

    Submitted 30 November, 2007; originally announced November 2007.

    Comments: 29 pages

    Journal ref: Bayesian Analysis, 2008, 3(4), 793-822

  22. arXiv:math/0703292  [pdf, ps, other]

    math.ST q-bio.QM

    Nonlinear Models Using Dirichlet Process Mixtures

    Authors: Babak Shahbaba, Radford M. Neal

    Abstract: We introduce a new nonlinear model for classification, in which we model the joint distribution of response variable, y, and covariates, x, non-parametrically using Dirichlet process mixtures. We keep the relationship between y and x linear within each component of the mixture. The overall relationship becomes nonlinear if the mixture contains more than one component. We use simulated data to co…

    Submitted 10 March, 2007; originally announced March 2007.

    MSC Class: 62H30

  23. arXiv:math/0702591  [pdf, ps, other]

    math.ST

    A Method for Avoiding Bias from Feature Selection with Application to Naive Bayes Classification Models

    Authors: Longhai Li, Jianguo Zhang, Radford M. Neal

    Abstract: For many classification and regression problems, a large number of features are available for possible use - this is typical of DNA microarray data on gene expression, for example. Often, for computational or other reasons, only a small subset of these features are selected for use in a model, based on some simple measure such as correlation with the response variable. This procedure may introdu…

    Submitted 20 February, 2007; originally announced February 2007.

    MSC Class: 62H30

  24. arXiv:math/0608592  [pdf, ps, other]

    math.ST astro-ph

    Puzzles of Anthropic Reasoning Resolved Using Full Non-indexical Conditioning

    Authors: Radford M. Neal

    Abstract: I consider the puzzles arising from four interrelated problems involving `anthropic' reasoning, and in particular the `Self-Sampling Assumption' (SSA) - that one should reason as if one were randomly chosen from the set of all observers in a suitable reference class. The problem of Freak Observers might appear to force acceptance of SSA if any empirical evidence is to be credited. The Sleeping B…

    Submitted 23 August, 2006; originally announced August 2006.

  25. arXiv:q-bio/0605015  [pdf, ps, other]

    q-bio.GN

    Gene Function Classification Using Bayesian Models with Hierarchy-Based Priors

    Authors: Babak Shahbaba, Radford M. Neal

    Abstract: We investigate the application of hierarchical classification schemes to the annotation of gene function based on several characteristics of protein sequences including phylogenic descriptors, sequence based attributes, and predicted secondary structure. We discuss three Bayesian models and compare their performance in terms of predictive accuracy. These models are the ordinary multinomial logit…

    Submitted 10 May, 2006; originally announced May 2006.

  26. arXiv:math/0511216  [pdf, ps, other]

    math.ST cond-mat.stat-mech physics.comp-ph

    Estimating Ratios of Normalizing Constants Using Linked Importance Sampling

    Authors: Radford M. Neal

    Abstract: Ratios of normalizing constants for two distributions are needed in both Bayesian statistics, where they are used to compare models, and in statistical physics, where they correspond to differences in free energy. Two approaches have long been used to estimate ratios of normalizing constants. The `simple importance sampling' (SIS) or `free energy perturbation' method uses a sample drawn from jus…

    Submitted 8 November, 2005; originally announced November 2005.
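
    Sketch: the simple importance sampling estimate mentioned in the abstract above can be illustrated with two unnormalized Gaussian densities whose normalizing constants are known, so the answer can be checked. The estimate uses the identity Z1/Z0 = E_0[f1(x)/f0(x)], with the expectation taken over a sample from the first distribution only. The two densities and the function names are assumptions chosen for illustration; the linked importance sampling method of the paper is not shown.

```python
import math
import random


def f0(x):
    """Unnormalized N(0, 1) density; Z0 = sqrt(2*pi)."""
    return math.exp(-0.5 * x * x)


def f1(x):
    """Unnormalized N(0, 0.5^2) density; Z1 = 0.5 * sqrt(2*pi)."""
    return math.exp(-0.5 * (x / 0.5) ** 2)


def sis_ratio(n_samples, seed=0):
    """Estimate Z1/Z0 as the average of f1(x)/f0(x) over draws from N(0, 1)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        x = rng.gauss(0.0, 1.0)   # sample from the distribution defined by f0
        total += f1(x) / f0(x)
    return total / n_samples


if __name__ == "__main__":
    # The true ratio for these two densities is 0.5.
    print(f"estimated Z1/Z0: {sis_ratio(20_000):.3f}")
```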

  27. arXiv:math/0510449  [pdf, ps, other]

    math.ST

    Improving Classification When a Class Hierarchy is Available Using a Hierarchy-Based Prior

    Authors: Babak Shahbaba, Radford M. Neal

    Abstract: We introduce a new method for building classification models when we have prior knowledge of how the classes can be arranged in a hierarchy, based on how easily they can be distinguished. The new method uses a Bayesian form of the multinomial logit (MNL, a.k.a. ``softmax'') model, with a prior that introduces correlations between the parameters for classes that are nearby in the tree. We compare…

    Submitted 20 October, 2005; originally announced October 2005.

  28. arXiv:math/0508060  [pdf, ps, other]

    math.ST

    The Short-Cut Metropolis Method

    Authors: Radford M. Neal

    Abstract: I show how one can modify the random-walk Metropolis MCMC method in such a way that a sequence of modified Metropolis updates takes little computation time when the rejection rate is outside a desired interval. This allows one to effectively adapt the scale of the Metropolis proposal distribution, by performing several such "short-cut" Metropolis sequences with varying proposal stepsizes. Unlike…

    Submitted 2 August, 2005; originally announced August 2005.

  29. arXiv:math/0502099  [pdf, ps, other]

    math.ST math.PR

    Taking Bigger Metropolis Steps by Dragging Fast Variables

    Authors: Radford M. Neal

    Abstract: I show how Markov chain sampling with the Metropolis-Hastings algorithm can be modified so as to take bigger steps when the distribution being sampled from has the characteristic that its density can be quickly recomputed for a new point if this point differs from a previous point only with respect to a subset of 'fast' variables. I show empirically that when using this method, the efficiency of…

    Submitted 6 February, 2005; originally announced February 2005.

    MSC Class: 65C05; 65C60

  30. arXiv:math/0407281  [pdf, ps, other]

    math.PR math.ST

    Improving Asymptotic Variance of MCMC Estimators: Non-reversible Chains are Better

    Authors: Radford M. Neal

    Abstract: I show how any reversible Markov chain on a finite state space that is irreducible, and hence suitable for estimating expectations with respect to its invariant distribution, can be used to construct a non-reversible Markov chain on a related state space that can also be used to estimate these expectations, with asymptotic variance at least as small as that using the reversible chain (typically…

    Submitted 15 July, 2004; originally announced July 2004.

  31. arXiv:math/0305039  [pdf, ps, other]

    math.PR

    Markov Chain Sampling for Non-linear State Space Models Using Embedded Hidden Markov Models

    Authors: Radford M. Neal

    Abstract: I describe a new Markov chain method for sampling from the distribution of the state sequences in a non-linear state space model, given the observation sequence. This method updates all states in the sequence simultaneously using an embedded Hidden Markov model (HMM). An update begins with the creation of a ``pool'' of K states at each time, by applying some Markov chain update to the current st…

    Submitted 1 May, 2003; originally announced May 2003.

  32. arXiv:physics/0009028  [pdf, ps, other]

    physics.data-an physics.comp-ph

    Slice Sampling

    Authors: Radford M. Neal

    Abstract: Markov chain sampling methods that automatically adapt to characteristics of the distribution being sampled can be constructed by exploiting the principle that one can sample from a distribution by sampling uniformly from the region under the plot of its density function. A Markov chain that converges to this uniform distribution can be constructed by alternating uniform sampling in the vertical…

    Submitted 7 September, 2000; originally announced September 2000.

    Comments: 40 pages. Written for statisticians, but of interest to physicists who use Monte Carlo methods
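
    Sketch: the principle in the abstract above, sampling uniformly from the region under the plot of the density by alternating a vertical step with a horizontal step, can be illustrated in one dimension as follows. The way the horizontal interval is found here (a randomly placed interval of width w that is expanded in steps of w and then shrunk toward the current point on rejection) is an assumption about one common variant, as are the standard normal target and the function names.

```python
import math
import random


def log_f(x):
    """Unnormalized log density of a standard normal (illustrative target)."""
    return -0.5 * x * x


def slice_update(x, w, rng):
    """One slice sampling update of x with initial interval width w."""
    # Vertical step: draw a height uniformly under the density at x.
    # 1 - random() lies in (0, 1], so the logarithm is always defined.
    log_y = log_f(x) + math.log(1.0 - rng.random())
    # Place an interval of width w at random around x, then expand each
    # end in steps of w until it lies outside the slice.
    left = x - w * rng.random()
    right = left + w
    while log_f(left) > log_y:
        left -= w
    while log_f(right) > log_y:
        right += w
    # Horizontal step: draw uniformly from the interval, shrinking it
    # toward x whenever the draw falls outside the slice.
    while True:
        x_new = left + (right - left) * rng.random()
        if log_f(x_new) >= log_y:
            return x_new
        if x_new < x:
            left = x_new
        else:
            right = x_new


def slice_sample(n_iters, w=2.0, seed=0):
    rng = random.Random(seed)
    x = 0.0
    samples = []
    for _ in range(n_iters):
        x = slice_update(x, w, rng)
        samples.append(x)
    return samples


if __name__ == "__main__":
    draws = slice_sample(5_000)
    print(f"sample mean: {sum(draws) / len(draws):.3f}")
```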

  33. arXiv:physics/9803008  [pdf, ps, other]

    physics.comp-ph physics.data-an

    Annealed Importance Sampling

    Authors: Radford M. Neal

    Abstract: Simulated annealing - moving from a tractable distribution to a distribution of interest via a sequence of intermediate distributions - has traditionally been used as an inexact method of handling isolated modes in Markov chain samplers. Here, it is shown how one can use the Markov chain transitions for such an annealing sequence to define an importance sampler. The Markov chain aspect allows th…

    Submitted 4 September, 1998; v1 submitted 8 March, 1998; originally announced March 1998.

    Report number: TR 9805, Dept. of Statistics, Toronto
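
    Sketch: a minimal Python illustration of using Markov chain transitions along an annealing sequence to define importance weights, as the abstract above describes. Each run starts with an exact draw from a tractable distribution, accumulates a weight from the ratios of successive intermediate densities, and applies one Metropolis update per intermediate distribution. The geometric sequence of intermediate densities, the two Gaussian endpoints, the number of temperatures, and the function names are assumptions chosen so that the true ratio of normalizing constants (0.5) is known.

```python
import math
import random


def log_f0(x):
    """Tractable start: unnormalized N(0, 1), Z0 = sqrt(2*pi)."""
    return -0.5 * x * x


def log_f1(x):
    """Distribution of interest: unnormalized N(1, 0.5^2), Z1 = 0.5*sqrt(2*pi)."""
    return -0.5 * ((x - 1.0) / 0.5) ** 2


def log_f_beta(x, beta):
    """Intermediate density f0^(1-beta) * f1^beta on the log scale."""
    return (1.0 - beta) * log_f0(x) + beta * log_f1(x)


def annealed_log_weight(n_temps, rng, step=0.5):
    """One annealing run; returns the log importance weight."""
    x = rng.gauss(0.0, 1.0)                  # exact draw from the start
    log_w = 0.0
    for j in range(1, n_temps + 1):
        beta_prev, beta = (j - 1) / n_temps, j / n_temps
        # Weight factor: ratio of successive densities at the current point.
        log_w += log_f_beta(x, beta) - log_f_beta(x, beta_prev)
        # One Metropolis update that leaves the density at beta invariant.
        proposal = x + rng.gauss(0.0, step)
        log_ratio = log_f_beta(proposal, beta) - log_f_beta(x, beta)
        if rng.random() < math.exp(min(0.0, log_ratio)):
            x = proposal
    return log_w


def estimate_ratio(n_runs=1000, n_temps=100, seed=0):
    """Average of the importance weights, an estimate of Z1/Z0."""
    rng = random.Random(seed)
    weights = [math.exp(annealed_log_weight(n_temps, rng)) for _ in range(n_runs)]
    return sum(weights) / n_runs


if __name__ == "__main__":
    print(f"estimated Z1/Z0: {estimate_ratio():.3f}")
```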

  34. arXiv:physics/9701026  [pdf, ps, other]

    physics.data-an

    Monte Carlo Implementation of Gaussian Process Models for Bayesian Regression and Classification

    Authors: Radford M. Neal

    Abstract: Gaussian processes are a natural way of defining prior distributions over functions of one or more input variables. In a simple nonparametric regression problem, where such a function gives the mean of a Gaussian distribution for an observed response, a Gaussian process model can easily be implemented using matrix computations that are feasible for datasets of up to about a thousand cases. Hyper…

    Submitted 27 January, 1997; v1 submitted 27 January, 1997; originally announced January 1997.

    Report number: 9702

  35. arXiv:bayes-an/9506004  [pdf, ps]

    physics.data-an hep-lat

    Suppressing Random Walks in Markov Chain Monte Carlo Using Ordered Overrelaxation

    Authors: R. M. Neal

    Abstract: Markov chain Monte Carlo methods such as Gibbs sampling and simple forms of the Metropolis algorithm typically move about the distribution being sampled via a random walk. For the complex, high-dimensional distributions commonly encountered in Bayesian inference and statistical physics, the distance moved in each iteration of these algorithms will usually be small, because it is difficult or imp…

    Submitted 22 June, 1995; originally announced June 1995.

    Comments: uuencoded compressed postscript (with instructions on decoding)

    Report number: Technical Report 9508

  36. arXiv:hep-lat/9208011  [pdf, ps, other]

    hep-lat

    An Improved Acceptance Procedure for the Hybrid Monte Carlo Algorithm

    Authors: R. M. Neal

    Abstract: The probability of accepting a candidate move in the hybrid Monte Carlo algorithm can be increased by considering a transition to be between windows of several states at the beginning and end of the trajectory, with a state within the selected window being chosen according to the Boltzmann probabilities. The detailed balance condition used to justify the algorithm still holds with this procedure…

    Submitted 20 August, 1992; v1 submitted 12 August, 1992; originally announced August 1992.

    Comments: 15 pages, 4 figures (only one of which is present), New version with corrected LaTex, Submitted to J. of Comp. Physics