-
On integral priors for multiple comparison in Bayesian model selection
Authors:
Diego Salmerón,
Juan Antonio Cano,
Christian P. Robert
Abstract:
Noninformative priors constructed for estimation purposes are usually not appropriate for model selection and testing. The methodology of integral priors was developed to get prior distributions for Bayesian model selection when comparing two models, modifying initial improper reference priors. We propose a generalization of this methodology to more than two models. Our approach adds an artificial copy of each model under comparison by compactifying the parametric space and creating an ergodic Markov chain across all models that returns the integral priors as marginals of the stationary distribution. Besides the guarantee of their existence and the lack of paradoxes attached to estimation reference priors, an additional advantage of this methodology is that the simulation of this Markov chain is straightforward as it only requires simulations of imaginary training samples for all models and from the corresponding posterior distributions. This renders its implementation automatic and generic, both in the nested case and in the nonnested case.
Submitted 20 June, 2024;
originally announced June 2024.
-
A discussion of the paper "Safe testing" by Grünwald, de Heide, and Koolen
Authors:
Joshua Bon,
Christian P Robert
Abstract:
This is a discussion of the paper "Safe testing" by Grünwald, de Heide, and Koolen, read before the Royal Statistical Society at a meeting organized by the Research Section on Wednesday, 24 January 2024.
Submitted 9 February, 2024;
originally announced February 2024.
-
Simulating signed mixtures
Authors:
Julien Stoehr,
Christian P. Robert
Abstract:
Simulating mixtures of distributions with signed weights proves a challenge as standard simulation algorithms are inefficient in handling the negative weights. In particular, the natural representation of mixture variates as associated with latent component indicators is no longer available. We propose here an exact accept-reject algorithm in the general case of finite signed mixtures that relies on optimally pairing positive and negative components and designing a stratified sampling scheme on pairs. We analyze the performance of our approach, relative to the inverse cdf approach, since the cdf of the distribution remains available for standard signed mixtures.
Submitted 30 January, 2024;
originally announced January 2024.
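The inverse cdf baseline mentioned in this abstract can be illustrated with a minimal sketch. This is not the authors' accept-reject algorithm; the two-component signed Gaussian mixture, the function names, and the bisection bounds below are all assumptions chosen for illustration.

```python
import math
import random

def normal_cdf(x, sigma):
    """CDF of a centred normal distribution with standard deviation sigma."""
    return 0.5 * (1.0 + math.erf(x / (sigma * math.sqrt(2.0))))

# Hypothetical signed mixture: f(x) = 2 * N(x; 0, 2^2) - 1 * N(x; 0, 1).
# The weights sum to one and f is non-negative everywhere, so f is a density.
WEIGHTS = (2.0, -1.0)
SIGMAS = (2.0, 1.0)

def signed_mixture_cdf(x):
    """The cdf stays available in closed form despite the negative weight."""
    return sum(w * normal_cdf(x, s) for w, s in zip(WEIGHTS, SIGMAS))

def signed_mixture_quantile(u, lo=-50.0, hi=50.0, tol=1e-10):
    """Invert the cdf by bisection; valid because the cdf is nondecreasing."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if signed_mixture_cdf(mid) < u:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def sample_signed_mixture(n, seed=0):
    """Draw n variates by pushing uniforms through the numerical inverse cdf."""
    rng = random.Random(seed)
    return [signed_mixture_quantile(rng.random()) for _ in range(n)]
```

Each draw costs a few dozen cdf evaluations, which is the kind of overhead an exact accept-reject scheme would aim to avoid.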
-
Asymptotics of approximate Bayesian computation when summary statistics converge at heterogeneous rates
Authors:
Caroline Lawless,
Christian P. Robert,
Judith Rousseau,
Robin J. Ryder
Abstract:
We consider the asymptotic properties of Approximate Bayesian Computation (ABC) for the realistic case of summary statistics with heterogeneous rates of convergence. We allow some statistics to converge faster than the ABC tolerance, other statistics to converge slower, and cover the case where some statistics do not converge at all. We give conditions for the ABC posterior to converge, and provide an explicit representation of the shape of the ABC posterior distribution in our general setting; in particular, we show how the shape of the posterior depends on the number of slow statistics. We then quantify the gain brought by the local linear post-processing step.
Submitted 16 November, 2023;
originally announced November 2023.
-
Insufficient Gibbs Sampling
Authors:
Antoine Luciano,
Christian P. Robert,
Robin J. Ryder
Abstract:
In some applied scenarios, the availability of complete data is restricted, often due to privacy concerns; only aggregated, robust and inefficient statistics derived from the data are made accessible. These robust statistics are not sufficient, but they demonstrate reduced sensitivity to outliers and offer enhanced data protection due to their higher breakdown point. We consider a parametric framework and propose a method to sample from the posterior distribution of parameters conditioned on various robust and inefficient statistics: specifically, the pairs (median, MAD) or (median, IQR), or a collection of quantiles. Our approach leverages a Gibbs sampler and simulates latent augmented data, which facilitates simulation from the posterior distribution of parameters belonging to specific families of distributions. A by-product of these samples from the joint posterior distribution of parameters and data given the observed statistics is that we can estimate Bayes factors based on observed statistics via bridge sampling. We validate and outline the limitations of the proposed methods through toy examples and an application to real-world income data.
Submitted 22 February, 2024; v1 submitted 27 July, 2023;
originally announced July 2023.
-
Sampling using Adaptive Regenerative Processes
Authors:
Hector McKimm,
Andi Q Wang,
Murray Pollock,
Christian P Robert,
Gareth O Roberts
Abstract:
Enriching Brownian motion with regenerations from a fixed regeneration distribution $μ$ at a particular regeneration rate $κ$ results in a Markov process that has a target distribution $π$ as its invariant distribution. For the purpose of Monte Carlo inference, implementing such a scheme requires firstly selection of regeneration distribution $μ$, and secondly computation of a specific constant $C$. Both of these tasks can be very difficult in practice for good performance. We introduce a method for adapting the regeneration distribution, by adding point masses to it. This allows the process to be simulated with as few regenerations as possible and obviates the need to find said constant $C$. Moreover, the choice of fixed $μ$ is replaced with the choice of the initial regeneration distribution, which is considerably less difficult. We establish convergence of this resulting self-reinforcing process and explore its effectiveness at sampling from a number of target distributions. The examples show that adapting the regeneration distribution guards against poor choices of fixed regeneration distribution and can reduce the error of Monte Carlo estimates of expectations of interest, especially when $π$ is skewed.
Submitted 20 February, 2024; v1 submitted 18 October, 2022;
originally announced October 2022.
-
Computing Bayes: From Then 'Til Now'
Authors:
Gael M. Martin,
David T. Frazier,
Christian P. Robert
Abstract:
This paper takes the reader on a journey through the history of Bayesian computation, from the 18th century to the present day. Beginning with the one-dimensional integral first confronted by Bayes in 1763, we highlight the key contributions of: Laplace, Metropolis (and, importantly, his co-authors!), Hammersley and Handscomb, and Hastings, all of which set the foundations for the computational revolution in the late 20th century -- led, primarily, by Markov chain Monte Carlo (MCMC) algorithms. A very short outline of 21st century computational methods -- including pseudo-marginal MCMC, Hamiltonian Monte Carlo, sequential Monte Carlo, and the various `approximate' methods -- completes the paper.
Submitted 1 August, 2022;
originally announced August 2022.
-
The Importance Markov Chain
Authors:
Charly Andral,
Randal Douc,
Hugo Marival,
Christian P. Robert
Abstract:
The Importance Markov chain is a novel algorithm bridging the gap between rejection sampling and importance sampling, moving from one to the other through a tuning parameter. Based on a modified sample of an instrumental Markov chain targeting an instrumental distribution (typically via a MCMC kernel), the Importance Markov chain produces an extended Markov chain where the marginal distribution of the first component converges to the target distribution. For example, when targeting a multimodal distribution, the instrumental distribution can be chosen as a tempered version of the target which allows the algorithm to explore its modes more efficiently. We obtain a Law of Large Numbers and a Central Limit Theorem as well as geometric ergodicity for this extended kernel under mild assumptions on the instrumental kernel. Computationally, the algorithm is easy to implement and preexisting libraries can be used to sample from the instrumental distribution.
Submitted 26 February, 2024; v1 submitted 17 July, 2022;
originally announced July 2022.
-
50 shades of Bayesian testing of hypotheses
Authors:
Christian P Robert
Abstract:
Hypothesis testing and model choice are quintessential questions for statistical inference and while the Bayesian paradigm seems ideally suited for answering these questions, it faces difficulties of its own ranging from prior modelling to calibration, to numerical implementation. This c
Submitted 14 June, 2022;
originally announced June 2022.
-
Evidence estimation in finite and infinite mixture models and applications
Authors:
Adrien Hairault,
Christian P. Robert,
Judith Rousseau
Abstract:
Estimating the model evidence - or marginal likelihood of the data - is a notoriously difficult task for finite and infinite mixture models and we reexamine here different Monte Carlo techniques advocated in the recent literature, as well as novel approaches based on Geyer (1994) reverse logistic regression technique, Chib (1995) algorithm, and Sequential Monte Carlo (SMC). Applications are numerous. In particular, testing for the number of components in a finite mixture model or against the fit of a finite mixture model for a given dataset has long been and still is an issue of much interest, albeit yet missing a fully satisfactory resolution. Using a Bayes factor to find the right number of components K in a finite mixture model is known to provide a consistent procedure. We furthermore establish the consistency of the Bayes factor when comparing a parametric family of finite mixtures against the nonparametric 'strongly identifiable' Dirichlet Process Mixture (DPM) model.
Submitted 11 May, 2022;
originally announced May 2022.
-
AntBO: Towards Real-World Automated Antibody Design with Combinatorial Bayesian Optimisation
Authors:
Asif Khan,
Alexander I. Cowen-Rivers,
Antoine Grosnit,
Derrick-Goh-Xin Deik,
Philippe A. Robert,
Victor Greiff,
Eva Smorodina,
Puneet Rawat,
Kamil Dreczkowski,
Rahmad Akbar,
Rasul Tutunov,
Dany Bou-Ammar,
Jun Wang,
Amos Storkey,
Haitham Bou-Ammar
Abstract:
Antibodies are canonically Y-shaped multimeric proteins capable of highly specific molecular recognition. The CDRH3 region located at the tip of variable chains of an antibody dominates antigen-binding specificity. Therefore, it is a priority to design optimal antigen-specific CDRH3 regions to develop therapeutic antibodies. However, the combinatorial nature of CDRH3 sequence space makes it impossible to search for an optimal binding sequence exhaustively and efficiently using computational approaches. Here, we present AntBO: a combinatorial Bayesian optimisation framework enabling efficient in silico design of the CDRH3 region. Ideally, antibodies are expected to have high target specificity and developability. We introduce a CDRH3 trust region that restricts the search to sequences with favourable developability scores to achieve this goal. For benchmarking, AntBO uses the Absolut! software suite as a black-box oracle to score the target specificity and affinity of designed antibodies in silico in an unconstrained fashion \citep{robert2021one}. The experiments performed for 159 discretised antigens used in Absolut! demonstrate the benefit of AntBO in designing CDRH3 regions with diverse biophysical properties. In under 200 calls to black-box oracle, AntBO can suggest antibody sequences that outperform the best binding sequence drawn from 6.9 million experimentally obtained CDRH3s and a commonly used genetic algorithm baseline. Additionally, AntBO finds very-high affinity CDRH3 sequences in only 38 protein designs whilst requiring no domain knowledge. We conclude AntBO brings automated antibody design methods closer to what is practically viable for in vitro experimentation.
Submitted 14 October, 2022; v1 submitted 29 January, 2022;
originally announced January 2022.
-
Approximating Bayes in the 21st Century
Authors:
Gael M. Martin,
David T. Frazier,
Christian P. Robert
Abstract:
The 21st century has seen an enormous growth in the development and use of approximate Bayesian methods. Such methods produce computational solutions to certain intractable statistical problems that challenge exact methods like Markov chain Monte Carlo: for instance, models with unavailable likelihoods, high-dimensional models, and models featuring large data sets. These approximate methods are the subject of this review. The aim is to help new researchers in particular -- and more generally those interested in adopting a Bayesian approach to empirical work -- distinguish between different approximate techniques; understand the sense in which they are approximate; appreciate when and why particular methods are useful; and see the ways in which they can be combined.
Submitted 20 December, 2021;
originally announced December 2021.
-
Living on the Edge: An Unified Approach to Antithetic Sampling
Authors:
Roberto Casarin,
Radu V. Craiu,
Lorenzo Frattarolo,
Christian P. Robert
Abstract:
We identify recurrent ingredients in the antithetic sampling literature leading to a unified sampling framework. We introduce a new class of antithetic schemes that includes the most used antithetic proposals. This perspective enables the derivation of new properties of the sampling schemes: i) optimality in the Kullback-Leibler sense; ii) closed-form multivariate Kendall's $τ$ and Spearman's $ρ$; iii) ranking in concordance order; and iv) a central limit theorem that characterizes the stochastic behavior of Monte Carlo estimators when the sample size tends to infinity. Finally, we provide applications to Monte Carlo integration and Markov Chain Monte Carlo Bayesian estimation.
Submitted 6 December, 2021; v1 submitted 28 October, 2021;
originally announced October 2021.
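As a point of reference for the antithetic proposals this abstract refers to, here is a minimal sketch of the classical scheme that pairs each uniform draw u with its reflection 1 - u. It illustrates the textbook construction only, not the new class of schemes introduced in the paper; the function names are assumptions.

```python
import random

def plain_mc(f, n, rng):
    """Standard Monte Carlo estimate of E[f(U)], with U uniform on (0, 1)."""
    return sum(f(rng.random()) for _ in range(n)) / n

def antithetic_mc(f, n, rng):
    """Antithetic estimate of E[f(U)] from n // 2 pairs (u, 1 - u).

    For monotone f the two evaluations in a pair are negatively correlated,
    which lowers the variance compared with n independent draws.
    """
    half = n // 2
    if half == 0:
        raise ValueError("n must be at least 2")
    total = 0.0
    for _ in range(half):
        u = rng.random()
        total += 0.5 * (f(u) + f(1.0 - u))
    return total / half
```

For a linear integrand such as f(u) = u, every antithetic pair averages to exactly 1/2, so the estimator has zero variance; for other monotone integrands the reduction is partial.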
-
Rao-Blackwellization in the MCMC era
Authors:
Christian P. Robert,
Gareth O. Roberts
Abstract:
Rao-Blackwellization is a notion often occurring in the MCMC literature, with possibly different meanings and connections with the original Rao--Blackwell theorem (Rao, 1945 and Blackwell, 1947), including a reduction of the variance of the resulting Monte Carlo approximations. This survey reviews some of the meanings of the term.
Submitted 4 January, 2021;
originally announced January 2021.
-
Computing Bayes: Bayesian Computation from 1763 to the 21st Century
Authors:
Gael M. Martin,
David T. Frazier,
Christian P. Robert
Abstract:
The Bayesian statistical paradigm uses the language of probability to express uncertainty about the phenomena that generate observed data. Probability distributions thus characterize Bayesian analysis, with the rules of probability used to transform prior probability distributions for all unknowns - parameters, latent variables, models - into posterior distributions, subsequent to the observation of data. Conducting Bayesian analysis requires the evaluation of integrals in which these probability distributions appear. Bayesian computation is all about evaluating such integrals in the typical case where no analytical solution exists. This paper takes the reader on a chronological tour of Bayesian computation over the past two and a half centuries. Beginning with the one-dimensional integral first confronted by Bayes in 1763, through to recent problems in which the unknowns number in the millions, we place all computational problems into a common framework, and describe all computational methods using a common notation. The aim is to help new researchers in particular - and more generally those interested in adopting a Bayesian approach to empirical work - make sense of the plethora of computational techniques that are now on offer; understand when and why different methods are useful; and see the links that do exist, between them all.
Submitted 5 December, 2020; v1 submitted 14 April, 2020;
originally announced April 2020.
-
Generalized Poisson Difference Autoregressive Processes
Authors:
Giulia Carallo,
Roberto Casarin,
Christian P. Robert
Abstract:
This paper introduces a new stochastic process with values in the set Z of signed integers. The increments of the process are Poisson differences and the dynamics have an autoregressive structure. We study the properties of the process and exploit the thinning representation to derive stationarity conditions and the stationary distribution of the process. We provide a Bayesian inference method and an efficient posterior approximation procedure based on Monte Carlo. Numerical illustrations on both simulated and real data show the effectiveness of the proposed inference.
Submitted 11 February, 2020;
originally announced February 2020.
-
Markov Chain Monte Carlo Methods, a survey with some frequent misunderstandings
Authors:
Christian P. Robert,
Wu Changye
Abstract:
In this chapter, we review some of the most standard MCMC tools used in Bayesian computation, along with vignettes on standard misunderstandings of these approaches taken from Q&A's on the forum Cross Validated answered by the first author.
Submitted 17 January, 2020;
originally announced January 2020.
-
Parallelising MCMC via Random Forests
Authors:
Wu Changye,
Christian P. Robert
Abstract:
For Bayesian computation in big data contexts, the divide-and-conquer MCMC concept splits the whole data set into batches, runs MCMC algorithms separately over each batch to produce samples of parameters, and combines them to produce an approximation of the target distribution. In this article, we embed random forests into this framework and use each subposterior/partial-posterior as a proposal distribution to implement importance sampling. Unlike the existing divide-and-conquer MCMC, our methods are based on scaled subposteriors, whose scale factors are not necessarily restricted to being equal to one or to the number of subsets. Through several experiments, we show that our methods work well with models ranging from Gaussian cases to strongly non-Gaussian cases, and include model misspecification.
Submitted 21 November, 2019;
originally announced November 2019.
-
Component-wise approximate Bayesian computation via Gibbs-like steps
Authors:
Grégoire Clarté,
Christian P. Robert,
Robin Ryder,
Julien Stoehr
Abstract:
Approximate Bayesian computation methods are useful for generative models with intractable likelihoods. These methods are however sensitive to the dimension of the parameter space, requiring exponentially increasing resources as this dimension grows. To tackle this difficulty, we explore a Gibbs version of the ABC approach that runs component-wise approximate Bayesian computation steps aimed at the corresponding conditional posterior distributions, and based on summary statistics of reduced dimensions. While lacking the standard justifications for the Gibbs sampler, the resulting Markov chain is shown to converge in distribution under some partial independence conditions. The associated stationary distribution can further be shown to be close to the true posterior distribution and some hierarchical versions of the proposed mechanism enjoy a closed form limiting distribution. Experiments also demonstrate the gain in efficiency brought by the Gibbs version over the standard solution.
Submitted 17 September, 2020; v1 submitted 31 May, 2019;
originally announced May 2019.
-
Approximate Bayesian computation with the Wasserstein distance
Authors:
Espen Bernton,
Pierre E. Jacob,
Mathieu Gerber,
Christian P. Robert
Abstract:
A growing number of generative statistical models do not permit the numerical evaluation of their likelihood functions. Approximate Bayesian computation (ABC) has become a popular approach to overcome this issue, in which one simulates synthetic data sets given parameters and compares summaries of these data sets with the corresponding observed values. We propose to avoid the use of summaries and the ensuing loss of information by instead using the Wasserstein distance between the empirical distributions of the observed and synthetic data. This generalizes the well-known approach of using order statistics within ABC to arbitrary dimensions. We describe how recently developed approximations of the Wasserstein distance allow the method to scale to realistic data sizes, and propose a new distance based on the Hilbert space-filling curve. We provide a theoretical study of the proposed method, describing consistency as the threshold goes to zero while the observations are kept fixed, and concentration properties as the number of observations grows. Various extensions to time series data are discussed. The approach is illustrated on various examples, including univariate and multivariate g-and-k distributions, a toggle switch model from systems biology, a queueing model, and a Lévy-driven stochastic volatility model.
Submitted 9 May, 2019;
originally announced May 2019.
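The link between order statistics and the Wasserstein distance noted in this abstract is easiest to see in one dimension, where the distance between two empirical distributions of equal size reduces to comparing sorted samples. The sketch below covers only that univariate, equal-size case; the function name and the restriction to equal sample sizes are assumptions, and it does not reproduce the multivariate approximations or the Hilbert-curve distance proposed in the paper.

```python
def wasserstein_1d(xs, ys, p=1):
    """p-Wasserstein distance between two equal-size univariate samples.

    In one dimension the optimal coupling matches the i-th order statistic of
    xs with the i-th order statistic of ys, so sorting both samples suffices.
    """
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("samples must be non-empty and of equal size")
    xs_sorted, ys_sorted = sorted(xs), sorted(ys)
    mean_cost = sum(abs(a - b) ** p for a, b in zip(xs_sorted, ys_sorted)) / len(xs)
    return mean_cost ** (1.0 / p)
```

Within an ABC rejection step, a simulated data set would be kept when this distance to the observed data falls below the chosen threshold.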
-
Monotonic Gaussian Process for Spatio-Temporal Disease Progression Modeling in Brain Imaging Data
Authors:
Clement Abi Nader,
Nicholas Ayache,
Philippe Robert,
Marco Lorenzi
Abstract:
We introduce a probabilistic generative model for disentangling spatio-temporal disease trajectories from series of high-dimensional brain images. The model is based on spatio-temporal matrix factorization, where inference on the sources is constrained by anatomically plausible statistical priors. To model realistic trajectories, the temporal sources are defined as monotonic and time-reparametrized Gaussian Processes. To account for the non-stationarity of brain images, we model the spatial sources as sparse codes convolved at multiple scales. The method was tested on synthetic data, comparing favourably with standard blind source separation approaches. The application on large-scale imaging data from a clinical study allows disentangling differential temporal progression patterns mapping brain regions key to neurodegeneration, while revealing a disease-specific time scale associated with the clinical diagnosis.
Submitted 10 October, 2019; v1 submitted 28 February, 2019;
originally announced February 2019.
-
Model Selection for Mixture Models - Perspectives and Strategies
Authors:
Gilles Celeux,
Sylvia Fruewirth-Schnatter,
Christian P. Robert
Abstract:
Determining the number G of components in a finite mixture distribution is an important and difficult inference issue. This is a most important question, because statistical inference about the resulting model is highly sensitive to the value of G. Selecting an erroneous value of G may produce a poor density estimate. This is also a most difficult question from a theoretical perspective as it relates to unidentifiability issues of the mixture model. This is further a most relevant question from a practical viewpoint since the meaning of the number of components G is strongly related to the modelling purpose of a mixture distribution. We distinguish in this chapter between selecting G as a density estimation problem in Section 2 and selecting G in a model-based clustering framework in Section 3. Both sections discuss frequentist as well as Bayesian approaches. We present here some of the Bayesian solutions to the different interpretations of picking the "right" number of components in a mixture, before concluding on the ill-posed nature of the question.
Submitted 24 December, 2018;
originally announced December 2018.
-
Computational Solutions for Bayesian Inference in Mixture Models
Authors:
Gilles Celeux,
Kaniav Kamary,
Gertraud Malsiner-Walli,
Jean-Michel Marin,
Christian P. Robert
Abstract:
This chapter surveys the most standard Monte Carlo methods available for simulating from a posterior distribution associated with a mixture and conducts some experiments about the robustness of the Gibbs sampler in high dimensional Gaussian settings. This is a chapter prepared for the forthcoming 'Handbook of Mixture Analysis'.
Submitted 18 December, 2018;
originally announced December 2018.
-
Faster Hamiltonian Monte Carlo by Learning Leapfrog Scale
Authors:
Changye Wu,
Julien Stoehr,
Christian P. Robert
Abstract:
Hamiltonian Monte Carlo samplers have become standard algorithms for MCMC implementations, as opposed to more basic versions, but they still require some amount of tuning and calibration. Exploiting the U-turn criterion of the NUTS algorithm (Hoffman and Gelman, 2014), we propose a version of HMC that relies on the distribution of the integration time of the associated leapfrog integrator. Using in addition the primal-dual averaging method for tuning the step size of the integrator, we achieve an essentially calibration free version of HMC. When compared with the original NUTS on several benchmarks, this algorithm exhibits a significantly improved efficiency.
Submitted 27 February, 2019; v1 submitted 10 October, 2018;
originally announced October 2018.
-
Rethinking the Effective Sample Size
Authors:
Víctor Elvira,
Luca Martino,
Christian P. Robert
Abstract:
The effective sample size (ESS) is widely used in sample-based simulation methods for assessing the quality of a Monte Carlo approximation of a given distribution and of related integrals. In this paper, we revisit the approximation of the ESS in the specific context of importance sampling (IS). The derivation of this approximation, which we will denote as $\widehat{\text{ESS}}$, is partially available in Kong (1992). This approximation has been widely used in the last 25 years due to its simplicity as a practical rule of thumb in a wide variety of importance sampling methods. However, we show that the multiple assumptions and approximations in the derivation of $\widehat{\text{ESS}}$ make it difficult to consider it even as a reasonable approximation of the ESS. We extend the discussion of the $\widehat{\text{ESS}}$ to the multiple importance sampling (MIS) setting, we display numerical examples, and we discuss several avenues for developing alternative metrics. This paper does not cover the use of ESS for MCMC algorithms.
Submitted 31 March, 2022; v1 submitted 11 September, 2018;
originally announced September 2018.
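For readers unfamiliar with the rule of thumb the abstract above refers to, $\widehat{\text{ESS}}$ in importance sampling is commonly written as the inverse of the sum of squared normalized weights, $\widehat{\text{ESS}} = 1 / \sum_n \bar{w}_n^2$. The following is a minimal sketch of that common form, not code from the paper; the function name `ess_hat` and the choice to take log-weights as input are illustrative assumptions.

```python
import numpy as np


def ess_hat(log_weights):
    """Rule-of-thumb ESS approximation: 1 / sum of squared normalized weights.

    `log_weights` are unnormalized importance log-weights (hypothetical input
    convention chosen here for numerical stability).
    """
    log_w = np.asarray(log_weights, dtype=float)
    # Subtract the maximum before exponentiating to avoid overflow.
    w = np.exp(log_w - np.max(log_w))
    w_bar = w / w.sum()  # normalized weights, summing to one
    return 1.0 / np.sum(w_bar ** 2)


# Equal weights give the nominal sample size; one dominant weight gives about 1.
equal = ess_hat(np.zeros(10))
degenerate = ess_hat([0.0, -1000.0])
```

The value ranges from 1 (a single sample carries all the weight) to the number of samples (all weights equal), which is part of why it became a popular diagnostic despite the limitations the paper discusses.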
-
The Coordinate Sampler: A Non-Reversible Gibbs-like MCMC Sampler
Authors:
Changye Wu,
Christian P. Robert
Abstract:
In this article, we derive a novel non-reversible, continuous-time Markov chain Monte Carlo (MCMC) sampler, called Coordinate Sampler, based on a piecewise deterministic Markov process (PDMP), which can be seen as a variant of the Zigzag sampler. In addition to proving a theoretical validation for this new sampling algorithm, we show that the Markov chain it induces exhibits geometrical ergodicity convergence, for distributions whose tails decay at least as fast as an exponential distribution and at most as fast as a Gaussian distribution. Several numerical examples highlight that our coordinate sampler is more efficient than the Zigzag sampler, in terms of effective sample size.
Submitted 11 April, 2019; v1 submitted 10 September, 2018;
originally announced September 2018.
-
Alzheimer's Disease Modelling and Staging through Independent Gaussian Process Analysis of Spatio-Temporal Brain Changes
Authors:
Clement Abi Nader,
Nicholas Ayache,
Philippe Robert,
Marco Lorenzi
Abstract:
Alzheimer's disease (AD) is characterized by complex and largely unknown progression dynamics affecting the brain's morphology. Although the disease evolution spans decades, to date we cannot rely on long-term data to model the pathological progression, since most of the available measures are on a short-term scale. It is therefore difficult to understand and quantify the temporal progression patterns affecting the brain regions across the AD evolution. In this work, we tackle this problem by presenting a generative model based on probabilistic matrix factorization across temporal and spatial sources. The proposed method addresses the problem of disease progression modelling by introducing clinically-inspired statistical priors. To promote smoothness in time and model plausible pathological evolutions, the temporal sources are defined as monotonic and independent Gaussian Processes. We also estimate an individual time-shift parameter for each patient to automatically position him/her along the sources time-axis. To encode the spatial continuity of the brain sub-structures, the spatial sources are modeled as Gaussian random fields. We test our algorithm on grey matter maps extracted from brain structural images. The experiments highlight differential temporal progression patterns mapping brain regions key to the AD pathology, and reveal a disease-specific time scale associated with the decline of volumetric biomarkers across clinical stages.
Submitted 20 August, 2018;
originally announced August 2018.
-
Multi-Channel Stochastic Variational Inference for the Joint Analysis of Heterogeneous Biomedical Data in Alzheimer's Disease
Authors:
Luigi Antelmi,
Nicholas Ayache,
Philippe Robert,
Marco Lorenzi
Abstract:
The joint analysis of biomedical data in Alzheimer's Disease (AD) is important for better clinical diagnosis and to understand the relationship between biomarkers. However, jointly accounting for heterogeneous measures poses important challenges related to the modeling of the variability and the interpretability of the results. These issues are here addressed by proposing a novel multi-channel stochastic generative model. We assume that a latent variable generates the data observed through different channels (e.g., clinical scores, imaging, ...) and describe an efficient way to estimate jointly the distribution of both latent variable and data generative process. Experiments on synthetic data show that the multi-channel formulation allows superior data reconstruction as opposed to the single channel one. Moreover, the derived lower bound of the model evidence represents a promising model selection criterion. Experiments on AD data show that the model parameters can be used for unsupervised patient stratification and for the joint interpretation of the heterogeneous observations. Because of its general and flexible formulation, we believe that the proposed method can find important applications as a general data fusion technique.
Submitted 10 August, 2018;
originally announced August 2018.
-
Accelerating MCMC Algorithms
Authors:
Christian P. Robert,
Victor Elvira,
Nick Tawn,
Changye Wu
Abstract:
Markov chain Monte Carlo algorithms are used to simulate from complex statistical distributions by way of a local exploration of these distributions. This local feature avoids heavy requests on understanding the nature of the target, but it also potentially induces a lengthy exploration of this target, with a requirement on the number of simulations that grows with the dimension of the problem and with the complexity of the data behind it. Several techniques are available towards accelerating the convergence of these Monte Carlo algorithms, either at the exploration level (as in tempering, Hamiltonian Monte Carlo and partly deterministic methods) or at the exploitation level (with Rao-Blackwellisation and scalable methods).
Submitted 11 April, 2018; v1 submitted 8 April, 2018;
originally announced April 2018.
-
Abandon Statistical Significance
Authors:
Blakeley B. McShane,
David Gal,
Andrew Gelman,
Christian Robert,
Jennifer L. Tackett
Abstract:
We discuss problems the null hypothesis significance testing (NHST) paradigm poses for replication and more broadly in the biomedical and social sciences as well as how these problems remain unresolved by proposals involving modified p-value thresholds, confidence intervals, and Bayes factors. We then discuss our own proposal, which is to abandon statistical significance. We recommend dropping the NHST paradigm--and the p-value thresholds intrinsic to it--as the default statistical paradigm for research, publication, and discovery in the biomedical and social sciences. Specifically, we propose that the p-value be demoted from its threshold screening role and instead, treated continuously, be considered along with currently subordinate factors (e.g., related prior evidence, plausibility of mechanism, study design and data quality, real world costs and benefits, novelty of finding, and other factors that vary by research domain) as just one among many pieces of evidence. We have no desire to "ban" p-values or other purely statistical measures. Rather, we believe that such measures should not be thresholded and that, thresholded or not, they should not take priority over the currently subordinate factors. We also argue that it seldom makes sense to calibrate evidence as a function of p-values or other purely statistical measures. We offer recommendations for how our proposal can be implemented in the scientific publication process as well as in statistical decision making more broadly.
Submitted 8 September, 2018; v1 submitted 21 September, 2017;
originally announced September 2017.
-
Better together? Statistical learning in models made of modules
Authors:
Pierre E. Jacob,
Lawrence M. Murray,
Chris C. Holmes,
Christian P. Robert
Abstract:
In modern applications, statisticians are faced with integrating heterogeneous data modalities relevant for an inference, prediction, or decision problem. In such circumstances, it is convenient to use a graphical model to represent the statistical dependencies, via a set of connected "modules", each relating to a specific data modality, and drawing on specific domain expertise in their development. In principle, given data, the conventional statistical update then allows for coherent uncertainty quantification and information propagation through and across the modules. However, misspecification of any module can contaminate the estimate and update of others, often in unpredictable ways. In various settings, particularly when certain modules are trusted more than others, practitioners have preferred to avoid learning with the full model in favor of approaches that restrict the information propagation between modules, for example by restricting propagation to only particular directions along the edges of the graph. In this article, we investigate why these modular approaches might be preferable to the full model in misspecified settings. We propose principled criteria to choose between modular and full-model approaches. The question arises in many applied settings, including large stochastic dynamical systems, meta-analysis, epidemiological models, air pollution models, pharmacokinetics-pharmacodynamics, and causal inference with propensity scores.
Submitted 29 August, 2017;
originally announced August 2017.
-
Model Misspecification in ABC: Consequences and Diagnostics
Authors:
David T. Frazier,
Christian P. Robert,
Judith Rousseau
Abstract:
We analyze the behavior of approximate Bayesian computation (ABC) when the model generating the simulated data differs from the actual data generating process; i.e., when the data simulator in ABC is misspecified. We demonstrate both theoretically and in simple, but practically relevant, examples that when the model is misspecified different versions of ABC can yield substantially different results. Our theoretical results demonstrate that even though the model is misspecified, under regularity conditions, the accept/reject ABC approach concentrates posterior mass on an appropriately defined pseudo-true parameter value. However, under model misspecification the ABC posterior does not yield credible sets with valid frequentist coverage and has non-standard asymptotic behavior. In addition, we examine the theoretical behavior of the popular local regression adjustment to ABC under model misspecification and demonstrate that this approach concentrates posterior mass on a completely different pseudo-true value than accept/reject ABC. Using our theoretical results, we suggest two approaches to diagnose model misspecification in ABC. All theoretical results and diagnostics are illustrated in a simple running example.
Submitted 9 July, 2019; v1 submitted 6 August, 2017;
originally announced August 2017.
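As a point of reference for the accept/reject ABC scheme analysed in the abstract above, the basic algorithm can be sketched as follows. This is an illustrative toy under stated assumptions, not the authors' running example: the Normal model with unit variance, the N(0, 10^2) prior, the sample mean as summary statistic, and the names `abc_reject` and `eps` are all hypothetical choices made here.

```python
import numpy as np


def abc_reject(y_obs, n_draws, eps, rng):
    """Accept/reject ABC for the mean of a Normal(theta, 1) model (toy setup).

    Keeps the prior draws whose simulated summary (the sample mean) falls
    within `eps` of the observed summary.
    """
    s_obs = y_obs.mean()
    # Assumed prior for this sketch: theta ~ Normal(0, 10^2).
    theta = rng.normal(0.0, 10.0, size=n_draws)
    # Simulate one pseudo-dataset per prior draw, same size as the observed data.
    sims = rng.normal(theta[:, None], 1.0, size=(n_draws, y_obs.size))
    s_sim = sims.mean(axis=1)
    # Retain the draws whose summaries match the observed one up to tolerance.
    return theta[np.abs(s_sim - s_obs) <= eps]


rng = np.random.default_rng(0)
y_obs = rng.normal(2.0, 1.0, size=50)
accepted = abc_reject(y_obs, n_draws=20000, eps=0.1, rng=rng)
```

The accepted draws approximate the ABC posterior. Under misspecification (data that the simulator cannot reproduce for any parameter value), the abstract indicates that this scheme and its regression-adjusted variant can concentrate on different pseudo-true values.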
-
Generalized Bouncy Particle Sampler
Authors:
Changye Wu,
Christian P. Robert
Abstract:
As a special example of a piecewise deterministic Markov process, the bouncy particle sampler (BPS) is a rejection-free, irreversible Markov chain Monte Carlo algorithm that can draw samples from a target distribution efficiently. We generalize the bouncy particle sampler in terms of its transition dynamics. In BPS, the transition dynamic at event time is deterministic, but in the generalized version (GBPS), it is random. With the help of this randomness, GBPS can overcome the reducibility problem of BPS without refreshment.
Submitted 18 June, 2017; v1 submitted 15 June, 2017;
originally announced June 2017.
-
Average of Recentered Parallel MCMC for Big Data
Authors:
Changye Wu,
Christian P. Robert
Abstract:
In a big data context, traditional MCMC methods, such as Metropolis-Hastings algorithms and hybrid Monte Carlo, scale poorly because of their need to evaluate the likelihood over the whole data set at each iteration. In order to resurrect MCMC methods, numerous approaches belonging to two categories, divide-and-conquer and subsampling, have been proposed. In this article, we study parallel MCMC and propose a new combination method in the divide-and-conquer framework. Compared with some parallel MCMC methods, such as consensus Monte Carlo and the Weierstrass sampler, our method runs MCMC on rescaled subposteriors instead of sampling from the subposteriors themselves, but shares the same computation cost in the parallel stage. We also give the mathematical justification of our method and show its performance in several models. Besides, even though our new method is proposed in a parametric framework, it can be applied to non-parametric cases without difficulty.
Submitted 18 June, 2017; v1 submitted 15 June, 2017;
originally announced June 2017.
-
Jeffreys priors for mixture estimation: properties and alternatives
Authors:
Clara Grazian,
Christian P. Robert
Abstract:
While Jeffreys priors usually are well-defined for the parameters of mixtures of distributions, they are not available in closed form. Furthermore, they often are improper priors. Hence, they have never been used to draw inference on the mixture parameters. The implementation and the properties of Jeffreys priors in several mixture settings are studied. It is shown that the associated posterior distributions most often are improper. Nevertheless, the Jeffreys prior for the mixture weights conditionally on the parameters of the mixture components will be shown to have the property of conservativeness with respect to the number of components in the case of overfitted mixtures, and it can therefore be used as a default prior in this context.
Submitted 12 December, 2017; v1 submitted 6 June, 2017;
originally announced June 2017.
-
Some discussions on the Read Paper "Beyond subjective and objective in statistics" by A. Gelman and C. Hennig
Authors:
Gilles Celeux,
Jack Jewson,
Julie Josse,
Jean-Michel Marin,
Christian P. Robert
Abstract:
This note is a collection of several discussions of the paper "Beyond subjective and objective in statistics", read by A. Gelman and C. Hennig to the Royal Statistical Society on April 12, 2017, and to appear in the Journal of the Royal Statistical Society, Series A.
Submitted 10 May, 2017;
originally announced May 2017.
-
On parameter estimation with the Wasserstein distance
Authors:
Espen Bernton,
Pierre E. Jacob,
Mathieu Gerber,
Christian P. Robert
Abstract:
Statistical inference can be performed by minimizing, over the parameter space, the Wasserstein distance between model distributions and the empirical distribution of the data. We study asymptotic properties of such minimum Wasserstein distance estimators, complementing results derived by Bassetti, Bodini and Regazzini in 2006. In particular, our results cover the misspecified setting, in which the data-generating process is not assumed to be part of the family of distributions described by the model. Our results are motivated by recent applications of minimum Wasserstein estimators to complex generative models. We discuss some difficulties arising in the approximation of these estimators and illustrate their behavior in several numerical experiments. Two of our examples are taken from the literature on approximate Bayesian computation and have likelihood functions that are not analytically tractable. Two other examples involve misspecified models.
Submitted 9 May, 2019; v1 submitted 18 January, 2017;
originally announced January 2017.
-
Some comments about A Bayesian criterion for singular models by M. Drton and M. Plummer
Authors:
Christian P. Robert,
Judith Rousseau
Abstract:
These are written comments about the Read Paper A Bayesian criterion for singular models by M. Drton and M. Plummer, read to the Royal Statistical Society on October 5, 2016. The discussion was delivered by Judith Rousseau.
Submitted 8 October, 2016;
originally announced October 2016.
-
Some comments about "Penalising model component complexity" by Simpson et al. (2017)
Authors:
Christian P. Robert,
Judith Rousseau
Abstract:
This note discusses the paper "Penalising model component complexity" by Simpson et al. (2017). While we acknowledge the highly novel approach to prior construction and commend the authors for setting new-encompassing principles that will Bayesian modelling, and while we perceive the potential connection with other branches of the literature, we remain uncertain as to what extent the principles exposed in the paper can be developed outside specific models, given their lack of precision. The very notions of model component, base model, and overfitting prior are for instance conceptual rather than mathematical, and we thus fear the concept of penalised complexity may not go further than extending first-guess priors into larger families, thus failing to establish reference priors on a novel sound ground.
Submitted 22 September, 2016;
originally announced September 2016.
-
Asymptotic Properties of Approximate Bayesian Computation
Authors:
David T. Frazier,
Gael M. Martin,
Christian P. Robert,
Judith Rousseau
Abstract:
Approximate Bayesian computation allows for statistical analysis in models with intractable likelihoods. In this paper we consider the asymptotic behaviour of the posterior distribution obtained by this method. We give general results on the rate at which the posterior distribution concentrates on sets containing the true parameter, its limiting shape, and the asymptotic distribution of the posterior mean. These results hold under given rates for the tolerance used within the method, mild regularity conditions on the summary statistics, and a condition linked to identification of the true parameters. Implications for practitioners are discussed.
Submitted 8 May, 2018; v1 submitted 23 July, 2016;
originally announced July 2016.
-
ABC random forests for Bayesian parameter inference
Authors:
Louis Raynal,
Jean-Michel Marin,
Pierre Pudlo,
Mathieu Ribatet,
Christian P. Robert,
Arnaud Estoup
Abstract:
This preprint has been reviewed and recommended by Peer Community In Evolutionary Biology (http://dx.doi.org/10.24072/pci.evolbiol.100036). Approximate Bayesian computation (ABC) has grown into a standard methodology that manages Bayesian inference for models associated with intractable likelihood functions. Most ABC implementations require the preliminary selection of a vector of informative statistics summarizing raw data. Furthermore, in almost all existing implementations, the tolerance level that separates acceptance from rejection of simulated parameter values needs to be calibrated. We propose to conduct likelihood-free Bayesian inferences about parameters with no prior selection of the relevant components of the summary statistics and bypassing the derivation of the associated tolerance level. The approach relies on the random forest methodology of Breiman (2001) applied in a (non-parametric) regression setting. We advocate the derivation of a new random forest for each component of the parameter vector of interest. When compared with earlier ABC solutions, this method offers significant gains in terms of robustness to the choice of the summary statistics, does not depend on any type of tolerance level, and is a good trade-off in terms of quality of point estimator precision and credible interval estimations for a given computing time. We illustrate the performance of our methodological proposal and compare it with earlier ABC methods on a Normal toy example and a population genetics example dealing with human population evolution. All methods designed here have been incorporated in the R package abcrf (version 1.7) available on CRAN.
Submitted 2 November, 2018; v1 submitted 18 May, 2016;
originally announced May 2016.
-
Auxiliary Likelihood-Based Approximate Bayesian Computation in State Space Models
Authors:
Gael M. Martin,
Brendan P. M. McCabe,
David T. Frazier,
Worapree Maneesoonthorn,
Christian P. Robert
Abstract:
A computationally simple approach to inference in state space models is proposed, using approximate Bayesian computation (ABC). ABC avoids evaluation of an intractable likelihood by matching summary statistics for the observed data with statistics computed from data simulated from the true process, based on parameter draws from the prior. Draws that produce a 'match' between observed and simulated summaries are retained, and used to estimate the inaccessible posterior. With no reduction to a low-dimensional set of sufficient statistics being possible in the state space setting, we define the summaries as the maximum of an auxiliary likelihood function, and thereby exploit the asymptotic sufficiency of this estimator for the auxiliary parameter vector. We derive conditions under which this approach - including a computationally efficient version based on the auxiliary score - achieves Bayesian consistency. To reduce the well-documented inaccuracy of ABC in multi-parameter settings, we propose the separate treatment of each parameter dimension using an integrated likelihood technique. Three stochastic volatility models for which exact Bayesian inference is either computationally challenging, or infeasible, are used for illustration. We demonstrate that our approach compares favorably against an extensive set of approximate and exact comparators. An empirical illustration completes the paper.
Submitted 2 December, 2018; v1 submitted 27 April, 2016;
originally announced April 2016.
-
Some comments about James Watson's and Chris Holmes' "Approximate Models and Robust Decisions": Nonparametric Bayesian clay for robust decision bricks
Authors:
Christian P. Robert,
Judith Rousseau
Abstract:
This note discusses Watson and Holmes (2016) and their pro- posals towards more robust Bayesian decisions. While we acknowledge and commend the authors for setting new and all-encompassing prin- ciples of Bayesian robustness, and we appreciate the strong anchoring of those within a decision-theoretic referential, we remain uncertain as to which extent such principles can be applied outside binary…
▽ More
This note discusses Watson and Holmes (2016) and their pro- posals towards more robust Bayesian decisions. While we acknowledge and commend the authors for setting new and all-encompassing prin- ciples of Bayesian robustness, and we appreciate the strong anchoring of those within a decision-theoretic referential, we remain uncertain as to which extent such principles can be applied outside binary de- cisions. We also wonder at the ultimate relevance of Kullback-Leibler neighbourhoods to characterise robustness and favour extensions along non-parametric axes.
Submitted 9 April, 2016; v1 submitted 30 March, 2016;
originally announced March 2016.
-
Weakly informative reparameterisations for location-scale mixtures
Authors:
Kaniav Kamary,
Jeong Eun Lee,
Christian P. Robert
Abstract:
While mixtures of Gaussian distributions have been studied for more than a century (Pearson, 1894), the construction of a reference Bayesian analysis of those models still remains unsolved, with a general prohibition of the usage of improper priors (Frühwirth-Schnatter, 2006) due to the ill-posed nature of such statistical objects. This difficulty is usually bypassed by an empirical Bayes resolution (Richardson and Green, 1997). By creating a new parameterisation centred on the mean and possibly the variance of the mixture distribution itself, we manage to develop here a weakly informative prior for a wide class of mixtures with an arbitrary number of components. We demonstrate that some posterior distributions associated with this prior and a minimal sample size are proper. We provide MCMC implementations that exhibit the expected exchangeability. We only study here the univariate case, the extension to multivariate location-scale mixtures being currently under study. An R package called Ultimixt is associated with this paper.
Submitted 31 July, 2017; v1 submitted 6 January, 2016;
originally announced January 2016.
-
Jeffreys priors for mixture estimation
Authors:
Clara Grazian,
Christian Robert
Abstract:
While Jeffreys priors usually are well-defined for the parameters of mixtures of distributions, they are not available in closed form. Furthermore, they often are improper priors. Hence, they have never been used to draw inference on the mixture parameters. We study in this paper the implementation and the properties of Jeffreys priors in several mixture settings, show that the associated posterior distributions most often are improper, and then propose a noninformative alternative for the analysis of mixtures.
Submitted 20 December, 2015; v1 submitted 10 November, 2015;
originally announced November 2015.
-
On Consistency of Approximate Bayesian Computation
Authors:
David T. Frazier,
Gael M. Martin,
Christian P. Robert
Abstract:
Approximate Bayesian computation (ABC) methods have become increasingly prevalent of late, facilitating as they do the analysis of intractable, or challenging, statistical problems. With the initial focus being primarily on the practical import of ABC, exploration of its formal statistical properties has begun to attract more attention. The aim of this paper is to establish general conditions under which ABC methods are Bayesian consistent, in the sense of producing draws that yield a degenerate posterior distribution at the true parameter (vector) asymptotically (in the sample size). We derive conditions under which arbitrary summary statistics yield consistent inference in the Bayesian sense, with these conditions linked to identification of the true parameters. Using simple illustrative examples that have featured in the literature, we demonstrate that identification, and hence consistency, is unlikely to be achieved in many cases, and propose a simple diagnostic procedure that can indicate the presence of this problem. We also formally explore the link between consistency and the use of auxiliary models within ABC, and illustrate the subsequent results in the Lotka-Volterra predator-prey model.
Submitted 21 August, 2015;
originally announced August 2015.
-
The expected demise of the Bayes factor
Authors:
Christian P. Robert
Abstract:
This note is a discussion commenting on the paper by Ly et al. on "Harold Jeffreys's Default Bayes Factor Hypothesis Tests: Explanation, Extension, and Application in Psychology" and on the perceived shortcomings of the classical Bayesian approach to testing, while reporting on an alternative approach advanced by Kamary, Mengersen, Robert and Rousseau (2014, arXiv:1412.2044) as a solution to this quintessential inference problem.
Submitted 27 July, 2015; v1 submitted 27 June, 2015;
originally announced June 2015.
-
Three discussions of the paper "sequential quasi-Monte Carlo sampling", by M. Gerber and N. Chopin
Authors:
Julyan Arbel,
Igor Prunster,
Christian P. Robert,
Robin J. Ryder
Abstract:
This is a collection of three written discussions of the paper "sequential quasi-Monte Carlo sampling" by M. Gerber and N. Chopin, following the presentation given before the Royal Statistical Society in London on December 10th, 2014.
Submitted 26 May, 2015; v1 submitted 24 May, 2015;
originally announced May 2015.
-
The Metropolis-Hastings algorithm
Authors:
Christian P. Robert
Abstract:
This short note is a self-contained and basic introduction to the Metropolis-Hastings algorithm, this ubiquitous tool used for producing dependent simulations from an arbitrary distribution. The document illustrates the principles of the methodology on simple examples with R codes and provides references to the recent extensions of the method.
Submitted 27 January, 2016; v1 submitted 8 April, 2015;
originally announced April 2015.
-
Likelihood-free Model Choice
Authors:
Jean-Michel Marin,
Pierre Pudlo,
Arnaud Estoup,
Christian P. Robert
Abstract:
This document is an invited chapter covering the specificities of ABC model choice, intended for the forthcoming Handbook of ABC by Sisson, Fan, and Beaumont (2017). Beyond exposing the potential pitfalls of ABC-based posterior probabilities, the review emphasizes mostly the solution proposed by Pudlo et al. (2016) on the use of random forests for aggregating summary statistics and for estimating the posterior probability of the most likely model via a secondary random forest.
Submitted 16 September, 2016; v1 submitted 26 March, 2015;
originally announced March 2015.