
The Impact of Loss Estimation on Gibbs Measures

Authors: David T. Frazier, Jeremias Knoblauch, Christopher Drovandi

Abstract: In recent years, the shortcomings of Bayes posteriors as inferential devices have received increased attention. A popular strategy for fixing them has been to instead target a Gibbs measure based on losses that connect a parameter of interest to observed data. While existing theory for such inference procedures relies on these losses being analytically available, in many situations these losses must be stochastically estimated using pseudo-observations. The current paper fills this research gap and derives the first asymptotic theory for Gibbs measures based on estimated losses. Our findings reveal that the number of pseudo-observations required to accurately approximate the exact Gibbs measure depends on the rates at which the bias and variance of the estimated loss converge to zero. These results are particularly consequential for the emerging field of generalised Bayesian inference, for estimated intractable likelihoods, and for biased pseudo-marginal approaches. We apply our results to three Gibbs measures that have been proposed to deal with intractable likelihoods and model misspecification.

Submitted 24 April, 2024; originally announced April 2024.

arXiv:2311.15485 [pdf, other]

Calibrated Generalized Bayesian Inference

Authors: David T. Frazier, Christopher Drovandi, Robert Kohn

Abstract: We provide a simple and general solution to the fundamental open problem of inaccurate uncertainty quantification of Bayesian inference in misspecified or approximate models, and of generalized Bayesian posteriors more generally. While existing solutions are based on explicit Gaussian posterior approximations, or computationally onerous post-processing procedures, we demonstrate that correct uncertainty quantification can be achieved by substituting the usual posterior with an alternative posterior that conveys the same information. This solution applies to both likelihood-based and loss-based posteriors, and we formally demonstrate the reliable uncertainty quantification of this approach. The new approach is demonstrated through a range of examples, including generalized linear models, and doubly intractable models.

Submitted 26 November, 2023; originally announced November 2023.

Comments: This paper is a substantially revised version of arXiv:2302.06031v1. This revised version has a slightly different focus, additional examples, and theoretical results, as well as different authors

arXiv:2311.01021 [pdf, ps, other]

ABC-based Forecasting in State Space Models

Authors: Chaya Weerasinghe, Ruben Loaiza-Maya, Gael M. Martin, David T. Frazier

Abstract: Approximate Bayesian Computation (ABC) has gained popularity as a method for conducting inference and forecasting in complex models, most notably those which are intractable in some sense. In this paper we use ABC to produce probabilistic forecasts in state space models (SSMs). Whilst ABC-based forecasting in correctly-specified SSMs has been studied, the misspecified case has not been investigated, and it is that case which we emphasize. We invoke recent principles of 'focused' Bayesian prediction, whereby Bayesian updates are driven by a scoring rule that rewards predictive accuracy; the aim being to produce predictives that perform well in that rule, despite misspecification. Two methods are investigated for producing the focused predictions. In a simulation setting, 'coherent' predictions are in evidence for both methods: the predictive constructed via the use of a particular scoring rule predicts best according to that rule. Importantly, both focused methods typically produce more accurate forecasts than an exact, but misspecified, predictive. An empirical application to a truly intractable SSM completes the paper.

Submitted 2 November, 2023; originally announced November 2023.

arXiv:2308.05263 [pdf, other]

Solving the Forecast Combination Puzzle

Authors: David T. Frazier, Ryan Covey, Gael M. Martin, Donald Poskitt

Abstract: We demonstrate that the forecasting combination puzzle is a consequence of the methodology commonly used to produce forecast combinations. By the combination puzzle, we refer to the empirical finding that predictions formed by combining multiple forecasts in ways that seek to optimize forecast performance often do not out-perform more naive, e.g. equally-weighted, approaches. In particular, we demonstrate that, due to the manner in which such forecasts are typically produced, tests that aim to discriminate between the predictive accuracy of competing combination strategies can have low power, and can lack size control, leading to an outcome that favours the naive approach. We show that this poor performance is due to the behavior of the corresponding test statistic, which has a non-standard asymptotic distribution under the null hypothesis of no inferior predictive accuracy, rather than the standard normal distribution that is typically adopted. In addition, we demonstrate that the low power of such predictive accuracy tests in the forecast combination setting can be completely avoided if more efficient estimation strategies are used in the production of the combinations, when feasible. We illustrate these findings both in the context of forecasting a functional of interest and in terms of predictive densities. A short empirical example using daily financial returns exemplifies how researchers can avoid the puzzle in practical settings.

Submitted 9 August, 2023; originally announced August 2023.

arXiv:2305.08429 [pdf, other]

Bayesian inference for misspecified generative models

Authors: David J. Nott, Christopher Drovandi, David T. Frazier

Abstract: Bayesian inference is a powerful tool for combining information in complex settings, a task of increasing importance in modern applications. However, Bayesian inference with a flawed model can produce unreliable conclusions. This review discusses approaches to performing Bayesian inference when the model is misspecified, where by misspecified we mean that the analyst is unwilling to act as if the model is correct. Much has been written about this topic, and in most cases we do not believe that a conventional Bayesian analysis is meaningful when there is serious model misspecification. Nevertheless, in some cases it is possible to use a well-specified model to give meaning to a Bayesian analysis of a misspecified model and we will focus on such cases. Three main classes of methods are discussed: restricted likelihood methods, which use a model based on a non-sufficient summary of the original data; modular inference methods, which use a model constructed from coupled submodels, some of which are correctly specified; and the use of a reference model to construct a projected posterior or predictive distribution for a simplified model considered to be useful for prediction or interpretation.

Submitted 18 May, 2023; v1 submitted 15 May, 2023; originally announced May 2023.

Comments: Review paper under submission for Annual Review of Statistics and its Application

arXiv:2305.05120 [pdf, other]

Bayesian Synthetic Likelihood

Authors: David T. Frazier, Christopher Drovandi, David J. Nott

Abstract: Bayesian statistics is concerned with conducting posterior inference for the unknown quantities in a given statistical model. Conventional Bayesian inference requires the specification of a probabilistic model for the observed data, and the construction of the resulting likelihood function. However, sometimes the model is so complicated that evaluation of the likelihood is infeasible, which renders exact Bayesian inference impossible. Bayesian synthetic likelihood (BSL) is a posterior approximation procedure that can be used to conduct inference in situations where the likelihood is intractable, but where simulation from the model is straightforward. In this entry, we give a high-level presentation of BSL, and its extensions aimed at delivering scalable and robust posterior inferences.

Submitted 10 May, 2023; v1 submitted 8 May, 2023; originally announced May 2023.

Comments: This manuscript will eventually appear in Wiley StatsRef-Statistics Reference Online, and should not be confused with the original article on Bayesian Synthetic Likelihood by Price, Drovandi, Nott and Lee (2018)

arXiv:2302.06031 [pdf, ps, other]

Reliable Bayesian Inference in Misspecified Models

Authors: David T. Frazier, Robert Kohn, Christopher Drovandi, David Gunawan

Abstract: We provide a general solution to a fundamental open problem in Bayesian inference, namely poor uncertainty quantification, from a frequency standpoint, of Bayesian methods in misspecified models. While existing solutions are based on explicit Gaussian approximations of the posterior, or computationally onerous post-processing procedures, we demonstrate that correct uncertainty quantification can be achieved by replacing the usual posterior with an intuitive approximate posterior. Critically, our solution is applicable to likelihood-based, and generalized, posteriors as well as cases where the likelihood is intractable and must be estimated. We formally demonstrate the reliable uncertainty quantification of our proposed approach, and show that valid uncertainty quantification is not an asymptotic result but occurs even in small samples. We illustrate this approach through a range of examples, including linear, and generalized, mixed effects models.

Submitted 12 February, 2023; originally announced February 2023.

arXiv:2301.13368 [pdf, other]

Misspecification-robust Sequential Neural Likelihood for Simulation-based Inference

Authors: Ryan P. Kelly, David J. Nott, David T. Frazier, David J. Warne, Chris Drovandi

Abstract: Simulation-based inference techniques are indispensable for parameter estimation of mechanistic and simulable models with intractable likelihoods. While traditional statistical approaches like approximate Bayesian computation and Bayesian synthetic likelihood have been studied under well-specified and misspecified settings, they often suffer from inefficiencies due to wasted model simulations. Neural approaches, such as sequential neural likelihood (SNL), avoid this wastage by utilising all model simulations to train a neural surrogate for the likelihood function. However, the performance of SNL under model misspecification is unreliable and can result in overconfident posteriors centred around an inaccurate parameter estimate. In this paper, we propose a novel SNL method which, through the incorporation of additional adjustment parameters, is robust to model misspecification and capable of identifying features of the data that the model is not able to recover. We demonstrate the efficacy of our approach through several illustrative examples, where our method gives more accurate point estimates and uncertainty quantification than SNL.

Submitted 7 March, 2024; v1 submitted 30 January, 2023; originally announced January 2023.

arXiv:2301.10911 [pdf, other]

Accurate semi-modular posterior inference with a user-defined loss function

Authors: David T. Frazier, David J. Nott

Abstract: Bayesian inference has widely acknowledged advantages in many problems, but it can also be unreliable if the model is misspecified. Bayesian modular inference is concerned with inference in complex models which have been specified through a collection of coupled sub-models. The sub-models are called modules in the literature, and they often arise from modeling different data sources, or from combining domain knowledge from different disciplines. When some modules are misspecified, cutting feedback is a widely used Bayesian modular inference method which ensures that information from suspect model components is not used in making inferences about parameters in correctly specified modules. However, in general settings it is difficult to decide when this "cut posterior" is preferable to the exact posterior. When misspecification is not severe, cutting feedback may increase the uncertainty in Bayesian posterior inference greatly without reducing estimation bias substantially. This motivates semi-modular inference methods, which avoid the binary cut of cutting feedback approaches. In this work, using a local model misspecification framework, we provide the first precise formulation of the bias-variance trade-off that has motivated the literature on semi-modular inference. We then implement a mixture-based semi-modular inference approach, demonstrating theoretically that it delivers inferences that are more accurate, in terms of a user-defined loss function, than if either the cut or full posterior were used by themselves. The new method is demonstrated in a number of applications.

Submitted 28 July, 2023; v1 submitted 25 January, 2023; originally announced January 2023.

arXiv:2212.03471 [pdf, ps, other]

Bayesian Forecasting in Economics and Finance: A Modern Review

Authors: Gael M. Martin, David T. Frazier, Worapree Maneesoonthorn, Ruben Loaiza-Maya, Florian Huber, Gary Koop, John Maheu, Didier Nibbering, Anastasios Panagiotelis

Abstract: The Bayesian statistical paradigm provides a principled and coherent approach to probabilistic forecasting. Uncertainty about all unknowns that characterize any forecasting problem -- model, parameters, latent states -- is able to be quantified explicitly, and factored into the forecast distribution via the process of integration or averaging. Allied with the elegance of the method, Bayesian forecasting is now underpinned by the burgeoning field of Bayesian computation, which enables Bayesian forecasts to be produced for virtually any problem, no matter how large, or complex. The current state of play in Bayesian forecasting in economics and finance is the subject of this review. The aim is to provide the reader with an overview of modern approaches to the field, set in some historical context; and with sufficient computational detail given to assist the reader with implementation.

Submitted 28 July, 2023; v1 submitted 7 December, 2022; originally announced December 2022.

Comments: The paper is now published online at: https://doi.org/10.1016/j.ijforecast.2023.05.002

arXiv:2212.02658 [pdf, other]

Better Together: pooling information in likelihood-free inference

Authors: David T. Frazier, Christopher Drovandi, David J. Nott

Abstract: Likelihood-free inference (LFI) methods, such as approximate Bayesian computation (ABC), are now routinely applied to conduct inference in complex models. While the application of LFI is now commonplace, the choice of which summary statistics to use in the construction of the posterior remains an open question that is fraught with both practical and theoretical challenges. Instead of choosing a single vector of summaries on which to base inference, we suggest a new pooled posterior and show how to optimally combine inferences from different LFI posteriors. This pooled approach to inference obviates the need to choose a single vector of summaries, or even a single LFI algorithm, and delivers guaranteed inferential accuracy without requiring the computational resources associated with sampling LFI posteriors in high dimensions. We illustrate this approach through a series of benchmark examples considered in the LFI literature.

Submitted 5 December, 2022; originally announced December 2022.

arXiv:2210.12589 [pdf, ps, other]

Testing model specification in approximate Bayesian computation

Authors: Andrés Ramírez-Hassan, David T. Frazier

Abstract: We present a procedure to diagnose model misspecification in situations where inference is performed using approximate Bayesian computation. We demonstrate theoretically and empirically that this procedure can consistently detect the presence of model misspecification. Our examples demonstrate that this approach delivers good finite-sample performance and is computationally less onerous than existing approaches, all of which require re-running the inference algorithm. An empirical application to modelling exchange rate log returns using a g-and-k distribution completes the paper.

Submitted 22 October, 2022; originally announced October 2022.

arXiv:2208.00646 [pdf, other]

doi 10.1214/22-STS876

Computing Bayes: From Then 'Til Now

Authors: Gael M. Martin, David T. Frazier, Christian P. Robert

Abstract: This paper takes the reader on a journey through the history of Bayesian computation, from the 18th century to the present day. Beginning with the one-dimensional integral first confronted by Bayes in 1763, we highlight the key contributions of: Laplace, Metropolis (and, importantly, his co-authors!), Hammersley and Handscomb, and Hastings, all of which set the foundations for the computational revolution in the late 20th century -- led, primarily, by Markov chain Monte Carlo (MCMC) algorithms. A very short outline of 21st century computational methods -- including pseudo-marginal MCMC, Hamiltonian Monte Carlo, sequential Monte Carlo, and the various 'approximate' methods -- completes the paper.

Submitted 1 August, 2022; originally announced August 2022.

Comments: Material that appeared in an earlier paper, 'Computing Bayes: Bayesian Computation from 1763 to the 21st Century' (arXiv:2004.06425) has been broken up into two separate papers: this historical overview of, and timeline for, all computational developments is retained; and a secondary paper (arXiv:2112.10342), which provides a more detailed review of 21st century

Journal ref: Statistical Science, 2023

arXiv:2207.06655 [pdf, other]

Improving the Accuracy of Marginal Approximations in Likelihood-Free Inference via Localisation

Authors: Christopher Drovandi, David J Nott, David T Frazier

Abstract: Likelihood-free methods are an essential tool for performing inference for implicit models which can be simulated from, but for which the corresponding likelihood is intractable. However, common likelihood-free methods do not scale well to a large number of model parameters. A promising approach to high-dimensional likelihood-free inference involves estimating low-dimensional marginal posteriors by conditioning only on summary statistics believed to be informative for the low-dimensional component, and then combining the low-dimensional approximations in some way. In this paper, we demonstrate that such low-dimensional approximations can be surprisingly poor in practice for seemingly intuitive summary statistic choices. We describe an idealized low-dimensional summary statistic that is, in principle, suitable for marginal estimation. However, a direct approximation of the idealized choice is difficult in practice. We thus suggest an alternative approach to marginal estimation which is easier to implement and automate. Given an initial choice of low-dimensional summary statistic that might only be informative about a marginal posterior location, the new method improves performance by first crudely localising the posterior approximation using all the summary statistics to ensure global identifiability, followed by a second step that hones in on an accurate low-dimensional approximation using the low-dimensional summary statistic. We show that the posterior this approach targets can be represented as a logarithmic pool of posterior distributions based on the low-dimensional and full summary statistics, respectively. The good performance of our method is illustrated in several examples.

Submitted 14 July, 2022; originally announced July 2022.

Comments: 30 pages, 9 figures

arXiv:2206.02376 [pdf, other]

The Impact of Sampling Variability on Estimated Combinations of Distributional Forecasts

Authors: Ryan Zischke, Gael M. Martin, David T. Frazier, D. S. Poskitt

Abstract: We investigate the performance and sampling variability of estimated forecast combinations, with particular attention given to the combination of forecast distributions. Unknown parameters in the forecast combination are optimized according to criterion functions based on proper scoring rules, which are chosen to reward the form of forecast accuracy that matters for the problem at hand, and forecast performance is measured using the out-of-sample expectation of said scoring rule. Our results provide novel insights into the behavior of estimated forecast combinations. Firstly, we show that, asymptotically, the sampling variability in the performance of standard forecast combinations is determined solely by estimation of the constituent models, with estimation of the combination weights contributing no sampling variability whatsoever, at first order. Secondly, we show that, if computationally feasible, forecast combinations produced in a single step -- in which the constituent model and combination function parameters are estimated jointly -- have superior predictive accuracy and lower sampling variability than standard forecast combinations -- where constituent model and combination function parameters are estimated in two steps. These theoretical insights are demonstrated numerically, both in simulation settings and in an extensive empirical illustration using a time series of S&P500 returns.

Submitted 6 June, 2022; originally announced June 2022.

Comments: 42 pages, 8 figures, 1 table; the views expressed in this paper are those of the authors alone, and do not in any way represent the Methodology Division, the Australian Bureau of Statistics, the Australian Public Service, or the Australian Government

arXiv:2203.09782 [pdf, other]

Modularized Bayesian analyses and cutting feedback in likelihood-free inference

Authors: Atlanta Chakraborty, David J. Nott, Christopher Drovandi, David T. Frazier, Scott A. Sisson

Abstract: There has been much recent interest in modifying Bayesian inference for misspecified models so that it is useful for specific purposes. One popular modified Bayesian inference method is "cutting feedback" which can be used when the model consists of a number of coupled modules, with only some of the modules being misspecified. Cutting feedback methods represent the full posterior distribution in terms of conditional and sequential components, and then modify some terms in such a representation based on the modular structure for specification or computation of a modified posterior distribution. The main goal of this is to avoid contamination of inferences for parameters of interest by misspecified modules. Computation for cut posterior distributions is challenging, and here we consider cutting feedback for likelihood-free inference based on Gaussian mixture approximations to the joint distribution of parameters and data summary statistics. We exploit the fact that marginal and conditional distributions of a Gaussian mixture are Gaussian mixtures to give explicit approximations to marginal or conditional posterior distributions so that we can easily approximate cut posterior analyses. The mixture approach allows repeated approximation of posterior distributions for different data based on a single mixture fit, which is important for model checks which aid in the decision of whether to "cut". A semi-modular approach to likelihood-free inference where feedback is partially cut is also developed. The benefits of the method are illustrated in two challenging examples, a collective cell spreading model and a continuous time model for asset returns with jumps.

Submitted 18 March, 2022; originally announced March 2022.

arXiv:2202.09968 [pdf, other]

Cutting feedback and modularized analyses in generalized Bayesian inference

Authors: David T. Frazier, David J. Nott

Abstract: This work considers Bayesian inference under misspecification for complex statistical models comprised of simpler submodels, referred to as modules, that are coupled together. Such "multi-modular" models often arise when combining information from different data sources, where there is a module for each data source. When some of the modules are misspecified, the challenges of Bayesian inference under misspecification can sometimes be addressed by using "cutting feedback" methods, which modify conventional Bayesian inference by limiting the influence of unreliable modules. Here we investigate cutting feedback methods in the context of generalized posterior distributions, which are built from arbitrary loss functions, and present novel findings on their behaviour. We make three main contributions. First, we describe how cutting feedback methods can be defined in the generalized Bayes setting, and discuss the appropriate scaling of the loss functions for different modules to each other and the prior. Second, we derive a novel result about the large sample behaviour of the posterior for a given module's parameters conditional on the parameters of other modules. This formally justifies the use of conditional Laplace approximations, which provide better approximations of conditional posterior distributions compared to conditional distributions from a Laplace approximation of the joint posterior. Our final contribution leverages the large sample approximations of our second contribution to provide convenient diagnostics for understanding the sensitivity of inference to the coupling of the modules, and to implement a new semi-modular posterior approach for conducting robust Bayesian modular inference. The usefulness of the methodology is illustrated in several benchmark examples from the literature on cut model inference.

Submitted 1 August, 2023; v1 submitted 20 February, 2022; originally announced February 2022.

arXiv:2112.12841 [pdf, other]

ABC of the Future

Authors: Henri Pesonen, Umberto Simola, Alvaro Köhn-Luque, Henri Vuollekoski, Xiaoran Lai, Arnoldo Frigessi, Samuel Kaski, David T. Frazier, Worapree Maneesoonthorn, Gael M. Martin, Jukka Corander

Abstract: Approximate Bayesian computation (ABC) has advanced in two decades from a seminal idea to a practically applicable inference tool for simulator-based statistical models, which are becoming increasingly popular in many research domains. The computational feasibility of ABC for practical applications has been recently boosted by adopting techniques from machine learning to build surrogate models for the approximate likelihood or posterior and by the introduction of a general-purpose software platform with several advanced features, including automated parallelization. Here we demonstrate the strengths of the advances in ABC by going beyond the typical benchmark examples and considering real applications in astronomy, infectious disease epidemiology, personalised cancer therapy and financial prediction. We anticipate that the emerging success of ABC in producing actual added value and quantitative insights in the real world will continue to inspire a plethora of further applications across different fields of science, social science and technology.

Submitted 3 October, 2022; v1 submitted 23 December, 2021; originally announced December 2021.

Comments: 29 pages, 7 figures; update: added details to some of the sections, corrected typos and clarified notation

arXiv:2112.10342 [pdf, other]

doi 10.1214/22-STS875

Approximating Bayes in the 21st Century

Authors: Gael M. Martin, David T. Frazier, Christian P. Robert

Abstract: The 21st century has seen an enormous growth in the development and use of approximate Bayesian methods. Such methods produce computational solutions to certain intractable statistical problems that challenge exact methods like Markov chain Monte Carlo: for instance, models with unavailable likelihoods, high-dimensional models, and models featuring large data sets. These approximate methods are the subject of this review. The aim is to help new researchers in particular -- and more generally those interested in adopting a Bayesian approach to empirical work -- distinguish between different approximate techniques; understand the sense in which they are approximate; appreciate when and why particular methods are useful; and see the ways in which they can be combined.

Submitted 20 December, 2021; originally announced December 2021.

Comments: arXiv admin note: text overlap with arXiv:2004.06425

Journal ref: Statistical Science, 2023

arXiv:2106.12262 [pdf, other]

Variational Bayes in State Space Models: Inferential and Predictive Accuracy

Authors: David T. Frazier, Ruben Loaiza-Maya, Gael M. Martin

Abstract: Using theoretical and numerical results, we document the accuracy of commonly applied variational Bayes methods across a range of state space models. The results demonstrate that, in terms of accuracy on fixed parameters, there is a clear hierarchy in terms of the methods, with approaches that do not approximate the states yielding superior accuracy over methods that do. We also document numerically that the inferential discrepancies between the various methods often yield only small discrepancies in predictive accuracy over small out-of-sample evaluation periods. Nevertheless, in certain settings, these predictive discrepancies can become meaningful over a longer out-of-sample period. This finding indicates that the invariance of predictive results to inferential inaccuracy, which has been an oft-touted point made by practitioners seeking to justify the use of variational inference, is not ubiquitous and must be assessed on a case-by-case basis.

Submitted 23 February, 2022; v1 submitted 23 June, 2021; originally announced June 2021.

arXiv:2104.14054 [pdf, ps, other]

Loss-Based Variational Bayes Prediction

Authors: David T. Frazier, Ruben Loaiza-Maya, Gael M. Martin, Bonsoo Koo

Abstract: We propose a new approach to Bayesian prediction that caters for models with a large number of parameters and is robust to model misspecification. Given a class of high-dimensional (but parametric) predictive models, this new approach constructs a posterior predictive using a variational approximation to a generalized posterior that is directly focused on predictive accuracy. The theoretical behavior of the new prediction approach is analyzed and a form of optimality demonstrated. Applications to both simulated and empirical data using high-dimensional Bayesian neural network and autoregressive mixture models demonstrate that the approach provides more accurate results than various alternatives, including misspecified likelihood-based predictions.

Submitted 12 May, 2022; v1 submitted 28 April, 2021; originally announced April 2021.

arXiv:2104.03436 [pdf, other]

Synthetic Likelihood in Misspecified Models: Consequences and Corrections

Authors: David T. Frazier, Christopher Drovandi, David J. Nott

Abstract: We analyse the behaviour of the synthetic likelihood (SL) method when the model generating the simulated data differs from the actual data generating process. One of the most common methods to obtain SL-based inferences is via the Bayesian posterior distribution, with this method often referred to as Bayesian synthetic likelihood (BSL). We demonstrate that when the model is misspecified, the BSL posterior can be poorly behaved, placing significant posterior mass on values of the model parameters that do not represent the true features observed in the data. Theoretical results demonstrate that in misspecified models the BSL posterior can display a wide range of behaviours depending on the level of model misspecification, including being asymptotically non-Gaussian. Our results suggest that a recently proposed robust BSL approach can ameliorate this behavior and deliver reliable posterior inference under model misspecification. We document all theoretical results using a simple running example.

Submitted 7 April, 2021; originally announced April 2021.

arXiv:2103.02407 [pdf, other]

A Comparison of Likelihood-Free Methods With and Without Summary Statistics

Authors: Christopher Drovandi, David T Frazier

Abstract: Likelihood-free methods are useful for parameter estimation of complex models with intractable likelihood functions for which it is easy to simulate data. Such models are prevalent in many disciplines including genetics, biology, ecology and cosmology. Likelihood-free methods avoid explicit likelihood evaluation by finding parameter values of the model that generate data close to the observed data. The general consensus has been that it is most efficient to compare datasets on the basis of a low dimensional informative summary statistic, incurring information loss in favour of reduced dimensionality. More recently, researchers have explored various approaches for efficiently comparing empirical distributions in the likelihood-free context in an effort to avoid data summarisation. This article provides a review of these full data distance based approaches, and conducts the first comprehensive comparison of such methods, both qualitatively and empirically. We also conduct a substantive empirical comparison with summary statistic based likelihood-free methods. The discussion and results offer guidance to practitioners considering a likelihood-free approach. Whilst we find the best approach to be problem dependent, we also find that the full data distance based approaches are promising and warrant further development. We discuss some opportunities for future research in this space. Computer code to implement the methods discussed in this paper can be found at https://github.com/cdrovandi/ABC-dist-compare.

Submitted 25 March, 2022; v1 submitted 3 March, 2021; originally announced March 2021.

Comments: To appear in Statistics and Computing
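
The contrast drawn in the abstract above, between comparing datasets through a low-dimensional summary statistic and comparing them through a full-data distance, can be illustrated with a minimal rejection-sampling sketch. Everything here is an assumption made for illustration and is not taken from the paper or its repository: the toy Gaussian simulator, the uniform prior, the tolerances, and the names `simulate`, `rejection_abc`, `summary_distance` and `full_data_distance`. The full-data distance used is the one-dimensional 1-Wasserstein distance between equal-size samples.

```python
import random
import statistics


def simulate(theta, n, rng):
    """Toy simulator (an assumption): n draws from Normal(theta, 1)."""
    return [rng.gauss(theta, 1.0) for _ in range(n)]


def summary_distance(x, y):
    """Summary-statistic comparison: absolute difference of sample means."""
    return abs(statistics.fmean(x) - statistics.fmean(y))


def full_data_distance(x, y):
    """Full-data comparison: 1-Wasserstein distance between two equal-size
    one-dimensional samples (mean absolute difference of sorted values)."""
    return statistics.fmean(abs(a - b) for a, b in zip(sorted(x), sorted(y)))


def rejection_abc(observed, prior_sampler, distance, tolerance, n_draws, rng):
    """Keep prior draws whose simulated data lie within `tolerance` of the
    observed data under the supplied distance."""
    accepted = []
    for _ in range(n_draws):
        theta = prior_sampler(rng)
        simulated = simulate(theta, len(observed), rng)
        if distance(observed, simulated) <= tolerance:
            accepted.append(theta)
    return accepted


def prior(rng):
    """Hypothetical prior: Uniform(-5, 5)."""
    return rng.uniform(-5.0, 5.0)


rng = random.Random(0)
observed = simulate(2.0, 100, rng)  # pseudo-observed data, true theta = 2

post_summary = rejection_abc(observed, prior, summary_distance, 0.2, 5000, rng)
post_full = rejection_abc(observed, prior, full_data_distance, 0.3, 5000, rng)
```

Both calls return a list of accepted parameter draws; only the distance function changes, which is the design choice the review compares. In this toy model the sample mean is sufficient, so the two approaches should behave similarly; the interesting differences arise when no low-dimensional sufficient summary exists.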

arXiv:2012.03854 [pdf, other]

doi 10.1016/j.ijforecast.2021.11.001

Forecasting: theory and practice

Authors: Fotios Petropoulos, Daniele Apiletti, Vassilios Assimakopoulos, Mohamed Zied Babai, Devon K. Barrow, Souhaib Ben Taieb, Christoph Bergmeir, Ricardo J. Bessa, Jakub Bijak, John E. Boylan, Jethro Browell, Claudio Carnevale, Jennifer L. Castle, Pasquale Cirillo, Michael P. Clements, Clara Cordeiro, Fernando Luiz Cyrino Oliveira, Shari De Baets, Alexander Dokumentov, Joanne Ellison, Piotr Fiszeder, Philip Hans Franses, David T. Frazier, Michael Gilliland, M. Sinan Gönül, et al. (55 additional authors not shown)

Abstract: Forecasting has always been at the forefront of decision making and planning. The uncertainty that surrounds the future is both exciting and challenging, with individuals and organisations seeking to minimise risks and maximise utilities. The large number of forecasting applications calls for a diverse set of forecasting methods to tackle real-life challenges. This article provides a non-systematic review of the theory and the practice of forecasting. We provide an overview of a wide range of theoretical, state-of-the-art models, methods, principles, and approaches to prepare, produce, organise, and evaluate forecasts. We then demonstrate how such theoretical concepts are applied in a variety of real-life contexts. We do not claim that this review is an exhaustive list of methods and applications. However, we wish that our encyclopedic presentation will offer a point of reference for the rich work that has been undertaken over the last decades, with some key insights for the future of forecasting theory and practice. Given its encyclopedic nature, the intended mode of reading is non-linear. We offer cross-references to allow the readers to navigate through the various topics. We complement the theoretical concepts and applications covered by large lists of free or open-source software implementations and publicly-available databases.

Submitted 5 January, 2022; v1 submitted 4 December, 2020; originally announced December 2020.

arXiv:2009.09592 [pdf, ps, other]

Optimal probabilistic forecasts: When do they work?

Authors: Gael M. Martin, Rubén Loaiza-Maya, David T. Frazier, Worapree Maneesoonthorn, Andrés Ramírez Hassan

Abstract: Proper scoring rules are used to assess the out-of-sample accuracy of probabilistic forecasts, with different scoring rules rewarding distinct aspects of forecast performance. Herein, we re-investigate the practice of using proper scoring rules to produce probabilistic forecasts that are 'optimal' according to a given score, and assess when their out-of-sample accuracy is superior to alternative forecasts, according to that score. Particular attention is paid to relative predictive performance under misspecification of the predictive model. Using numerical illustrations, we document several novel findings within this paradigm that highlight the important interplay between the true data generating process, the assumed predictive model and the scoring rule. Notably, we show that only when a predictive model is sufficiently compatible with the true process to allow a particular score criterion to reward what it is designed to reward, will this approach to forecasting reap benefits. Subject to this compatibility, however, the superiority of the optimal forecast will be greater, the greater is the degree of misspecification. We explore these issues under a range of different scenarios, and using both artificially simulated and empirical data.

Submitted 20 September, 2020; originally announced September 2020.

arXiv:2008.04099 [pdf, ps, other]

Robust Approximate Bayesian Computation: An Adjustment Approach

Authors: David T. Frazier, Christopher Drovandi, Ruben Loaiza-Maya

Abstract: We propose a novel approach to approximate Bayesian computation (ABC) that seeks to cater for possible misspecification of the assumed model. This new approach can be equally applied to rejection-based ABC and to popular regression adjustment ABC. We demonstrate that this new approach mitigates the poor performance of regression adjusted ABC that can eventuate when the model is misspecified. In addition, this new adjustment approach allows us to detect which features of the observed data cannot be reliably reproduced by the assumed model. A series of simulated and empirical examples illustrate this new approach.

Submitted 7 August, 2020; originally announced August 2020.

Comments: arXiv admin note: text overlap with arXiv:1904.04551

arXiv:2006.14126 [pdf, other]

Robust and Efficient Approximate Bayesian Computation: A Minimum Distance Approach

Authors: David T. Frazier

Abstract: In many instances, the application of approximate Bayesian methods is hampered by two practical features: 1) the requirement to project the data down to a low-dimensional summary, including the choice of this projection, which ultimately yields inefficient inference; 2) a possible lack of robustness to deviations from the underlying model structure. Motivated by these efficiency and robustness concerns, we construct a new Bayesian method that can deliver efficient estimators when the underlying model is well-specified, and which is simultaneously robust to certain forms of model misspecification. This new approach bypasses the calculation of summaries by considering a norm between empirical and simulated probability measures. For specific choices of the norm, we demonstrate that this approach can deliver point estimators that are as efficient as those obtained using exact Bayesian inference, while also simultaneously displaying robustness to deviations from the underlying model assumptions.

Submitted 24 June, 2020; originally announced June 2020.

arXiv:2006.10245 [pdf, other]

Approximate Maximum Likelihood for Complex Structural Models

Authors: Veronika Czellar, David T. Frazier, Eric Renault

Abstract: Indirect Inference (I-I) is a popular technique for estimating complex parametric models whose likelihood function is intractable; however, the statistical efficiency of I-I estimation is questionable. While the efficient method of moments, Gallant and Tauchen (1996), promises efficiency, the price to pay for this efficiency is a loss of parsimony and thereby a potential lack of robustness to model misspecification. This stands in contrast to simpler I-I estimation strategies, which are known to display less sensitivity to model misspecification precisely due to their focus on specific elements of the underlying structural model. In this research, we propose a new simulation-based approach that maintains the parsimony of I-I estimation, which is often critical in empirical applications, but can also deliver estimators that are nearly as efficient as maximum likelihood. This new approach is based on using a constrained approximation to the structural model, which ensures identification and can deliver estimators that are nearly efficient. We demonstrate this approach through several examples, and show that this approach can deliver estimators that are nearly as efficient as maximum likelihood, when feasible, but can be employed in many situations where maximum likelihood is infeasible.

Submitted 17 June, 2020; originally announced June 2020.

arXiv:2004.06425 [pdf, ps, other]

Computing Bayes: Bayesian Computation from 1763 to the 21st Century

Authors: Gael M. Martin, David T. Frazier, Christian P. Robert

Abstract: The Bayesian statistical paradigm uses the language of probability to express uncertainty about the phenomena that generate observed data. Probability distributions thus characterize Bayesian analysis, with the rules of probability used to transform prior probability distributions for all unknowns - parameters, latent variables, models - into posterior distributions, subsequent to the observation of data. Conducting Bayesian analysis requires the evaluation of integrals in which these probability distributions appear. Bayesian computation is all about evaluating such integrals in the typical case where no analytical solution exists. This paper takes the reader on a chronological tour of Bayesian computation over the past two and a half centuries. Beginning with the one-dimensional integral first confronted by Bayes in 1763, through to recent problems in which the unknowns number in the millions, we place all computational problems into a common framework, and describe all computational methods using a common notation. The aim is to help new researchers in particular - and more generally those interested in adopting a Bayesian approach to empirical work - make sense of the plethora of computational techniques that are now on offer; understand when and why different methods are useful; and see the links that do exist between them all.

Submitted 5 December, 2020; v1 submitted 14 April, 2020; originally announced April 2020.

Comments: 47 pages

arXiv:1912.12571 [pdf, other]

Focused Bayesian Prediction

Authors: Ruben Loaiza-Maya, Gael M. Martin, David T. Frazier

Abstract: We propose a new method for conducting Bayesian prediction that delivers accurate predictions without correctly specifying the unknown true data generating process. A prior is defined over a class of plausible predictive models. After observing data, we update the prior to a posterior over these models, via a criterion that captures a user-specified measure of predictive accuracy. Under regularity, this update yields posterior concentration onto the element of the predictive class that maximizes the expectation of the accuracy measure. In a series of simulation experiments and empirical examples we find notable gains in predictive accuracy relative to conventional likelihood-based prediction.

Submitted 21 August, 2020; v1 submitted 28 December, 2019; originally announced December 2019.

arXiv:1909.04857 [pdf, other]

Efficient Bayesian synthetic likelihood with whitening transformations

Authors: Jacob W. Priddle, Scott A. Sisson, David T. Frazier, Christopher Drovandi

Abstract: Likelihood-free methods are an established approach for performing approximate Bayesian inference for models with intractable likelihood functions. However, they can be computationally demanding. Bayesian synthetic likelihood (BSL) is a popular such method that approximates the likelihood function of the summary statistic with a known, tractable distribution -- typically Gaussian -- and then performs statistical inference using standard likelihood-based techniques. However, as the number of summary statistics grows, the number of model simulations required to accurately estimate the covariance matrix for this likelihood rapidly increases. This poses a significant challenge for the application of BSL, especially in cases where model simulation is expensive. In this article we propose whitening BSL (wBSL) -- an efficient BSL method that uses approximate whitening transformations to decorrelate the summary statistics at each algorithm iteration. We show empirically that this can reduce the number of model simulations required to implement BSL by more than an order of magnitude, without much loss of accuracy. We explore a range of whitening procedures and demonstrate the performance of wBSL on a range of simulated and real modelling scenarios from ecology and biology.

Submitted 31 January, 2020; v1 submitted 11 September, 2019; originally announced September 2019.

arXiv:1904.04551 [pdf, ps, other]

Robust Approximate Bayesian Inference with Synthetic Likelihood

Authors: David T. Frazier, Christopher Drovandi

Abstract: Bayesian synthetic likelihood (BSL) is now an established method for conducting approximate Bayesian inference in models where, due to the intractability of the likelihood function, exact Bayesian approaches are either infeasible or computationally too demanding. Implicit in the application of BSL is the assumption that the data generating process (DGP) can produce simulated summary statistics that capture the behaviour of the observed summary statistics. We demonstrate that if this compatibility between the actual and assumed DGP is not satisfied, i.e., if the model is misspecified, BSL can yield unreliable parameter inference. To circumvent this issue, we propose a new BSL approach that can detect the presence of model misspecification, and simultaneously deliver useful inferences even under significant model misspecification. Two simulated and two real data examples demonstrate the performance of this new approach to BSL, and document its superior accuracy over standard BSL when the assumed model is misspecified.

Submitted 11 June, 2020; v1 submitted 9 April, 2019; originally announced April 2019.

arXiv:1902.04827 [pdf, other]

Bayesian inference using synthetic likelihood: asymptotics and adjustments

Authors: David T. Frazier, David J. Nott, Christopher Drovandi, Robert Kohn

Abstract: Implementing Bayesian inference is often computationally challenging in applications involving complex models, and sometimes calculating the likelihood itself is difficult. Synthetic likelihood is one approach for carrying out inference when the likelihood is intractable, but it is straightforward to simulate from the model. The method constructs an approximate likelihood by taking a vector summary statistic as being multivariate normal, with the unknown mean and covariance matrix estimated by simulation for any given parameter value. Our article makes three contributions. The first shows that if the summary statistic satisfies a central limit theorem, then the synthetic likelihood posterior is asymptotically normal and yields credible sets with the correct level of frequentist coverage. This result is similar to that obtained by approximate Bayesian computation. The second contribution compares the computational efficiency of Bayesian synthetic likelihood and approximate Bayesian computation using the acceptance probability for rejection and importance sampling algorithms with a "good" proposal distribution. We show that Bayesian synthetic likelihood is computationally more efficient than approximate Bayesian computation, and behaves similarly to regression-adjusted approximate Bayesian computation. Based on the asymptotic results, the third contribution proposes using adjusted inference methods when a possibly misspecified form is assumed for the covariance matrix of the synthetic likelihood, such as diagonal or a factor model, to speed up the computation. The methodology is illustrated with some simulated and real examples.

Submitted 12 March, 2021; v1 submitted 13 February, 2019; originally announced February 2019.
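
The construction described in the abstract above (treat the summary statistic as Gaussian, and estimate its mean and covariance by simulation at each parameter value) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy Gaussian simulator, the two summaries (sample mean and sample variance), and the names `simulate_summaries` and `synthetic_loglik` are all hypothetical. To stay within the standard library, the sketch uses a diagonal covariance, which is one of the simplified (possibly misspecified) forms the abstract mentions; a full version would estimate the whole covariance matrix.

```python
import math
import random
import statistics


def simulate_summaries(theta, n_obs, rng):
    """Toy simulator (an assumption): data ~ Normal(theta, 1),
    summarised by (sample mean, sample variance)."""
    data = [rng.gauss(theta, 1.0) for _ in range(n_obs)]
    return (statistics.fmean(data), statistics.variance(data))


def synthetic_loglik(theta, observed_summary, n_sims, n_obs, rng):
    """Gaussian synthetic log-likelihood of the observed summary at theta,
    with mean and DIAGONAL covariance estimated from n_sims simulations."""
    sims = [simulate_summaries(theta, n_obs, rng) for _ in range(n_sims)]
    loglik = 0.0
    for j, s_obs in enumerate(observed_summary):
        column = [s[j] for s in sims]
        mu = statistics.fmean(column)       # simulated mean of summary j
        var = statistics.variance(column)   # simulated variance of summary j
        loglik += -0.5 * math.log(2.0 * math.pi * var)
        loglik += -0.5 * (s_obs - mu) ** 2 / var
    return loglik


rng = random.Random(1)
observed_summary = simulate_summaries(1.0, 200, rng)  # pseudo-observed, theta = 1

ll_near = synthetic_loglik(1.0, observed_summary, 200, 200, rng)
ll_far = synthetic_loglik(3.0, observed_summary, 200, 200, rng)
```

The value at the data-generating parameter (`ll_near`) should be far larger than the value at a distant parameter (`ll_far`). Because the estimate is itself random, repeated calls at the same `theta` return different values, which is why such an estimate is typically embedded in a sampler rather than optimised directly.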

arXiv:1712.07750 [pdf, ps, other]

Approximate Bayesian Forecasting

Authors: David T. Frazier, Worapree Maneesoonthorn, Gael M. Martin, Brendan P. M. McCabe

Abstract: Approximate Bayesian Computation (ABC) has become increasingly prominent as a method for conducting parameter inference in a range of challenging statistical problems, most notably those characterized by an intractable likelihood function. In this paper, we focus on the use of ABC not as a tool for parametric inference, but as a means of generating probabilistic forecasts; or for conducting what we refer to as 'approximate Bayesian forecasting'. The four key issues explored are: i) the link between the theoretical behavior of the ABC posterior and that of the ABC-based predictive; ii) the use of proper scoring rules to measure the (potential) loss of forecast accuracy when using an approximate rather than an exact predictive; iii) the performance of approximate Bayesian forecasting in state space models; and iv) the use of forecasting criteria to inform the selection of ABC summaries in empirical settings. The primary finding of the paper is that ABC can provide a computationally efficient means of generating probabilistic forecasts that are nearly identical to those produced by the exact predictive, and in a fraction of the time required to produce predictions via an exact method.

Submitted 19 June, 2018; v1 submitted 20 December, 2017; originally announced December 2017.

arXiv:1708.02365 [pdf, ps, other]

Indirect Inference with a Non-Smooth Criterion Function

Authors: David T. Frazier, Tatsushi Oka, Dan Zhu

Abstract: Indirect inference requires simulating realisations of endogenous variables from the model under study. When the endogenous variables are discontinuous functions of the model parameters, the resulting indirect inference criterion function is discontinuous and does not permit the use of derivative-based optimisation routines. Using a change of variables technique, we propose a novel simulation algorithm that alleviates the discontinuities inherent in such indirect inference criterion functions, and permits the application of derivative-based optimisation routines to estimate the unknown model parameters. Unlike competing approaches, this approach does not rely on kernel smoothing or bandwidth parameters. Several Monte Carlo examples that have featured in the literature on indirect inference with discontinuous outcomes illustrate the approach, and demonstrate the superior performance of this approach over existing alternatives.

Submitted 9 July, 2019; v1 submitted 8 August, 2017; originally announced August 2017.

Comments: This paper is a revision of arXiv:1708.02365 and supersedes the earlier arXiv paper "Derivative-Based Optimization with a Non-Smooth Simulated Criterion"

arXiv:1708.01974 [pdf, ps, other]

doi 10.1111/369--7412/20/82421

Model Misspecification in ABC: Consequences and Diagnostics

Authors: David T. Frazier, Christian P. Robert, Judith Rousseau

Abstract: We analyze the behavior of approximate Bayesian computation (ABC) when the model generating the simulated data differs from the actual data generating process; i.e., when the data simulator in ABC is misspecified. We demonstrate both theoretically and in simple, but practically relevant, examples that when the model is misspecified different versions of ABC can yield substantially different results. Our theoretical results demonstrate that even though the model is misspecified, under regularity conditions, the accept/reject ABC approach concentrates posterior mass on an appropriately defined pseudo-true parameter value. However, under model misspecification the ABC posterior does not yield credible sets with valid frequentist coverage and has non-standard asymptotic behavior. In addition, we examine the theoretical behavior of the popular local regression adjustment to ABC under model misspecification and demonstrate that this approach concentrates posterior mass on a completely different pseudo-true value than accept/reject ABC. Using our theoretical results, we suggest two approaches to diagnose model misspecification in ABC. All theoretical results and diagnostics are illustrated in a simple running example.

Submitted 9 July, 2019; v1 submitted 6 August, 2017; originally announced August 2017.

arXiv:1607.06903 [pdf, ps, other]

Asymptotic Properties of Approximate Bayesian Computation

Authors: David T. Frazier, Gael M. Martin, Christian P. Robert, Judith Rousseau

Abstract: Approximate Bayesian computation allows for statistical analysis in models with intractable likelihoods. In this paper we consider the asymptotic behaviour of the posterior distribution obtained by this method. We give general results on the rate at which the posterior distribution concentrates on sets containing the true parameter, its limiting shape, and the asymptotic distribution of the posterior mean. These results hold under given rates for the tolerance used within the method, mild regularity conditions on the summary statistics, and a condition linked to identification of the true parameters. Implications for practitioners are discussed.

Submitted 8 May, 2018; v1 submitted 23 July, 2016; originally announced July 2016.

Comments: This 31-page paper is a revised version of the paper, including supplementary material

arXiv:1607.06163 [pdf, ps, other]

Indirect Inference With(Out) Constraints

Authors: David T. Frazier, Eric Renault

Abstract: Indirect Inference (I-I) estimation of structural parameters $θ$ requires matching observed and simulated statistics, which are most often generated using an auxiliary model that depends on instrumental parameters $β$. The estimators of the instrumental parameters will encapsulate the statistical information used for inference about the structural parameters. As such, artificially constraining these parameters may restrict the ability of the auxiliary model to accurately replicate features in the structural data, which may lead to a range of issues, such as a loss of identification. However, in certain situations the parameters $β$ naturally come with a set of $q$ restrictions. Examples include settings where $β$ must be estimated subject to $q$ possibly strict inequality constraints $g(β) > 0$, such as when I-I is based on GARCH auxiliary models. In these settings we propose a novel I-I approach that uses appropriately modified unconstrained auxiliary statistics, which are simple to compute and always exist. We state the relevant asymptotic theory for this I-I approach without constraints and show that it can be reinterpreted as a standard implementation of I-I through a properly modified binding function. Several examples that have featured in the literature illustrate our approach.

Submitted 20 August, 2019; v1 submitted 20 July, 2016; originally announced July 2016.

arXiv:1604.07949 [pdf, other]

Auxiliary Likelihood-Based Approximate Bayesian Computation in State Space Models

Authors: Gael M. Martin, Brendan P. M. McCabe, David T. Frazier, Worapree Maneesoonthorn, Christian P. Robert

Abstract: A computationally simple approach to inference in state space models is proposed, using approximate Bayesian computation (ABC). ABC avoids evaluation of an intractable likelihood by matching summary statistics for the observed data with statistics computed from data simulated from the true process, based on parameter draws from the prior. Draws that produce a 'match' between observed and simulated summaries are retained, and used to estimate the inaccessible posterior. With no reduction to a low-dimensional set of sufficient statistics being possible in the state space setting, we define the summaries as the maximum of an auxiliary likelihood function, and thereby exploit the asymptotic sufficiency of this estimator for the auxiliary parameter vector. We derive conditions under which this approach - including a computationally efficient version based on the auxiliary score - achieves Bayesian consistency. To reduce the well-documented inaccuracy of ABC in multi-parameter settings, we propose the separate treatment of each parameter dimension using an integrated likelihood technique. Three stochastic volatility models for which exact Bayesian inference is either computationally challenging, or infeasible, are used for illustration. We demonstrate that our approach compares favorably against an extensive set of approximate and exact comparators. An empirical illustration completes the paper.

Submitted 2 December, 2018; v1 submitted 27 April, 2016; originally announced April 2016.

Comments: This paper is forthcoming at the Journal of Computational and Graphical Statistics. It also supersedes the earlier arXiv paper "Approximate Bayesian Computation in State Space Models" (arXiv:1409.8363)

arXiv:1508.05178 [pdf, other]

On Consistency of Approximate Bayesian Computation

Authors: David T. Frazier, Gael M. Martin, Christian P. Robert

Abstract: Approximate Bayesian computation (ABC) methods have become increasingly prevalent of late, facilitating as they do the analysis of intractable, or challenging, statistical problems. With the initial focus being primarily on the practical import of ABC, exploration of its formal statistical properties has begun to attract more attention. The aim of this paper is to establish general conditions under which ABC methods are Bayesian consistent, in the sense of producing draws that yield a degenerate posterior distribution at the true parameter (vector) asymptotically (in the sample size). We derive conditions under which arbitrary summary statistics yield consistent inference in the Bayesian sense, with these conditions linked to identification of the true parameters. Using simple illustrative examples that have featured in the literature, we demonstrate that identification, and hence consistency, is unlikely to be achieved in many cases, and propose a simple diagnostic procedure that can indicate the presence of this problem. We also formally explore the link between consistency and the use of auxiliary models within ABC, and illustrate the subsequent results in the Lotka-Volterra predator-prey model.

Submitted 21 August, 2015; originally announced August 2015.

MSC Class: 62F15; 62F12; 62C10

Showing 1–40 of 40 results for author: Frazier, D T