
Sequential Bayesian inference for stochastic epidemic models of cumulative incidence

Authors: Sam A. Whitaker, Andrew Golightly, Colin S. Gillespie, Theodore Kypraios

Abstract: Epidemics are inherently stochastic, and stochastic models provide an appropriate way to describe and analyse such phenomena. Given temporal incidence data consisting of, for example, the number of new infections or removals in a given time window, a continuous-time discrete-valued Markov process provides a natural description of the dynamics of each model component, typically taken to be the number of susceptible, exposed, infected or removed individuals. Fitting the SEIR model to time-course data is a challenging problem due to incomplete observations and, consequently, the intractability of the observed data likelihood. Whilst sampling-based inference schemes such as Markov chain Monte Carlo are routinely applied, their computational cost typically restricts analysis to data sets of no more than a few thousand infective cases. Instead, we develop a sequential inference scheme that makes use of a computationally cheap approximation of the most natural Markov process model. Crucially, the resulting model allows a tractable conditional parameter posterior which can be summarised in terms of a set of low dimensional statistics. This is used to rejuvenate parameter samples in conjunction with a novel bridge construct for propagating state trajectories conditional on the next observation of cumulative incidence. The resulting inference framework also allows for stochastic infection and reporting rates. We illustrate our approach using synthetic and real data applications.

Submitted 22 May, 2024; originally announced May 2024.

Comments: 27 pages

arXiv:2303.15371

Accelerating Bayesian inference for stochastic epidemic models using incidence data

Authors: Andrew Golightly, Laura E. Wadkin, Sam A. Whitaker, Andrew W. Baggaley, Nick G. Parker, Theodore Kypraios

Abstract: We consider the case of performing Bayesian inference for stochastic epidemic compartment models, using incomplete time course data consisting of incidence counts that are either the number of new infections or removals in time intervals of fixed length. We eschew the most natural Markov jump process representation for reasons of computational efficiency, and focus on a stochastic differential equation representation. This is further approximated to give a tractable Gaussian process, that is, the linear noise approximation (LNA). Unless the observation model linking the LNA to data is both linear and Gaussian, the observed data likelihood remains intractable. It is in this setting that we consider two approaches for marginalising over the latent process: a correlated pseudo-marginal method and analytic marginalisation via a Gaussian approximation of the observation model. We compare and contrast these approaches using synthetic data before applying the best performing method to real data consisting of removal incidence of oak processionary moth nests in Richmond Park, London. Our approach further allows comparison between various competing compartment models.

Submitted 7 August, 2023; v1 submitted 27 March, 2023; originally announced March 2023.

Comments: 25 pages

arXiv:2212.01202

Comparative Judgement Modeling to Map Forced Marriage at Local Levels

Authors: R. G. Seymour, A. Nyarko-Agyei, H. R. McCabe, K. Severn, T. Kypraios, D. Sirl, A. Taylor

Abstract: Forcing someone into marriage against their will is a violation of their human rights. In 2021, the county of Nottinghamshire, UK, launched a strategy to tackle forced marriage and violence against women and girls. However, accessing information about where victims are located in the county could compromise their safety, so it is not possible to develop interventions for different areas of the county. Comparative judgement studies offer a way to map the risk of human rights abuses without collecting data that could compromise victim safety. Current methods require studies to have a large number of participants, so we develop a comparative judgement model that provides a more flexible spatial modelling structure and a mechanism to schedule comparisons more effectively. The methods reduce the data collection burden on participants and make a comparative judgement study feasible with a small number of participants. Underpinning these methods is a latent variable representation that improves on the scalability of previous comparative judgement models. We use these methods to map the risk of forced marriage across Nottinghamshire thereby supporting the county's strategy for tackling violence against women and girls.

Submitted 25 March, 2023; v1 submitted 2 December, 2022; originally announced December 2022.

Comments: Submitted. 31 pages, 8 figures

MSC Class: 62G05; 62P25; 62-08

arXiv:2009.04137

doi 10.1111/rssc.12515

A Bayesian Nonparametric Analysis of the 2003 Outbreak of Highly Pathogenic Avian Influenza in the Netherlands

Authors: R. G. Seymour, T. Kypraios, P. D. O'Neill, T. J. Hagenaars

Abstract: Infectious diseases on farms pose both public and animal health risks, so understanding how they spread between farms is crucial for developing disease control strategies to prevent future outbreaks. We develop novel Bayesian nonparametric methodology to fit spatial stochastic transmission models in which the infection rate between any two farms is a function that depends on the distance between them, but without assuming a specified parametric form. Making nonparametric inference in this context is challenging since the likelihood function of the observed data is intractable because the underlying transmission process is unobserved. We adopt a fully Bayesian approach by assigning a transformed Gaussian Process prior distribution to the infection rate function, and then develop an efficient data augmentation Markov Chain Monte Carlo algorithm to perform Bayesian inference. We use the posterior predictive distribution to simulate the effect of different disease control methods and their economic impact. We analyse a large outbreak of Avian Influenza in the Netherlands and infer the between-farm infection rate, as well as the unknown infection status of farms which were pre-emptively culled. We use our results to analyse ring-culling strategies, and conclude that although effective, ring-culling has limited impact in high density areas.

Submitted 23 August, 2021; v1 submitted 9 September, 2020; originally announced September 2020.

Comments: 24 pages, 4 figures

MSC Class: 62G99 (Primary) 62P10; 62M20 (Secondary)

arXiv:1904.08356

Scalable Bayesian Inference for Population Markov Jump Processes

Authors: Iker Perez, Theodore Kypraios

Abstract: Bayesian inference for Markov jump processes (MJPs) where available observations relate to either system states or jumps typically relies on data-augmentation Markov Chain Monte Carlo. State-of-the-art developments involve representing MJP paths with auxiliary candidate jump times that are later thinned. However, these algorithms are i) unfeasible in situations involving large or infinite capacity systems and ii) not amenable for all observation types. In this paper we establish and present a general data-augmentation framework for population MJPs based on uniformized representations of the underlying non-stationary jump processes. This leads to multiple novel MCMC samplers which enable exact (in the Monte Carlo sense) inference tasks for model parameters. We show that proposed samplers outperform existing popular approaches, and offer substantial efficiency gains in applications to partially observed stochastic epidemics, immigration processes and predator-prey dynamical systems.

Submitted 17 April, 2019; originally announced April 2019.

arXiv:1806.02458

On Bayesian inferential tasks with infinite-state jump processes: efficient data augmentation

Authors: Iker Perez, Lax Chan, Mercedes Torres Torres, James Goulding, Theodore Kypraios

Abstract: Advances in sampling schemes for Markov jump processes have recently enabled multiple inferential tasks. However, in statistical and machine learning applications, we often require that these continuous-time models find support on structured and infinite state spaces. In these cases, exact sampling may only be achieved by often inefficient particle filtering procedures, and rapidly augmenting observed datasets remains a significant challenge. Here, we build on the principles of uniformization and present a tractable framework to address this problem, which greatly improves the efficiency of existing state-of-the-art methods commonly used in small finite-state systems, and further scales their use to infinite-state scenarios. We capitalize on the marginal role of variable subsets in a model hierarchy during the process jumps, and describe an algorithm that relies on measurable mappings between pairs of states and carefully designed sets of synthetic jump observations. The proposed method enables the efficient integration of slice sampling techniques and it can overcome the existing computational bottleneck. We offer evidence by means of experiments addressing inference and clustering tasks on both simulated and real data sets.

Submitted 6 June, 2018; originally announced June 2018.

arXiv:1710.04977

Bayes factors for partially observed stochastic epidemic models

Authors: Muteb Alharthi, Theodore Kypraios, Philip D. O'Neill

Abstract: We consider the problem of model choice for stochastic epidemic models given partial observation of a disease outbreak through time. Our main focus is on the use of Bayes factors. Although Bayes factors have appeared in the epidemic modelling literature before, they can be hard to compute and little attention has been given to fundamental questions concerning their utility. In this paper we derive analytic expressions for Bayes factors given complete observation through time, which suggest practical guidelines for model choice problems. We extend the power posterior method for computing Bayes factors so as to account for missing data and apply this approach to partially observed epidemics. For comparison, we also explore the use of a deviance information criterion for missing data scenarios. The methods are illustrated via examples involving both simulated and real data.

Submitted 13 October, 2017; originally announced October 2017.

arXiv:1706.02940

Bayesian nonparametrics for stochastic epidemic models

Authors: Theodore Kypraios, Philip D. O'Neill

Abstract: The vast majority of models for the spread of communicable diseases are parametric in nature and involve underlying assumptions about how the disease spreads through a population. In this article we consider the use of Bayesian nonparametric approaches to analysing data from disease outbreaks. Specifically we focus on methods for estimating the infection process in simple models under the assumption that this process has an explicit time-dependence.

Submitted 9 June, 2017; originally announced June 2017.

arXiv:1704.02791

Efficient SMC$^2$ schemes for stochastic kinetic models

Authors: Andrew Golightly, Theodore Kypraios

Abstract: Fitting stochastic kinetic models represented by Markov jump processes within the Bayesian paradigm is complicated by the intractability of the observed data likelihood. There has therefore been considerable attention given to the design of pseudo-marginal Markov chain Monte Carlo algorithms for such models. However, these methods are typically computationally intensive, often require careful tuning and must be restarted from scratch upon receipt of new observations. Sequential Monte Carlo (SMC) methods on the other hand aim to efficiently reuse posterior samples at each time point. Despite their appeal, applying SMC schemes in scenarios with both dynamic states and static parameters is made difficult by the problem of particle degeneracy. A principled approach for overcoming this problem is to move each parameter particle through a Metropolis-Hastings kernel that leaves the target invariant. This rejuvenation step is key to a recently proposed SMC$^2$ algorithm, which can be seen as the pseudo-marginal analogue of an idealised scheme known as iterated batch importance sampling. Computing the parameter weights in SMC$^2$ requires running a particle filter over dynamic states to unbiasedly estimate the intractable observed data likelihood contributions at each time point. In this paper, we propose to use an auxiliary particle filter inside the SMC$^2$ scheme. Our method uses two recently proposed constructs for sampling conditioned jump processes and we find that the resulting inference schemes typically require fewer state particles than when using a simple bootstrap filter. Using two applications, we compare the performance of the proposed approach with various competing methods, including two global MCMC schemes.

Submitted 3 August, 2017; v1 submitted 10 April, 2017; originally announced April 2017.

Comments: 22 pages

arXiv:1703.03475

Auxiliary Variables for Bayesian Inference in Multi-Class Queueing Networks

Authors: Iker Perez, David Hodge, Theodore Kypraios

Abstract: Queue networks describe complex stochastic systems of both theoretical and practical interest. They provide the means to assess alterations, diagnose poor performance and evaluate robustness across sets of interconnected resources. In the present paper, we focus on the underlying continuous-time Markov chains induced by these networks, and we present a flexible method for drawing parameter inference in multi-class Markovian cases with switching and different service disciplines. The approach is directed towards the inferential problem with missing data and introduces a slice sampling technique with mappings to the measurable space of task transitions between service stations. The method deals with time and tractability issues, can handle prior system knowledge and overcomes common restrictions on service rates across existing inferential frameworks. Finally, the proposed algorithm is validated on synthetic data and applied to a real data set, obtained from a service delivery tasking tool implemented in two university hospitals.

Submitted 1 November, 2017; v1 submitted 9 March, 2017; originally announced March 2017.

arXiv:1611.02492

A rare event approach to high dimensional Approximate Bayesian computation

Authors: Dennis Prangle, Richard G. Everitt, Theodore Kypraios

Abstract: Approximate Bayesian computation (ABC) methods permit approximate inference for intractable likelihoods when it is possible to simulate from the model. However they perform poorly for high dimensional data, and in practice must usually be used in conjunction with dimension reduction methods, resulting in a loss of accuracy which is hard to quantify or control. We propose a new ABC method for high dimensional data based on rare event methods which we refer to as RE-ABC. This uses a latent variable representation of the model. For a given parameter value, we estimate the probability of the rare event that the latent variables correspond to data roughly consistent with the observations. This is performed using sequential Monte Carlo and slice sampling to systematically search the space of latent variables. In contrast standard ABC can be viewed as using a more naive Monte Carlo estimate. We use our rare event probability estimator as a likelihood estimate within the pseudo-marginal Metropolis-Hastings algorithm for parameter inference. We provide asymptotics showing that RE-ABC has a lower computational cost for high dimensional data than standard ABC methods. We also illustrate our approach empirically, on a Gaussian distribution and an application in infectious disease modelling.

Submitted 4 April, 2017; v1 submitted 8 November, 2016; originally announced November 2016.

Comments: Supplementary material at end of pdf

arXiv:1605.07924

Modelling and Bayesian analysis of the Abakaliki Smallpox Data

Authors: Jessica E. Stockdale, Theodore Kypraios, Philip D. O'Neill

Abstract: The celebrated Abakaliki smallpox data have appeared numerous times in the epidemic modelling literature, but in almost all cases only a specific subset of the data is considered. There is one previous analysis of the full data set, but this relies on approximation methods to derive a likelihood. The data themselves continue to be of interest due to concerns about the possible re-emergence of smallpox as a bioterrorism weapon. We present the first full Bayesian analysis using data-augmentation Markov chain Monte Carlo methods which avoid the need for likelihood approximations. Results include estimates of basic model parameters as well as reproduction numbers and the likely path of infection. Model assessment is carried out using simulation-based methods.

Submitted 25 May, 2016; originally announced May 2016.

arXiv:1602.04721

Evaluating hospital infection control measures for antimicrobial-resistant pathogens using stochastic transmission models: application to Vancomycin-Resistant Enterococci in intensive care units

Authors: Yinghui Wei, Theodore Kypraios, Philip D. O'Neill, Susan S. Huang, Sheryl L. Rifas-Shiman, Ben S. Cooper

Abstract: Nosocomial pathogens such as Methicillin-Resistant Staphylococcus aureus (MRSA) and Vancomycin-resistant Enterococci (VRE) are the cause of significant morbidity and mortality among hospital patients. It is important to be able to assess the efficacy of control measures using data on patient outcomes. In this paper we describe methods for analysing such data using patient-level stochastic models which seek to describe the underlying unobserved process of transmission. The methods are applied to detailed longitudinal patient-level data on VRE from a study in a US hospital with eight intensive care units (ICUs). The data comprise admission and discharge dates, dates and results of screening tests, and dates during which precautionary measures were in place for each patient during the study period. Results include estimates of the efficacy of the control measures, the proportion of unobserved patients colonized with VRE and the proportion of patients colonized on admission.

Submitted 15 February, 2016; originally announced February 2016.

arXiv:1510.03229

doi 10.1088/1367-2630/18/4/043018

Statistically efficient tomography of low rank states with incomplete measurements

Authors: Anirudh Acharya, Theodore Kypraios, Madalin Guta

Abstract: The construction of physically relevant low dimensional state models, and the design of appropriate measurements are key issues in tackling quantum state tomography for large dimensional systems. We consider the statistical problem of estimating low rank states in the set-up of multiple ions tomography, and investigate how the estimation error behaves with a reduction in the number of measurement settings, compared with the standard ion tomography setup. We present extensive simulation results showing that the error is robust with respect to the choice of states of a given rank, the random selection of settings, and that the number of settings can be significantly reduced with only a negligible increase in error. We present an argument to explain these findings based on a concentration inequality for the Fisher information matrix. In the more general setup of random basis measurements we use this argument to show that for certain rank $r$ states it suffices to measure in $O(r\log d)$ bases to achieve the average Fisher information over all bases. We present numerical evidence for states up to 8 atoms, supporting a conjecture on a lower bound for the Fisher information which, if true, would imply a similar behaviour in the case of Pauli bases. The relation to similar problems in compressed sensing is also discussed.

Submitted 23 October, 2015; v1 submitted 12 October, 2015; originally announced October 2015.

Comments: 19 pages, 6 figures ; V2: updated figure 5, added references, changed title, updated abstract

Journal ref: New Journal of Physics, Volume 18, April 2016

arXiv:1411.7888

Bayesian model choice via mixture distributions with application to epidemics and population process models

Authors: Philip D. O'Neill, Theodore Kypraios

Abstract: We describe a new method for evaluating Bayes factors. The key idea is to introduce a hypermodel in which the competing models are components of a mixture distribution. Inference for the mixing probabilities then yields estimates of the Bayes factors. Our motivation is the setting where the observed data are a partially observed realisation of a stochastic population process, although the methods have far wider applicability. The methods allow for missing data and for parameters to be shared between models. Illustrative examples including epidemics, population processes and regression models are given, showing that the methods are competitive compared to existing approaches.

Submitted 14 February, 2016; v1 submitted 28 November, 2014; originally announced November 2014.

arXiv:1411.2624

Bayesian Non-Parametric Inference for Infectious Disease Data

Authors: Edward S. Knock, Theodore Kypraios

Abstract: We propose a framework for Bayesian non-parametric estimation of the rate at which new infections occur assuming that the epidemic is partially observed. The developed methodology relies on modelling the rate at which new infections occur as a function which only depends on time. Two different types of prior distributions are proposed namely using step-functions and B-splines. The methodology is illustrated using both simulated and real datasets and we show that certain aspects of the epidemic such as seasonality and super-spreading events are picked up without having to explicitly incorporate them into a parametric model.

Submitted 15 December, 2014; v1 submitted 10 November, 2014; originally announced November 2014.

arXiv:1401.2894

Exact Bayesian Inference for the Bingham Distribution

Authors: Christopher J. Fallaize, Theodore Kypraios

Abstract: This paper is concerned with making Bayesian inference from data that are assumed to be drawn from a Bingham distribution. A barrier to the Bayesian approach is the parameter-dependent normalising constant of the Bingham distribution, which, even when it can be evaluated or accurately approximated, would have to be calculated at each iteration of an MCMC scheme, thereby greatly increasing the computational burden. We propose a method which enables exact (in Monte Carlo sense) Bayesian inference for the unknown parameters of the Bingham distribution by completely avoiding the need to evaluate this constant. We apply the method to simulated and real data, and illustrate that it is simpler to implement, faster, and performs better than an alternative algorithm that has recently been proposed in the literature.

Submitted 13 January, 2014; originally announced January 2014.

arXiv:1401.1749

doi 10.1214/15-AOAS898

Reconstructing transmission trees for communicable diseases using densely sampled genetic data

Authors: Colin J. Worby, Philip D. O'Neill, Theodore Kypraios, Julie V. Robotham, Daniela De Angelis, Edward J. P. Cartwright, Sharon J. Peacock, Ben S. Cooper

Abstract: Whole genome sequencing of pathogens from multiple hosts in an epidemic offers the potential to investigate who infected whom with unparalleled resolution, potentially yielding important insights into disease dynamics and the impact of control measures. We considered disease outbreaks in a setting with dense genomic sampling, and formulated stochastic epidemic models to investigate person-to-person transmission, based on observed genomic and epidemiological data. We constructed models in which the genetic distance between sampled genotypes depends on the epidemiological relationship between the hosts. A data augmented Markov chain Monte Carlo algorithm was used to sample over the transmission trees, providing a posterior probability for any given transmission route. We investigated the predictive performance of our methodology using simulated data, demonstrating high sensitivity and specificity, particularly for rapidly mutating pathogens with low transmissibility. We then analyzed data collected during an outbreak of methicillin-resistant Staphylococcus aureus in a hospital, identifying probable transmission routes and estimating epidemiological parameters. Our approach overcomes limitations of previous methods, providing a framework with the flexibility to allow for unobserved infection times, multiple independent introductions of the pathogen, and within-host genetic diversity, as well as allowing forward simulation.

Submitted 6 December, 2015; v1 submitted 8 January, 2014; originally announced January 2014.

Journal ref: Ann. Appl. Stat. 10 (2016) 395-417

arXiv:1311.4091 [pdf, other]

doi 10.1088/1751-8113/47/41/415302

Maximum likelihood versus likelihood-free quantum system identification in the atom maser

Authors: Catalin Catana, Theodore Kypraios, Madalin Guta

Abstract: We consider the system identification problem of estimating a dynamical parameter of a Markovian quantum open system (the atom maser), by performing continuous time measurements in the system's output (outgoing atoms). Two estimation methods are investigated and compared. On the one hand, the maximum likelihood estimator (MLE) takes into account the full measurement data and is asymptotically optimal in terms of its mean square error. On the other hand, the `likelihood-free' method of approximate Bayesian computation (ABC) produces an approximation of the posterior distribution for a given set of summary statistics, by sampling trajectories at different parameter values and comparing them with the measurement data via chosen statistics. Our analysis is performed on the atom maser model, which exhibits interesting features such as bistability and dynamical phase transitions, and has connections with the classical theory of hidden Markov processes. Building on previous results which showed that atom counts are poor statistics for certain values of the Rabi angle, we apply MLE to the full measurement data and estimate its Fisher information. We then select several correlation statistics such as waiting times, distribution of successive identical detections, and use them as input of the ABC algorithm. The resulting posterior distribution follows closely the data likelihood, showing that the selected statistics contain `most' statistical information about the Rabi angle.

Submitted 16 November, 2013; originally announced November 2013.

Comments: 25 pages, 14 figures

arXiv:1301.2975 [pdf, ps, other]

Fast Approximate Bayesian Computation for discretely observed Markov models using a factorised posterior distribution

Authors: Simon R. White, Theodore Kypraios, Simon P. Preston

Abstract: Many modern statistical applications involve inference for complicated stochastic models for which the likelihood function is difficult or even impossible to calculate, and hence conventional likelihood-based inferential techniques cannot be used. In such settings, Bayesian inference can be performed using Approximate Bayesian Computation (ABC). However, in spite of many recent developments to ABC methodology, in many applications the computational cost of ABC necessitates the choice of summary statistics and tolerances that can potentially severely bias the estimate of the posterior. We propose a new "piecewise" ABC approach suitable for discretely observed Markov models that involves writing the posterior density of the parameters as a product of factors, each a function of only a subset of the data, and then using ABC within each factor. The approach has the advantage of side-stepping the need to choose a summary statistic and it enables a stringent tolerance to be set, making the posterior "less approximate". We investigate two methods for estimating the posterior density based on ABC samples for each of the factors: the first is to use a Gaussian approximation for each factor, and the second is to use a kernel density estimate. Both methods have their merits. The Gaussian approximation is simple, fast, and probably adequate for many applications. On the other hand, using instead a kernel density estimate has the benefit of consistently estimating the true ABC posterior as the number of ABC samples tends to infinity. We illustrate the piecewise ABC approach for three examples; in each case, the approach enables "exact matching" between simulations and data and offers fast and accurate inference.

Submitted 28 May, 2013; v1 submitted 14 January, 2013; originally announced January 2013.

arXiv:1206.4032 [pdf, other]

doi 10.1088/1367-2630/14/10/105002

Rank-based model selection for multiple ions quantum tomography

Authors: Madalin Guta, Theodore Kypraios, Ian Dryden

Abstract: The statistical analysis of measurement data has become a key component of many quantum engineering experiments. As standard full state tomography becomes unfeasible for large dimensional quantum systems, one needs to exploit prior information and the "sparsity" properties of the experimental state in order to reduce the dimensionality of the estimation problem. In this paper we propose model selection as a general principle for finding the simplest, or most parsimonious explanation of the data, by fitting different models and choosing the estimator with the best trade-off between likelihood fit and model complexity. We apply two well established model selection methods -- the Akaike information criterion (AIC) and the Bayesian information criterion (BIC) -- to models consisting of states of fixed rank and datasets such as are currently produced in multiple ions experiments. We test the performance of AIC and BIC on randomly chosen low rank states of 4 ions, and study the dependence of the selected rank with the number of measurement repetitions for one ion states. We then apply the methods to real data from a 4 ions experiment aimed at creating a Smolin state of rank 4. The two methods indicate that the optimal model for describing the data lies between ranks 6 and 9, and the Pearson $\chi^{2}$ test is applied to validate this conclusion. Additionally we find that the mean square error of the maximum likelihood estimator for pure states is close to that of the optimal over all possible measurements.

Submitted 18 June, 2012; originally announced June 2012.

Comments: 24 pages, 6 figures, 3 tables

Journal ref: New J. Phys. (14) 105002, 2012

arXiv:0908.2066 [pdf, ps, other]

Statistical inference for stochastic epidemic models with three levels of mixing

Authors: Tom Britton, Theodore Kypraios, Philip O'Neill

Abstract: A stochastic epidemic model is defined in which each individual belongs to a household, a secondary grouping (typically school or workplace) and also the community as a whole. Moreover, infectious contacts take place in these three settings according to potentially different rates. For this model we consider how different kinds of data can be used to estimate the infection rate parameters with a view to understanding what can and cannot be inferred, and with what precision. Among other things we find that temporal data can be of considerable inferential benefit compared to final size data, that the degree of heterogeneity in the data can have a considerable effect on inference for non-household transmission, and that inferences can be materially different from those obtained from a model with two levels of mixing. Keywords: Basic reproduction number, Bayesian inference, Epidemic model, Infectious disease data, Markov chain Monte Carlo, Networks.

Submitted 14 August, 2009; originally announced August 2009.

Showing 1–22 of 22 results for author: Kypraios, T