Showing 1–50 of 81 results for author: Fearnhead, P

  1. arXiv:2406.19051  [pdf, other]

    stat.ML cs.LG stat.CO

    Stochastic Gradient Piecewise Deterministic Monte Carlo Samplers

    Authors: Paul Fearnhead, Sebastiano Grazzi, Chris Nemeth, Gareth O. Roberts

    Abstract: Recent work has suggested using Monte Carlo methods based on piecewise deterministic Markov processes (PDMPs) to sample from target distributions of interest. PDMPs are non-reversible continuous-time processes endowed with momentum, and hence can mix better than standard reversible MCMC samplers. Furthermore, they can incorporate exact sub-sampling schemes which only require access to a single (ra…

    Submitted 27 June, 2024; originally announced June 2024.

    MSC Class: 62-08 62F15

  2. arXiv:2406.11664  [pdf, other]

    stat.ML cs.LG stat.CO

    Diffusion Generative Modelling for Divide-and-Conquer MCMC

    Authors: C. Trojan, P. Fearnhead, C. Nemeth

    Abstract: Divide-and-conquer MCMC is a strategy for parallelising Markov Chain Monte Carlo sampling by running independent samplers on disjoint subsets of a dataset and merging their output. An ongoing challenge in the literature is to efficiently perform this merging without imposing distributional assumptions on the posteriors. We propose using diffusion generative modelling to fit density approximations…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: 16 pages, 5 figures

  3. arXiv:2405.15670  [pdf, other]

    stat.ME

    Post-selection inference for quantifying uncertainty in changes in variance

    Authors: Rachel Carrington, Paul Fearnhead

    Abstract: Quantifying uncertainty in detected changepoints is an important problem. However it is challenging as the naive approach would use the data twice, first to detect the changes, and then to test them. This will bias the test, and can lead to anti-conservative p-values. One approach to avoid this is to use ideas from post-selection inference, which conditions on the information in the data used to c…

    Submitted 24 May, 2024; originally announced May 2024.

    Comments: 25 pages, 12 figures, plus 6 pages supplementary material

  4. arXiv:2405.06796  [pdf, other]

    stat.ME math.ST stat.CO

    The Multiple Change-in-Gaussian-Mean Problem

    Authors: Paul Fearnhead, Piotr Fryzlewicz

    Abstract: A manuscript version of the chapter "The Multiple Change-in-Gaussian-Mean Problem" from the book "Change-Point Detection and Data Segmentation" by Fearnhead and Fryzlewicz, currently in preparation. All R code and data to accompany this chapter and the book are gradually being made available through https://github.com/pfryz/cpdds.

    Submitted 10 May, 2024; originally announced May 2024.

    Comments: This is a draft chapter from the forthcoming book "Change-Point Detection and Data Segmentation" by Paul Fearnhead and Piotr Fryzlewicz. Comments, particularly regarding the history of work in this area, are welcome

  5. arXiv:2403.18549  [pdf, other]

    stat.ME

    A communication-efficient, online changepoint detection method for monitoring distributed sensor networks

    Authors: Ziyang Yang, Idris A. Eckley, Paul Fearnhead

    Abstract: We consider the challenge of efficiently detecting changes within a network of sensors, where we also need to minimise communication between sensors and the cloud. We propose an online, communication-efficient method to detect such changes. The procedure works by performing likelihood ratio tests at each time point, and two thresholds are chosen to filter unimportant test statistics and make decis…

    Submitted 9 April, 2024; v1 submitted 27 March, 2024; originally announced March 2024.

    Comments: 36 pages, 8 figures, 5 tables

  6. arXiv:2311.01174  [pdf, other]

    stat.CO

    Online Multivariate Changepoint Detection: Leveraging Links With Computational Geometry

    Authors: Liudmila Pishchagina, Gaetano Romano, Paul Fearnhead, Vincent Runge, Guillem Rigaill

    Abstract: The increasing volume of data streams poses significant computational challenges for detecting changepoints online. Likelihood-based methods are effective, but their straightforward implementation becomes impractical online. We develop two online algorithms that exactly calculate the likelihood ratio test for a single changepoint in p-dimensional data streams by leveraging fascinating connections…

    Submitted 2 November, 2023; originally announced November 2023.

    Comments: 31 pages, 15 figures

  7. arXiv:2310.10761  [pdf, other]

    stat.ME

    Simulation Based Composite Likelihood

    Authors: Lorenzo Rimella, Chris Jewell, Paul Fearnhead

    Abstract: Inference for high-dimensional hidden Markov models is challenging due to the exponential-in-dimension computational cost of the forward algorithm. To address this issue, we introduce an innovative composite likelihood approach called "Simulation Based Composite Likelihood" (SimBa-CL). With SimBa-CL, we approximate the likelihood by the product of its marginals, which we estimate using Monte Carlo…

    Submitted 16 October, 2023; originally announced October 2023.

  8. arXiv:2302.04743  [pdf, other]

    stat.CO stat.ME stat.ML

    A Constant-per-Iteration Likelihood Ratio Test for Online Changepoint Detection for Exponential Family Models

    Authors: Kes Ward, Gaetano Romano, Idris Eckley, Paul Fearnhead

    Abstract: Online changepoint detection algorithms that are based on likelihood-ratio tests have been shown to have excellent statistical properties. However, a simple online implementation is computationally infeasible as, at time $T$, it involves considering $O(T)$ possible locations for the change. Recently, the FOCuS algorithm has been introduced for detecting changes in mean in Gaussian data that decrea…

    Submitted 9 February, 2023; originally announced February 2023.

  9. arXiv:2302.02718  [pdf, other]

    stat.ME stat.CO stat.ML

    A Log-Linear Non-Parametric Online Changepoint Detection Algorithm based on Functional Pruning

    Authors: Gaetano Romano, Idris A Eckley, Paul Fearnhead

    Abstract: Online changepoint detection aims to detect anomalies and changes in real-time in high-frequency data streams, sometimes with limited available computational resources. This is an important task that is rooted in many real-world applications, including and not limited to cybersecurity, medicine and astrophysics. While fast and efficient online algorithms have been recently introduced, these rely o…

    Submitted 11 January, 2024; v1 submitted 6 February, 2023; originally announced February 2023.

  10. arXiv:2301.05636  [pdf, other]

    stat.ME

    Improving Power by Conditioning on Less in Post-selection Inference for Changepoints

    Authors: Rachel Carrington, Paul Fearnhead

    Abstract: Post-selection inference has recently been proposed as a way of quantifying uncertainty about detected changepoints. The idea is to run a changepoint detection algorithm, and then re-use the same data to perform a test for a change near each of the detected changes. By defining the p-value for the test appropriately, so that it is conditional on the information used to choose the test, this approa…

    Submitted 17 January, 2024; v1 submitted 13 January, 2023; originally announced January 2023.

    Comments: 32 pages, 14 figures

  11. arXiv:2211.03860  [pdf, ps, other]

    stat.ML cs.LG stat.ME

    Automatic Change-Point Detection in Time Series via Deep Learning

    Authors: Jie Li, Paul Fearnhead, Piotr Fryzlewicz, Tengyao Wang

    Abstract: Detecting change-points in data is challenging because of the range of possible types of change and types of behaviour of data when there is no change. Statistically efficient methods for detecting a change will depend on both of these features, and it can be difficult for a practitioner to develop an appropriate detection method for their application of interest. We show how to automatically gene…

    Submitted 10 October, 2023; v1 submitted 7 November, 2022; originally announced November 2022.

    Comments: 33 pages, 15 figures and 3 tables

  12. arXiv:2210.16189  [pdf, ps, other]

    stat.ML cs.LG stat.CO stat.ME

    Preferential Subsampling for Stochastic Gradient Langevin Dynamics

    Authors: Srshti Putcha, Christopher Nemeth, Paul Fearnhead

    Abstract: Stochastic gradient MCMC (SGMCMC) offers a scalable alternative to traditional MCMC, by constructing an unbiased estimate of the gradient of the log-posterior with a small, uniformly-weighted subsample of the data. While efficient to compute, the resulting gradient estimator may exhibit a high variance and impact sampler performance. The problem of variance control has been traditionally addressed…

    Submitted 8 July, 2023; v1 submitted 28 October, 2022; originally announced October 2022.

    Comments: 22 pages, 5 figures. Appeared in the proceedings of AISTATS 2023

  13. arXiv:2210.07066  [pdf, other]

    stat.ME math.ST

    Detecting A Single Change-point

    Authors: Paul Fearnhead, Piotr Fryzlewicz

    Abstract: This chapter overviews some of the work on detecting and estimating the location of a single change. We first consider the most common change-point problem, namely that of detecting a change in mean, before looking at extensions to detecting other types of change. The intuition from the problem of detecting a single change-point is helpful for understanding the variety of methods for detecting mul…

    Submitted 13 October, 2022; originally announced October 2022.

    Comments: This is a draft chapter from the forthcoming book "Change-Point Detection and Data Segmentation" by Paul Fearnhead and Piotr Fryzlewicz. Comments, particularly regarding the history of work in this area, are welcome

  14. arXiv:2208.11331  [pdf, other]

    stat.AP

    Inference on Extended-Spectrum Beta-Lactamase Escherichia coli and Klebsiella pneumoniae data through SMC$^2$

    Authors: Lorenzo Rimella, Simon Alderton, Melodie Sammarro, Barry Rowlingson, Derek Cocker, Nick Feasey, Paul Fearnhead, Christopher Jewell

    Abstract: We propose a novel stochastic model for the spread of antimicrobial-resistant bacteria in a population, together with an efficient algorithm for fitting such a model to sample data. We introduce an individual-based model for the epidemic, with the state of the model determining which individuals are colonised by the bacteria. The transmission rate of the epidemic takes into account both individual…

    Submitted 24 August, 2022; originally announced August 2022.

  15. arXiv:2208.11009  [pdf, other]

    stat.CO stat.AP

    cpop: Detecting changes in piecewise-linear signals

    Authors: Paul Fearnhead, Daniel Grose

    Abstract: Changepoint detection is an important problem with applications across many application domains. There are many different types of changes that one may wish to detect, and a wide-range of algorithms and software for detecting them. However there are relatively few approaches for detecting changes-in-slope in the mean of a signal plus noise model. We describe the R package, cpop, available on the C…

    Submitted 23 August, 2022; originally announced August 2022.

  16. arXiv:2208.01494  [pdf, other]

    astro-ph.HE stat.CO

    Poisson-FOCuS: An efficient online method for detecting count bursts with application to gamma ray burst detection

    Authors: Kes Ward, Giuseppe Dilillo, Idris Eckley, Paul Fearnhead

    Abstract: Gamma-ray bursts are flashes of light from distant exploding stars. Cube satellites that monitor photons across different energy bands are used to detect these bursts. There is a need for computationally efficient algorithms, able to run using the limited computational resource onboard a cube satellite, that can detect when gamma-ray bursts occur. Current algorithms are based on monitoring photon…

    Submitted 2 August, 2022; originally announced August 2022.

  17. arXiv:2206.05161  [pdf, other]

    stat.ME

    Approximating optimal SMC proposal distributions in individual-based epidemic models

    Authors: Lorenzo Rimella, Christopher Jewell, Paul Fearnhead

    Abstract: Many epidemic models are naturally defined as individual-based models: where we track the state of each individual within a susceptible population. Inference for individual-based models is challenging due to the high-dimensional state-space of such models, which increases exponentially with population size. We consider sequential Monte Carlo algorithms for inference for individual-based epidemic m…

    Submitted 6 March, 2023; v1 submitted 10 June, 2022; originally announced June 2022.

  18. arXiv:2205.09559  [pdf, other]

    stat.ME stat.CO stat.ML

    Continuously-Tempered PDMP Samplers

    Authors: Matthew Sutton, Robert Salomone, Augustin Chevallier, Paul Fearnhead

    Abstract: New sampling algorithms based on simulating continuous-time stochastic processes called piece-wise deterministic Markov processes (PDMPs) have shown considerable promise. However, these methods can struggle to sample from multi-modal or heavy-tailed distributions. We show how tempering ideas can improve the mixing of PDMPs in such cases. We introduce an extended distribution defined over the state…

    Submitted 29 May, 2022; v1 submitted 19 May, 2022; originally announced May 2022.

  19. arXiv:2204.02724  [pdf, other]

    stat.ME

    High-dimensional time series segmentation via factor-adjusted vector autoregressive modelling

    Authors: Haeran Cho, Hyeyoung Maeng, Idris A. Eckley, Paul Fearnhead

    Abstract: Vector autoregressive (VAR) models are popularly adopted for modelling high-dimensional time series, and their piecewise extensions allow for structural changes in the data. In VAR modelling, the number of parameters grows quadratically with the dimensionality which necessitates the sparsity assumption in high dimensions. However, it is debatable whether such an assumption is adequate for handling…

    Submitted 20 January, 2023; v1 submitted 6 April, 2022; originally announced April 2022.

  20. arXiv:2202.09129  [pdf, other]

    stat.CO cs.LG

    Efficient computation of the volume of a polytope in high-dimensions using Piecewise Deterministic Markov Processes

    Authors: Augustin Chevallier, Frédéric Cazals, Paul Fearnhead

    Abstract: Computing the volume of a polytope in high dimensions is computationally challenging but has wide applications. Current state-of-the-art algorithms to compute such volumes rely on efficient sampling of a Gaussian distribution restricted to the polytope, using e.g. Hamiltonian Monte Carlo. We present a new sampling strategy that uses a Piecewise Deterministic Markov Process. Like Hamiltonian Monte…

    Submitted 18 February, 2022; originally announced February 2022.

    Report number: AISTATS 2022

  21. arXiv:2112.12897  [pdf, other]

    stat.ME

    Concave-Convex PDMP-based sampling

    Authors: Matthew Sutton, Paul Fearnhead

    Abstract: Recently non-reversible samplers based on simulating piecewise deterministic Markov processes (PDMPs) have shown potential for efficient sampling in Bayesian inference problems. However, there remains a lack of guidance on how to best implement these algorithms. If implemented poorly, the computational costs of simulating event times can out-weigh the statistical efficiency of the non-reversible d…

    Submitted 23 December, 2021; originally announced December 2021.

  22. arXiv:2111.05859  [pdf, other]

    math.ST math.PR stat.CO stat.ME

    PDMP Monte Carlo methods for piecewise-smooth densities

    Authors: Augustin Chevallier, Sam Power, Andi Q. Wang, Paul Fearnhead

    Abstract: There has been substantial interest in developing Markov chain Monte Carlo algorithms based on piecewise-deterministic Markov processes. However existing algorithms can only be used if the target distribution of interest is differentiable everywhere. The key to adapting these algorithms so that they can sample from densities with discontinuities is defining appropriate dynamics for the process…

    Submitted 10 November, 2021; originally announced November 2021.

  23. arXiv:2110.08205  [pdf, other]

    stat.ME stat.CO stat.ML

    Fast Online Changepoint Detection via Functional Pruning CUSUM statistics

    Authors: Gaetano Romano, Idris Eckley, Paul Fearnhead, Guillem Rigaill

    Abstract: Many modern applications of online changepoint detection require the ability to process high-frequency observations, sometimes with limited available computational resources. Online algorithms for detecting a change in mean often involve using a moving window, or specifying the expected size of change. Such choices affect which changes the algorithms have most power to detect. We introduce an algo…

    Submitted 27 July, 2022; v1 submitted 15 October, 2021; originally announced October 2021.

    Report number: 24(81)

    Journal ref: Journal of Machine Learning Research, 2023

  24. arXiv:2105.07538  [pdf, ps, other]

    stat.ME

    Collective anomaly detection in High-dimensional VAR Models

    Authors: Hyeyoung Maeng, Idris Eckley, Paul Fearnhead

    Abstract: There is increasing interest in detecting collective anomalies: potentially short periods of time where the features of data change before reverting back to normal behaviour. We propose a new method for detecting a collective anomaly in VAR models. Our focus is on situations where the change in the VAR coefficient matrix at an anomaly is sparse, i.e. a small number of entries of the VAR coefficien…

    Submitted 16 May, 2021; originally announced May 2021.

  25. arXiv:2011.03599  [pdf, other]

    stat.ME stat.AP

    A computationally efficient, high-dimensional multiple changepoint procedure with application to global terrorism incidence

    Authors: S. O. Tickle, I. A. Eckley, P. Fearnhead

    Abstract: Detecting changepoints in datasets with many variates is a data science challenge of increasing importance. Motivated by the problem of detecting changes in the incidence of terrorism from a global terrorism database, we propose a novel approach to multiple changepoint detection in multivariate time series. Our method, which we call SUBSET, is a model-based approach which uses a penalised likeliho…

    Submitted 26 March, 2021; v1 submitted 6 November, 2020; originally announced November 2020.

  26. arXiv:2010.11771  [pdf, other]

    stat.CO stat.ML

    Reversible Jump PDMP Samplers for Variable Selection

    Authors: Augustin Chevallier, Paul Fearnhead, Matthew Sutton

    Abstract: A new class of Markov chain Monte Carlo (MCMC) algorithms, based on simulating piecewise deterministic Markov processes (PDMPs), have recently shown great promise: they are non-reversible, can mix better than standard MCMC algorithms, and can use subsampling ideas to speed up computation in big data scenarios. However, current PDMP samplers can only sample from posterior densities that are differe…

    Submitted 22 October, 2020; originally announced October 2020.

    Comments: Code available from https://github.com/matt-sutton/rjpdmp

  27. arXiv:2010.09353  [pdf, other]

    stat.AP

    anomaly : Detection of Anomalous Structure in Time Series Data

    Authors: Alex Fisch, Daniel Grose, Idris A. Eckley, Paul Fearnhead, Lawrence Bardwell

    Abstract: One of the contemporary challenges in anomaly detection is the ability to detect, and differentiate between, both point and collective anomalies within a data sequence or time series. The anomaly package has been developed to provide users with a choice of anomaly detection methods and, in particular, provides an implementation of the recently proposed Collective And Point Anomaly family of anomal…

    Submitted 29 January, 2024; v1 submitted 19 October, 2020; originally announced October 2020.

    Comments: 24 pages, 6 figures. An R package that implements the methods discussed in the paper can be obtained from The Comprehensive R Archive Network (CRAN) via https://cran.r-project.org/web/packages/anomaly/index.html

  28. arXiv:2010.06937  [pdf, other]

    stat.ME stat.AP stat.CO

    Scalable changepoint and anomaly detection in cross-correlated data with an application to condition monitoring

    Authors: Martin Tveten, Idris A. Eckley, Paul Fearnhead

    Abstract: Motivated by a condition monitoring application arising from subsea engineering we derive a novel, scalable approach to detecting anomalous mean structure in a subset of correlated multivariate time series. Given the need to analyse such series efficiently we explore a computationally efficient approximation of the maximum likelihood solution to the resulting modelling framework, and develop a new…

    Submitted 31 March, 2021; v1 submitted 14 October, 2020; originally announced October 2020.

    Comments: 48 pages, 25 figures, 13 tables

  29. arXiv:2007.03238  [pdf, other]

    stat.ME stat.AP stat.CO stat.ML

    Innovative And Additive Outlier Robust Kalman Filtering With A Robust Particle Filter

    Authors: Alexander T. M. Fisch, Idris A. Eckley, P. Fearnhead

    Abstract: In this paper, we propose CE-BASS, a particle mixture Kalman filter which is robust to both innovative and additive outliers, and able to fully capture multi-modality in the distribution of the hidden state. Furthermore, the particle sampling approach re-samples past states, which enables CE-BASS to handle innovative outliers which are not immediately visible in the observations, such as trend cha…

    Submitted 7 July, 2020; originally announced July 2020.

  30. arXiv:2005.01379  [pdf, other]

    stat.ME stat.AP stat.CO

    Detecting Abrupt Changes in the Presence of Local Fluctuations and Autocorrelated Noise

    Authors: Gaetano Romano, Guillem Rigaill, Vincent Runge, Paul Fearnhead

    Abstract: Whilst there are a plethora of algorithms for detecting changes in mean in univariate time-series, almost all struggle in real applications where there is autocorrelated noise or where the mean fluctuates locally between the abrupt changes that one wishes to detect. In these cases, default implementations, which are often based on assumptions of a constant mean between changes and independent nois…

    Submitted 4 May, 2020; originally announced May 2020.

  31. arXiv:2002.03646  [pdf, other]

    stat.CO

    gfpop: an R Package for Univariate Graph-Constrained Change-Point Detection

    Authors: Vincent Runge, Toby Dylan Hocking, Gaetano Romano, Fatemeh Afghah, Paul Fearnhead, Guillem Rigaill

    Abstract: In a world with data that change rapidly and abruptly, it is important to detect those changes accurately. In this paper we describe an R package implementing a generalized version of an algorithm recently proposed by Hocking et al. [2020] for penalized maximum likelihood inference of constrained multiple change-point models. This algorithm can be used to pinpoint the precise locations of abrupt c…

    Submitted 11 April, 2022; v1 submitted 10 February, 2020; originally announced February 2020.

    MSC Class: 62M10; 60J22

  32. arXiv:2001.02883  [pdf, other]

    stat.ME

    Semi-automated simultaneous predictor selection for Regression-SARIMA models

    Authors: Aaron Lowther, Paul Fearnhead, Matthew Nunes, Kjeld Jensen

    Abstract: Deciding which predictors to use plays an integral role in deriving statistical models in a wide range of applications. Motivated by the challenges of predicting events across a telecommunications network, we propose a semi-automated, joint model-fitting and predictor selection procedure for linear regression models. Our approach can model and account for serial correlation in the regression resid…

    Submitted 9 January, 2020; originally announced January 2020.

    Comments: 18 pages, 12 figures

    MSC Class: 62M10

  33. arXiv:1911.01716  [pdf, other]

    math.ST stat.ME

    Consistency of a range of penalised cost approaches for detecting multiple changepoints

    Authors: Chao Zheng, Idris A. Eckley, Paul Fearnhead

    Abstract: A common approach to detect multiple changepoints is to minimise a measure of data fit plus a penalty that is linear in the number of changepoints. This paper shows that the general finite sample behaviour of such a method can be related to its behaviour when analysing data with either none or one changepoint. This results in simpler conditions for verifying whether the method will consistently es…

    Submitted 12 August, 2022; v1 submitted 5 November, 2019; originally announced November 2019.

  34. arXiv:1910.04291  [pdf, other]

    stat.ME

    Testing for a Change in Mean After Changepoint Detection

    Authors: Sean Jewell, Paul Fearnhead, Daniela Witten

    Abstract: While many methods are available to detect structural changes in a time series, few procedures are available to quantify the uncertainty of these estimates post-detection. In this work, we fill this gap by proposing a new framework to test the null hypothesis that there is no change in mean around an estimated changepoint. We further show that it is possible to efficiently carry out this framework…

    Submitted 14 April, 2021; v1 submitted 9 October, 2019; originally announced October 2019.

    Comments: Main text: 28 pages, 5 figures. Supplementary Materials: 15 pages, 4 figures

  35. arXiv:1909.01691  [pdf, other]

    stat.ME math.ST stat.CO stat.ML

    Subset Multivariate Collective And Point Anomaly Detection

    Authors: Alexander T M Fisch, Idris A Eckley, Paul Fearnhead

    Abstract: In recent years, there has been a growing interest in identifying anomalous structure within multivariate data streams. We consider the problem of detecting collective anomalies, corresponding to intervals where one or more of the data streams behaves anomalously. We first develop a test for a single collective anomaly that has power to simultaneously detect anomalies that are either rare, that is…

    Submitted 4 September, 2019; originally announced September 2019.

  36. arXiv:1908.06835  [pdf, ps, other]

    stat.CO stat.ME

    Evaluation of extremal properties of GARCH(p,q) processes

    Authors: Fabrizio Laurini, Paul Fearnhead, Jonathan A. Tawn

    Abstract: Generalized autoregressive conditionally heteroskedastic (GARCH) processes are widely used for modelling features commonly found in observed financial returns. The extremal properties of these processes are of considerable interest for market risk management. For the simplest GARCH(p,q) process, with max(p,q) = 1, all extremal features have been fully characterised. Although the marginal features…

    Submitted 19 August, 2019; originally announced August 2019.

  37. arXiv:1907.06986  [pdf, other]

    stat.CO stat.ML

    Stochastic gradient Markov chain Monte Carlo

    Authors: Christopher Nemeth, Paul Fearnhead

    Abstract: Markov chain Monte Carlo (MCMC) algorithms are generally regarded as the gold standard technique for Bayesian inference. They are theoretically well-understood and conceptually simple to apply in practice. The drawback of MCMC is that in general performing exact inference requires all of the data to be processed at each iteration of the algorithm. For large data sets, the computational cost of MCM…

    Submitted 16 July, 2019; originally announced July 2019.

  38. arXiv:1901.10568  [pdf, other]

    stat.ML cs.LG stat.CO

    Stochastic Gradient MCMC for Nonlinear State Space Models

    Authors: Christopher Aicher, Srshti Putcha, Christopher Nemeth, Paul Fearnhead, Emily B. Fox

    Abstract: State space models (SSMs) provide a flexible framework for modeling complex time series via a latent stochastic process. Inference for nonlinear, non-Gaussian SSMs is often tackled with particle methods that do not scale well to long time series. The challenge is two-fold: not only do computations scale linearly with time, as in the linear case, but particle filters additionally suffer from increa…

    Submitted 16 July, 2023; v1 submitted 29 January, 2019; originally announced January 2019.

    Comments: To appear in Bayesian Analysis

  39. arXiv:1810.03591  [pdf, other]

    stat.ME

    Parallelisation of a Common Changepoint Detection Method

    Authors: S. O. Tickle, I. A. Eckley, P. Fearnhead, K. Haynes

    Abstract: In recent years, various means of efficiently detecting changepoints in the univariate setting have been proposed, with one popular approach involving minimising a penalised cost function using dynamic programming. In some situations, these algorithms can have an expected computational cost that is linear in the number of data points; however, the worst case cost remains quadratic. We introduce tw…

    Submitted 8 October, 2018; originally announced October 2018.

  40. arXiv:1810.00117  [pdf, other]

    stat.CO

    Generalized Functional Pruning Optimal Partitioning (GFPOP) for Constrained Changepoint Detection in Genomic Data

    Authors: Toby Dylan Hocking, Guillem Rigaill, Paul Fearnhead, Guillaume Bourque

    Abstract: We describe a new algorithm and R package for peak detection in genomic data sets using constrained changepoint algorithms. These detect changes from background to peak regions by imposing the constraint that the mean should alternately increase then decrease. An existing algorithm for this problem exists, and gives state-of-the-art accuracy results, but it is computationally expensive when the nu…

    Submitted 28 September, 2018; originally announced October 2018.

  41. arXiv:1806.07137  [pdf, other]

    stat.CO cs.LG stat.ML

    Large-Scale Stochastic Sampling from the Probability Simplex

    Authors: Jack Baker, Paul Fearnhead, Emily B Fox, Christopher Nemeth

    Abstract: Stochastic gradient Markov chain Monte Carlo (SGMCMC) has become a popular method for scalable Bayesian inference. These methods are based on sampling a discrete-time approximation to a continuous time process, such as the Langevin diffusion. When applied to distributions defined on a constrained space the time-discretization error can dominate when we are near the boundary of the space. We demons…

    Submitted 26 October, 2018; v1 submitted 19 June, 2018; originally announced June 2018.

    Comments: Accepted to Advances in Neural Information Processing Systems (2018)

  42. arXiv:1806.01947  [pdf, other]

    stat.ML cs.LG stat.AP stat.ME

    A linear time method for the detection of point and collective anomalies

    Authors: Alexander T. M. Fisch, Idris A. Eckley, Paul Fearnhead

    Abstract: The challenge of efficiently identifying anomalies in data sequences is an important statistical problem that now arises in many applications. Whilst there has been substantial work aimed at making statistical analyses robust to outliers, or point anomalies, there has been much less work on detecting anomalous segments, or collective anomalies, particularly in those settings where point anomalies…

    Submitted 11 April, 2019; v1 submitted 5 June, 2018; originally announced June 2018.

  43. arXiv:1804.03963  [pdf, other]

    stat.ME stat.CO

    Motor Unit Number Estimation via Sequential Monte Carlo

    Authors: Simon Taylor, Chris Sherlock, Gareth Ridall, Paul Fearnhead

    Abstract: A change in the number of motor units that operate a particular muscle is an important indicator for the progress of a neuromuscular disease and the efficacy of a therapy. Inference for realistic statistical models of the typical data produced when testing muscle function is difficult, and estimating the number of motor units from these data is an ongoing statistical challenge. We consider a set o…

    Submitted 11 April, 2018; originally announced April 2018.

  44. arXiv:1802.07380  [pdf, other]

    stat.ME q-bio.NC stat.AP

    Fast Nonconvex Deconvolution of Calcium Imaging Data

    Authors: Sean Jewell, Toby Dylan Hocking, Paul Fearnhead, Daniela Witten

    Abstract: Calcium imaging data promises to transform the field of neuroscience by making it possible to record from large populations of neurons simultaneously. However, determining the exact moment in time at which a neuron spikes, from a calcium imaging data set, amounts to a non-trivial deconvolution problem which is of critical importance for downstream analyses. While a number of formulations have been…

    Submitted 20 February, 2018; originally announced February 2018.

    Comments: 30 pages, 9 figures

  45. arXiv:1712.06201  [pdf, ps, other]

    stat.ME math.PR stat.CO

    Continuous-time Importance Sampling: Monte Carlo Methods which Avoid Time-discretisation Error

    Authors: Paul Fearnhead, Krzystof Latuszynski, Gareth O. Roberts, Giorgos Sermaidis

    Abstract: In this paper we develop a continuous-time sequential importance sampling (CIS) algorithm which eliminates time-discretisation errors and provides online unbiased estimation for continuous time Markov processes, in particular for diffusions. Our work removes the strong conditions imposed by the EA and thus extends significantly the class of discretisation error-free MC methods for diffusions. The…

    Submitted 17 December, 2017; originally announced December 2017.

  46. arXiv:1710.00578  [pdf, other]

    stat.CO stat.AP stat.ML

    sgmcmc: An R Package for Stochastic Gradient Markov Chain Monte Carlo

    Authors: Jack Baker, Paul Fearnhead, Emily B. Fox, Christopher Nemeth

    Abstract: This paper introduces the R package sgmcmc, which can be used for Bayesian inference on problems with large datasets using stochastic gradient Markov chain Monte Carlo (SGMCMC). Traditional Markov chain Monte Carlo (MCMC) methods, such as Metropolis-Hastings, are known to run prohibitively slowly as the dataset size increases. SGMCMC solves this issue by only using a subset of data at each iterati…

    Submitted 13 April, 2018; v1 submitted 2 October, 2017; originally announced October 2017.

  47. Particle Filters and Data Assimilation

    Authors: Paul Fearnhead, Hans Künsch

    Abstract: State-space models can be used to incorporate subject knowledge on the underlying dynamics of a time series by the introduction of a latent Markov state-process. A user can specify the dynamics of this process together with how the state relates to partial and noisy observations that have been made. Inference and prediction then involves solving a challenging inverse problem: calculating the condi…

    Submitted 13 September, 2017; originally announced September 2017.

    Comments: To appear in `Annual Review of Statistics and Its Application'

  48. arXiv:1706.07712  [pdf, other]

    stat.ME math.ST stat.CO

    Asymptotics of ABC

    Authors: Paul Fearnhead

    Abstract: We present an informal review of recent work on the asymptotics of Approximate Bayesian Computation (ABC). In particular, we focus on how the ABC posterior, or point estimates obtained by ABC, behave in the limit as we have more data. The results we review show that ABC can perform well in terms of point estimation, but standard implementations will over-estimate the uncertainty about the para…

    Submitted 23 June, 2017; originally announced June 2017.

    Comments: This document is due to appear as a chapter of the forthcoming Handbook of Approximate Bayesian Computation (ABC) edited by S. Sisson, Y. Fan, and M. Beaumont

  49. arXiv:1706.05439  [pdf, other]

    stat.CO cs.LG stat.ML

    Control Variates for Stochastic Gradient MCMC

    Authors: Jack Baker, Paul Fearnhead, Emily B. Fox, Christopher Nemeth

    Abstract: It is well known that Markov chain Monte Carlo (MCMC) methods scale poorly with dataset size. A popular class of methods for solving this issue is stochastic gradient MCMC. These methods use a noisy estimate of the gradient of the log posterior, which reduces the per iteration computational cost of the algorithm. Despite this, there are a number of results suggesting that stochastic gradient Lange…

    Submitted 14 December, 2017; v1 submitted 16 June, 2017; originally announced June 2017.

  50. arXiv:1703.03352  [pdf, other]

    stat.CO q-bio.GN stat.ML

    A log-linear time algorithm for constrained changepoint detection

    Authors: Toby Dylan Hocking, Guillem Rigaill, Paul Fearnhead, Guillaume Bourque

    Abstract: Changepoint detection is a central problem in time series and genomic data. For some applications, it is natural to impose constraints on the directions of changes. One example is ChIP-seq data, for which adding an up-down constraint improves peak detection accuracy, but makes the optimization problem more complicated. We show how a recently proposed functional pruning technique can be adapted to…

    Submitted 9 March, 2017; originally announced March 2017.