-
Towards fast machine-learning-assisted Bayesian posterior inference of microseismic event location and source mechanism
Authors:
Davide Piras,
Alessio Spurio Mancini,
Ana M. G. Ferreira,
Benjamin Joachimi,
Michael P. Hobson
Abstract:
Bayesian inference applied to microseismic activity monitoring allows the accurate location of microseismic events from recorded seismograms and the estimation of the associated uncertainties. However, the forward modelling of these microseismic events, which is necessary to perform Bayesian source inversion, can be prohibitively expensive in terms of computational resources. A viable solution is to train a surrogate model based on machine learning techniques, to emulate the forward model and thus accelerate Bayesian inference. In this paper, we substantially enhance previous work, which considered only sources with isotropic moment tensors. We train a machine learning algorithm on the power spectrum of the recorded pressure wave and show that the trained emulator allows complete and fast event locations for $\textit{any}$ source mechanism. Moreover, we show that our approach is computationally inexpensive, as it can be run in less than 1 hour on a commercial laptop, while yielding accurate results using less than $10^4$ training seismograms. We additionally demonstrate how the trained emulators can be used to identify the source mechanism through the estimation of the Bayesian evidence. Finally, we demonstrate that our approach is robust to real noise as measured in field data. This work lays the foundations for efficient, accurate future joint determinations of event location and moment tensor, and associated uncertainties, which are ultimately key for accurately characterising human-induced and natural earthquakes, and for enhanced quantitative seismic hazard assessments.
Submitted 28 October, 2022; v1 submitted 12 January, 2021;
originally announced January 2021.
-
Accelerating Bayesian microseismic event location with deep learning
Authors:
A. Spurio Mancini,
D. Piras,
A. M. G. Ferreira,
M. P. Hobson,
B. Joachimi
Abstract:
We present a series of new open source deep learning algorithms to accelerate Bayesian full waveform point source inversion of microseismic events. Inferring the joint posterior probability distribution of moment tensor components and source location is key for rigorous uncertainty quantification. However, the inference process requires forward modelling of microseismic traces for each set of parameters explored by the sampling algorithm, which makes the inference very computationally intensive. In this paper we focus on accelerating this process by training deep learning models to learn the mapping between source location and seismic traces, for a given 3D heterogeneous velocity model, and a fixed isotropic moment tensor for the sources. These trained emulators replace the expensive solution of the elastic wave equation in the inference process. We compare our results with a previous study that used emulators based on Gaussian Processes to invert microseismic events. We show that all of our models provide more accurate predictions and $\sim 100$ times faster predictions than the method based on Gaussian Processes, and a $\mathcal{O}(10^5)$ speed-up factor over a pseudo-spectral method for waveform generation. For example, a 2-s long synthetic trace can be generated in $\sim 10$ ms on a common laptop processor, instead of $\sim$ 1 hr using a pseudo-spectral method on a high-profile Graphics Processing Units card. We also show that our inference results are in excellent agreement with those obtained from traditional location methods based on travel time estimates. The speed, accuracy and scalability of our open source deep learning models pave the way for extensions of these emulators to generic source mechanisms and application to joint Bayesian inversion of moment tensor components and source location using full waveforms.
Submitted 2 August, 2021; v1 submitted 14 September, 2020;
originally announced September 2020.
-
Dense output for highly oscillatory numerical solutions
Authors:
F. J. Agocs,
M. P. Hobson,
W. J. Handley,
A. N. Lasenby
Abstract:
We present a method to construct a continuous extension (otherwise known as dense output) for a numerical routine in the special case of the numerical solution being a scalar-valued function exhibiting rapid oscillations. Such cases call for numerical routines that make use of the known global behaviour of the solution, one example being methods using asymptotic expansions to forecast the solution at each step of the independent variable. An example is oscode, a numerical routine that uses the Wentzel-Kramers-Brillouin (WKB) approximation when the solution oscillates rapidly and otherwise behaves as a Runge-Kutta (RK) solver. Polynomial interpolation is not suitable for producing the solution at an arbitrary point mid-step, since efficient numerical methods based on the WKB approximation will step through multiple oscillations in a single step. Instead we construct the continuous solution by extending the numerical quadrature used in computing a WKB approximation of the solution with no additional evaluations of the differential equation or terms within, and provide an error estimate on this dense output. Finally, we draw attention to previous work on the continuous extension of Runge-Kutta formulae, and construct an extension to an RK method based on Gauss-Lobatto quadrature nodes, thus describing how to generate dense output from each of the methods underlying oscode.
Submitted 9 July, 2020;
originally announced July 2020.
-
An efficient method for solving highly oscillatory ordinary differential equations with applications to physical systems
Authors:
F. J. Agocs,
W. J. Handley,
A. N. Lasenby,
M. P. Hobson
Abstract:
We present a novel numerical routine (oscode) with a C++ and Python interface for the efficient solution of one-dimensional, second-order, ordinary differential equations with rapidly oscillating solutions. The method is based on a Runge-Kutta-like stepping procedure that makes use of the Wentzel-Kramers-Brillouin (WKB) approximation to skip regions of integration where the characteristic frequency varies slowly. In regions where this is not the case, the method is able to switch to a made-to-measure Runge-Kutta integrator that minimises the total number of function evaluations. We demonstrate the effectiveness of the method with example solutions of the Airy equation and an equation exhibiting a burst of oscillations, discussing the error properties of the method in detail. We then show the method applied to physical systems. First, the one-dimensional, time-independent Schrödinger equation is solved as part of a shooting method to search for the energy eigenvalues for a potential with quartic anharmonicity. Then, the method is used to solve the Mukhanov-Sasaki equation describing the evolution of cosmological perturbations, and the primordial power spectrum of the perturbations is computed in different cosmological scenarios. We compare the performance of our solver in calculating a primordial power spectrum of scalar perturbations to that of BINGO, an efficient code specifically designed for such applications.
Submitted 13 December, 2019; v1 submitted 30 May, 2019;
originally announced June 2019.
-
Fast GPU-Based Seismogram Simulation from Microseismic Events in Marine Environments Using Heterogeneous Velocity Models
Authors:
Saptarshi Das,
Xi Chen,
Michael P. Hobson
Abstract:
A novel approach is presented for fast generation of synthetic seismograms due to microseismic events, using heterogeneous marine velocity models. The partial differential equations (PDEs) for the 3D elastic wave equation have been numerically solved using the Fourier domain pseudo-spectral method, which is parallelizable on graphics processing unit (GPU) cards, thus making it faster compared to traditional CPU based computing platforms. Because forward simulation of large geological models is computationally expensive, several combinations of individual synthetic seismic traces are used for specified microseismic event locations, in order to simulate the effect of realistic microseismic activity patterns in the subsurface. We here explore the patterns generated by a few hundred microseismic events with different source mechanisms using various combinations, both in event amplitudes and origin times, using the simulated pressure and three component particle velocity fields via 1D, 2D and 3D seismic visualizations.
Submitted 13 May, 2017;
originally announced May 2017.
-
The Runge-Kutta-Wentzel-Kramers-Brillouin Method
Authors:
W. J. Handley,
A. N. Lasenby,
M. P. Hobson
Abstract:
We demonstrate the effectiveness of a novel scheme for numerically solving linear differential equations whose solutions exhibit extreme oscillation. We take a standard Runge-Kutta approach, but replace the Taylor expansion formula with a Wentzel-Kramers-Brillouin method. The method is demonstrated by application to the Airy equation, along with a more complicated burst-oscillation case. Finally, we compare our scheme to existing approaches.
Submitted 9 December, 2016; v1 submitted 3 December, 2016;
originally announced December 2016.
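As a rough illustration of why a WKB-based step suits the Airy equation mentioned in the abstract above, the sketch below compares the leading-order WKB form of Ai(-t) with a brute-force fixed-step RK4 integration of y'' = -t y. It is not the RKWKB scheme itself; the function names, the step size and the evaluation point t = 30 are all assumptions chosen for the example.

```python
import math


def airy_rk4(t_end, h=1e-3):
    """Integrate y'' = -t*y from t = 0 with classical RK4, where y(t) = Ai(-t)."""
    # Exact initial data: Ai(0) and d/dt Ai(-t) at t = 0, which equals -Ai'(0).
    y = 1.0 / (3.0 ** (2.0 / 3.0) * math.gamma(2.0 / 3.0))
    dy = 1.0 / (3.0 ** (1.0 / 3.0) * math.gamma(1.0 / 3.0))
    n_steps = round(t_end / h)
    for k in range(n_steps):
        t = k * h
        # Four RK4 stages for the first-order system (y, dy).
        k1y, k1v = dy, -t * y
        k2y, k2v = dy + 0.5 * h * k1v, -(t + 0.5 * h) * (y + 0.5 * h * k1y)
        k3y, k3v = dy + 0.5 * h * k2v, -(t + 0.5 * h) * (y + 0.5 * h * k2y)
        k4y, k4v = dy + h * k3v, -(t + h) * (y + h * k3y)
        y += h * (k1y + 2.0 * k2y + 2.0 * k3y + k4y) / 6.0
        dy += h * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
    return y


def airy_wkb(t):
    """Leading-order WKB (large-t asymptotic) form of Ai(-t)."""
    phase = (2.0 / 3.0) * t ** 1.5 + math.pi / 4.0
    return math.sin(phase) / (math.sqrt(math.pi) * t ** 0.25)


t_eval = 30.0
y_rk4 = airy_rk4(t_eval)   # 30,000 small steps
y_wkb = airy_wkb(t_eval)   # one closed-form evaluation
print(y_rk4, y_wkb, abs(y_rk4 - y_wkb))
```

The RK4 loop needs tens of thousands of steps to cross the oscillatory region, whereas the WKB expression is evaluated once; at large t the two agree to within the size of the next term in the asymptotic series.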
-
SKYNET: an efficient and robust neural network training tool for machine learning in astronomy
Authors:
Philip Graff,
Farhan Feroz,
Michael P. Hobson,
Anthony N. Lasenby
Abstract:
We present the first public release of our generic neural network training algorithm, called SkyNet. This efficient and robust machine learning tool is able to train large and deep feed-forward neural networks, including autoencoders, for use in a wide range of supervised and unsupervised learning applications, such as regression, classification, density estimation, clustering and dimensionality reduction. SkyNet uses a `pre-training' method to obtain a set of network parameters that has empirically been shown to be close to a good solution, followed by further optimisation using a regularised variant of Newton's method, where the level of regularisation is determined and adjusted automatically; the latter uses second-order derivative information to improve convergence, but without the need to evaluate or store the full Hessian matrix, by using a fast approximate method to calculate Hessian-vector products. This combination of methods allows for the training of complicated networks that are difficult to optimise using standard backpropagation techniques. SkyNet employs convergence criteria that naturally prevent overfitting, and also includes a fast algorithm for estimating the accuracy of network outputs. The utility and flexibility of SkyNet are demonstrated by application to a number of toy problems, and to astronomical problems focusing on the recovery of structure from blurred and noisy images, the identification of gamma-ray bursters, and the compression and denoising of galaxy images. The SkyNet software, which is implemented in standard ANSI C and fully parallelised using MPI, is available at http://www.mrao.cam.ac.uk/software/skynet/.
Submitted 27 January, 2014; v1 submitted 3 September, 2013;
originally announced September 2013.
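The SkyNet abstract above mentions computing Hessian-vector products without forming the full Hessian. One generic way to do this, shown here only as a sketch and not necessarily the method SkyNet uses, is a central finite difference of the gradient along the direction v. The toy loss, its matrix A and the step size eps are assumptions made for the example.

```python
def gradient(w):
    """Gradient of the toy loss f(w) = sum(w_i^4)/4 + 0.5 * w^T A w (A is an assumed example)."""
    A = [[2.0, 0.5], [0.5, 1.0]]
    n = len(w)
    return [w[i] ** 3 + sum(A[i][j] * w[j] for j in range(n)) for i in range(n)]


def hessian_vector_product(grad_fn, w, v, eps=1e-4):
    """Approximate H(w) @ v using two gradient evaluations and no explicit Hessian."""
    w_plus = [wi + eps * vi for wi, vi in zip(w, v)]
    w_minus = [wi - eps * vi for wi, vi in zip(w, v)]
    g_plus = grad_fn(w_plus)
    g_minus = grad_fn(w_minus)
    # Central difference of the gradient along v.
    return [(gp - gm) / (2.0 * eps) for gp, gm in zip(g_plus, g_minus)]


w = [1.0, -2.0]
v = [0.3, 0.7]
hv = hessian_vector_product(gradient, w, v)
print(hv)
```

For this toy loss the Hessian at w is diag(3 w_i^2) + A, so the exact product is [1.85, 9.25]; the finite-difference result should match it closely at a cost of two gradient calls, independent of the number of parameters.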
-
Importance Nested Sampling and the MultiNest Algorithm
Authors:
F. Feroz,
M. P. Hobson,
E. Cameron,
A. N. Pettitt
Abstract:
Bayesian inference involves two main computational challenges. First, in estimating the parameters of some model for the data, the posterior distribution may well be highly multi-modal: a regime in which the convergence to stationarity of traditional Markov Chain Monte Carlo (MCMC) techniques becomes incredibly slow. Second, in selecting between a set of competing models the necessary estimation of the Bayesian evidence for each is, by definition, a (possibly high-dimensional) integration over the entire parameter space; again this can be a daunting computational task, although new Monte Carlo (MC) integration algorithms offer solutions of ever increasing efficiency. Nested sampling (NS) is one such contemporary MC strategy targeted at calculation of the Bayesian evidence, but which also enables posterior inference as a by-product, thereby allowing simultaneous parameter estimation and model selection. The widely-used MultiNest algorithm presents a particularly efficient implementation of the NS technique for multi-modal posteriors. In this paper we discuss importance nested sampling (INS), an alternative summation of the MultiNest draws, which can calculate the Bayesian evidence at up to an order of magnitude higher accuracy than `vanilla' NS with no change in the way MultiNest explores the parameter space. This is accomplished by treating as a (pseudo-)importance sample the totality of points collected by MultiNest, including those previously discarded under the constrained likelihood sampling of the NS algorithm. We apply this technique to several challenging test problems and compare the accuracy of Bayesian evidences obtained with INS against those from vanilla NS.
Submitted 26 November, 2019; v1 submitted 10 June, 2013;
originally announced June 2013.
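To make the evidence calculation described in the abstract above concrete, here is a minimal sketch of vanilla nested sampling on a one-dimensional toy problem. It is neither MultiNest nor importance nested sampling. The Gaussian likelihood, the uniform prior on [-0.5, 0.5], and every name and setting are assumptions for the example, and the constrained-prior draw exploits the fact that this toy likelihood is monotonic in |x|, which real problems do not allow.

```python
import math
import random

SIGMA = 0.05       # width of the assumed toy Gaussian likelihood
HALF_WIDTH = 0.5   # uniform prior on [-HALF_WIDTH, HALF_WIDTH]


def log_likelihood(x):
    return -0.5 * (x / SIGMA) ** 2


def log_add(a, b):
    """Return log(exp(a) + exp(b)) without overflow."""
    if a == -math.inf:
        return b
    if b == -math.inf:
        return a
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def nested_sampling_log_evidence(n_live=500, n_iter=4000, seed=1):
    rng = random.Random(seed)
    live_x = [rng.uniform(-HALF_WIDTH, HALF_WIDTH) for _ in range(n_live)]
    live_logl = [log_likelihood(x) for x in live_x]
    # Prior mass shrinks by exp(-1/n_live) per iteration on average,
    # so each shell has log-width log_x_prev + log(1 - exp(-1/n_live)).
    log_shrink = math.log(-math.expm1(-1.0 / n_live))
    log_z = -math.inf
    log_x_prev = 0.0
    for i in range(1, n_iter + 1):
        worst = min(range(n_live), key=live_logl.__getitem__)
        logl_star = live_logl[worst]
        log_z = log_add(log_z, logl_star + log_x_prev + log_shrink)
        log_x_prev = -i / n_live
        # Toy shortcut: L > L* is equivalent to |x| < r, so draw uniformly there.
        r = min(HALF_WIDTH, SIGMA * math.sqrt(-2.0 * logl_star))
        live_x[worst] = rng.uniform(-r, r)
        live_logl[worst] = log_likelihood(live_x[worst])
    # Add the contribution of the remaining live points.
    mean_l = sum(math.exp(l) for l in live_logl) / n_live
    return log_add(log_z, log_x_prev + math.log(mean_l))


log_z_est = nested_sampling_log_evidence()
log_z_true = math.log(
    SIGMA * math.sqrt(2.0 * math.pi)
    * math.erf(HALF_WIDTH / (SIGMA * math.sqrt(2.0)))
    / (2.0 * HALF_WIDTH)
)
print(log_z_est, log_z_true)
```

The discarded points from each iteration are exactly the samples that, suitably weighted, also give the posterior; the estimate of ln Z should agree with the analytic value to within the usual sampling scatter of order sqrt(H / n_live), where H is the information gain from prior to posterior.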
-
BAMBI: blind accelerated multimodal Bayesian inference
Authors:
Philip Graff,
Farhan Feroz,
Michael P. Hobson,
Anthony Lasenby
Abstract:
In this paper we present an algorithm for rapid Bayesian analysis that combines the benefits of nested sampling and artificial neural networks. The blind accelerated multimodal Bayesian inference (BAMBI) algorithm implements the MultiNest package for nested sampling as well as the training of an artificial neural network (NN) to learn the likelihood function. In the case of computationally expensive likelihoods, this allows the substitution of a much more rapid approximation in order to increase significantly the speed of the analysis. We begin by demonstrating, with a few toy examples, the ability of a NN to learn complicated likelihood surfaces. BAMBI's ability to decrease running time for Bayesian inference is then demonstrated in the context of estimating cosmological parameters from Wilkinson Microwave Anisotropy Probe and other observations. We show that valuable speed increases are achieved in addition to obtaining NNs trained on the likelihood functions for the different model and data combinations. These NNs can then be used for an even faster follow-up analysis using the same likelihood and different priors. This is a fully general algorithm that can be applied, without any pre-processing, to other problems with computationally expensive likelihood functions.
Submitted 17 February, 2012; v1 submitted 13 October, 2011;
originally announced October 2011.
-
Comment on "Bayesian evidence: can we beat MultiNest using traditional MCMC methods", by Rutger van Haasteren (arXiv:0911.2150)
Authors:
F. Feroz,
M. P. Hobson,
R. Trotta
Abstract:
In arXiv:0911.2150, Rutger van Haasteren seeks to criticize the nested sampling algorithm for Bayesian data analysis in general and its MultiNest implementation in particular. He introduces a new method for evidence evaluation based on the idea of Voronoi tessellation and requiring samples from the posterior distribution obtained through MCMC-based methods. He compares its accuracy and efficiency with MultiNest, concluding that it outperforms MultiNest in several cases. This comparison is completely unfair since the proposed method cannot perform the complete Bayesian data analysis, including posterior exploration and evidence evaluation, on its own, while MultiNest allows one to perform Bayesian data analysis end to end. Furthermore, his criticism of nested sampling (and in turn MultiNest) is based on a few conceptual misunderstandings of the algorithm. Here we seek to set the record straight.
Submitted 8 January, 2010; v1 submitted 5 January, 2010;
originally announced January 2010.