-
Robust and integrative Bayesian neural networks for likelihood-free parameter inference
Authors:
Fredrik Wrede,
Robin Eriksson,
Richard Jiang,
Linda Petzold,
Stefan Engblom,
Andreas Hellander,
Prashant Singh
Abstract:
State-of-the-art neural network-based methods for learning summary statistics have delivered promising results for simulation-based likelihood-free parameter inference. Existing approaches require density estimation as a post-processing step building upon deterministic neural networks, and do not take network prediction uncertainty into account. This work proposes a robust integrated approach that learns summary statistics using Bayesian neural networks, and directly estimates the posterior density using categorical distributions. An adaptive sampling scheme selects simulation locations to efficiently and iteratively refine the predictive posterior of the network conditioned on observations. This allows for more efficient and robust convergence on comparatively large prior spaces. We demonstrate our approach on benchmark examples and compare against related methods.
Submitted 7 May, 2021; v1 submitted 12 February, 2021;
originally announced February 2021.
-
Distributed and Adaptive Fast Multipole Method In Three Dimensions
Authors:
Jonathan Bull,
Stefan Engblom
Abstract:
We develop a general distributed implementation of an adaptive fast multipole method in three space dimensions. We rely on a balanced type of adaptive space discretisation which supports a highly transparent and fully distributed implementation. A complexity analysis indicates favorable scaling properties and numerical experiments on up to 512 cores and 1 billion source points verify them. The parameters controlling the algorithm are subject to in-depth experiments and the performance response to the input parameters implies that the overall implementation is well-suited to automated tuning.
Submitted 12 February, 2020;
originally announced February 2020.
-
Flash X-ray diffraction imaging in 3D: a proposed analysis pipeline
Authors:
Jing Liu,
Stefan Engblom,
Carl Nettelblad
Abstract:
Modern Flash X-ray diffraction Imaging (FXI) acquires diffraction signals from single biomolecules at a high repetition rate from X-ray Free Electron Lasers (XFELs), easily obtaining millions of 2D diffraction patterns from a single experiment. Due to the stochastic nature of FXI experiments and the massive volumes of data, retrieving 3D electron densities from raw 2D diffraction patterns is a challenging and time-consuming task.
We propose a semi-automatic data-analysis pipeline for FXI experiments, which includes four steps: hit finding and preliminary filtering, pattern classification, 3D Fourier reconstruction, and post-analysis. We also include a recently developed bootstrap methodology in the post-analysis step for uncertainty analysis and quality control. To achieve the best possible resolution, we further suggest using background subtraction, signal windowing, and convex optimization techniques when retrieving the Fourier phases in the post-analysis step.
As an application example, we quantified the 3D electron structure of the PR772 virus using the proposed data-analysis pipeline. The retrieved structure was above the detector-edge resolution and clearly showed the pseudo-icosahedral capsid of the PR772.
Submitted 12 February, 2020; v1 submitted 30 October, 2019;
originally announced October 2019.
-
Supervised Classification Methods for Flash X-ray single particle diffraction Imaging
Authors:
Jing Liu,
Gijs van der Schot,
Stefan Engblom
Abstract:
Current Flash X-ray single-particle diffraction Imaging (FXI) experiments, which operate on modern X-ray Free Electron Lasers (XFELs), can record millions of interpretable diffraction patterns from individual biomolecules per day. Due to the stochastic nature of the XFELs, those patterns will to a varying degree include scatterings from contaminated samples. Also, the heterogeneity of the sample biomolecules is unavoidable and complicates data processing. Reducing the data volumes and selecting high-quality single-molecule patterns are therefore critical steps in the experimental set-up.
In this paper, we present two supervised template-based learning methods for classifying FXI patterns. Our Eigen-Image and Log-Likelihood classifiers can find the best-matched template for a single-molecule pattern within a few milliseconds. It is also straightforward to parallelize them so as to fully match the XFEL repetition rate, thereby enabling on-site processing.
Submitted 25 October, 2018;
originally announced October 2018.
-
Stochastic simulation of pattern formation in growing tissue: a multilevel approach
Authors:
Stefan Engblom
Abstract:
We take up the challenge of designing realistic computational models of large interacting cell populations. The goal is essentially to bring Gillespie's celebrated stochastic methodology to the level of an interacting population of cells. Specifically, we are interested in how the gold standard of single cell computational modeling, here taken to be spatial stochastic reaction-diffusion models, may be efficiently coupled with a similar approach at the cell population level.
Concretely, we target a recently proposed set of pathways for pattern formation involving Notch-Delta signaling mechanisms. These involve cell-to-cell communication as mediated both via direct membrane contact sites as well as via cellular protrusions. We explain how to simulate the process in growing tissue using a multilevel approach and we discuss implications for future development of the associated computational methods.
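For readers unfamiliar with the Gillespie methodology mentioned in this abstract, the following is a minimal sketch of the direct method on a single well-mixed birth-death process. It is not the multilevel scheme of the paper: the model, the function name `gillespie_birth_death`, and all rate parameters are hypothetical and chosen only for illustration.

```python
import math
import random


def gillespie_birth_death(x0, birth, death, t_end, seed=0):
    """Simulate one birth-death trajectory with Gillespie's direct method.

    Hypothetical toy model (an assumption, not taken from the paper):
      birth event: X -> X + 1 with propensity `birth`
      death event: X -> X - 1 with propensity `death * X`
    Returns the lists of event times and of states after each event.
    """
    rng = random.Random(seed)
    t, x = 0.0, x0
    times, states = [t], [x]
    while True:
        a_birth = birth
        a_death = death * x
        a_total = a_birth + a_death
        if a_total <= 0.0:
            break  # no event can fire any more
        # Exponentially distributed waiting time; 1 - random() lies in (0, 1].
        t += -math.log(1.0 - rng.random()) / a_total
        if t > t_end:
            break
        # Select the event with probability proportional to its propensity.
        if rng.random() * a_total < a_birth:
            x += 1
        else:
            x -= 1
        times.append(t)
        states.append(x)
    return times, states


if __name__ == "__main__":
    ts, xs = gillespie_birth_death(x0=10, birth=5.0, death=0.5, t_end=20.0, seed=1)
    print(len(ts), xs[-1])
```

Each step draws a waiting time from the total propensity and then picks which event fires, which is the basic building block that spatial and population-level extensions build upon.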
Submitted 7 June, 2018; v1 submitted 3 February, 2018;
originally announced February 2018.
-
Assessing Uncertainties in X-ray Single-particle Three-dimensional reconstructions
Authors:
Stefan Engblom,
Carl Nettelblad,
Jing Liu
Abstract:
Modern technology for producing extremely bright and coherent X-ray laser pulses provides the possibility to acquire a large number of diffraction patterns from individual biological nanoparticles, including proteins, viruses, and DNA. These two-dimensional diffraction patterns can be practically reconstructed and retrieved down to a resolution of a few ångströms. In principle, a sufficiently large collection of diffraction patterns will contain the required information for a full three-dimensional reconstruction of the biomolecule. The computational methodology for this reconstruction task is still under development and highly resolved reconstructions have not yet been produced.
We analyze the Expansion-Maximization-Compression scheme, the current state-of-the-art approach for this very challenging application, by isolating different sources of uncertainty. Through numerical experiments on synthetic data we evaluate their respective impact. We reach conclusions of relevance for handling actual experimental data, as well as pointing out certain improvements to the underlying estimation algorithm.
We also introduce a practically applicable computational methodology in the form of bootstrap procedures for assessing reconstruction uncertainty in the real data case. We evaluate the sharpness of this approach and argue that this type of procedure will be critical in the near future when handling the increasing amount of data.
Submitted 2 January, 2017;
originally announced January 2017.
-
Fast event-based epidemiological simulations on national scales
Authors:
Pavol Bauer,
Stefan Engblom,
Stefan Widgren
Abstract:
We present a computational modeling framework for data-driven simulations and analysis of infectious disease spread in large populations. For the purpose of efficient simulations, we devise a parallel solution algorithm targeting multi-socket shared memory architectures. The model integrates infectious dynamics as continuous-time Markov chains, and available data such as animal movements or aging are incorporated as externally defined events. To bring out parallelism and accelerate the computations, we decompose the spatial domain and optimize cross-boundary communication using dependency-aware task scheduling. Using registered livestock data at a high spatio-temporal resolution, we demonstrate that our approach is not only resilient to varying model configurations, but also scales on all physical cores at realistic workloads. Finally, we show that these very features enable the solution of inverse problems on national scales.
Submitted 27 January, 2016; v1 submitted 10 February, 2015;
originally announced February 2015.
-
Machine learning for ultrafast X-ray diffraction patterns on large-scale GPU clusters
Authors:
Tomas Ekeberg,
Stefan Engblom,
Jing Liu
Abstract:
The classical method of determining the atomic structure of complex molecules by analyzing diffraction patterns is currently undergoing drastic developments. Modern techniques for producing extremely bright and coherent X-ray lasers allow a beam of streaming particles to be intercepted and hit by an ultrashort high energy X-ray beam. Through machine learning methods the data thus collected can be transformed into a three-dimensional volumetric intensity map of the particle itself. The computational complexity associated with this problem is so high that clusters of data-parallel accelerators are required.
We have implemented a distributed and highly efficient algorithm for inversion of large collections of diffraction patterns targeting clusters of hundreds of GPUs. With the expected enormous amount of diffraction data to be produced in the foreseeable future, this is the required scale to approach real time processing of data at the beam site. Using both real and synthetic data we look at the scaling properties of the application and discuss the overall computational viability of this exciting and novel imaging technique.
Submitted 16 December, 2014; v1 submitted 11 September, 2014;
originally announced September 2014.
-
Fast Matlab compatible sparse assembly on multicore computers
Authors:
Stefan Engblom,
Dimitar Lukarski
Abstract:
We develop and implement in this paper a fast sparse assembly algorithm, the fundamental operation which creates a compressed matrix from raw index data. Since it is often a quite demanding and sometimes critical operation, it is of interest to design a highly efficient implementation. We show how to do this, and moreover, we show how our implementation can be parallelized to utilize the power of modern multicore computers. Our freely available code, fully Matlab compatible, achieves a speedup of about a factor of 5 on a typical 6-core machine and of about 10 on a dual-socket 16-core machine compared to the built-in serial implementation.
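To make the assembly operation concrete, here is a minimal serial sketch that builds compressed sparse column (CSC) arrays from raw index triplets and sums duplicate entries, as Matlab's `sparse(i, j, v)` does. It is not the authors' algorithm or their parallel implementation: the function name `assemble_csc` is hypothetical, indices are 0-based rather than Matlab's 1-based, and a plain sort stands in for whatever ordering strategy the paper uses.

```python
def assemble_csc(rows, cols, vals, n_rows, n_cols):
    """Assemble CSC arrays (col_ptr, row_idx, data) from index triplets.

    Entries that share the same (row, col) position are summed.
    Indices are 0-based (an assumption of this sketch).
    """
    if not (len(rows) == len(cols) == len(vals)):
        raise ValueError("rows, cols and vals must have equal length")

    # Order the triplets column by column, then by row within a column.
    order = sorted(range(len(vals)), key=lambda k: (cols[k], rows[k]))

    col_ptr = [0] * (n_cols + 1)  # first holds per-column counts
    row_idx, data = [], []
    previous = None
    for k in order:
        r, c = rows[k], cols[k]
        if not (0 <= r < n_rows and 0 <= c < n_cols):
            raise IndexError(f"index ({r}, {c}) out of bounds")
        if previous == (r, c):
            data[-1] += vals[k]  # duplicate position: accumulate
        else:
            row_idx.append(r)
            data.append(vals[k])
            col_ptr[c + 1] += 1
            previous = (r, c)

    # Prefix sum turns per-column counts into column start offsets.
    for c in range(n_cols):
        col_ptr[c + 1] += col_ptr[c]
    return col_ptr, row_idx, data


if __name__ == "__main__":
    print(assemble_csc([0, 2, 0, 1], [0, 1, 0, 1], [1.0, 2.0, 3.0, 4.0], 3, 2))
```

Column `c` of the result occupies the slice `col_ptr[c]:col_ptr[c + 1]` of `row_idx` and `data`, which is the layout that makes column-wise access cheap.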
Submitted 23 October, 2015; v1 submitted 4 June, 2014;
originally announced June 2014.
-
Dynamic autotuning of adaptive fast multipole methods on hybrid multicore CPU & GPU systems
Authors:
Marcus Holm,
Stefan Engblom,
Anders Goude,
Sverker Holmgren
Abstract:
We discuss an implementation of adaptive fast multipole methods targeting hybrid multicore CPU- and GPU-systems. From previous experiences with the computational profile of our version of the fast multipole algorithm, suitable parts are off-loaded to the GPU, while the remaining parts are threaded and executed concurrently by the CPU. The parameters defining the algorithm affect the performance, and by measuring this effect we are able to dynamically balance the algorithm towards optimal performance. Our setup uses the dynamic nature of the computations and is therefore of general character.
Submitted 17 March, 2014; v1 submitted 5 November, 2013;
originally announced November 2013.
-
Adaptive fast multipole methods on the GPU
Authors:
Anders Goude,
Stefan Engblom
Abstract:
We present a highly general implementation of fast multipole methods on graphics processing units (GPUs). Our two-dimensional double precision code features an asymmetric type of adaptive space discretization leading to a particularly elegant and flexible implementation. All steps of the multipole algorithm are efficiently performed on the GPU, including the initial phase which assembles the topological information of the input data. Through careful timing experiments we investigate the effects of the various peculiarities of the GPU architecture.
Submitted 8 October, 2012; v1 submitted 21 May, 2012;
originally announced May 2012.
-
On well-separated sets and fast multipole methods
Authors:
Stefan Engblom
Abstract:
The notion of well-separated sets is crucial in fast multipole methods as the main idea is to approximate the interaction between such sets via cluster expansions. We revisit the one-parameter multipole acceptance criterion in a general setting and derive a relative error estimate. This analysis benefits asymmetric versions of the method, where the division of the multipole boxes is more liberal than in conventional codes. Such variants offer a particularly elegant implementation with a balanced multipole tree, a feature which might be very favorable on modern computer architectures.
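As an illustration of what a one-parameter acceptance criterion can look like, the sketch below tests whether two bounding spheres are well-separated. The abstract does not state the criterion, so the specific form used here, R + θ·r ≤ θ·d with R the larger radius, r the smaller radius, d the distance between centers and θ in (0, 1), is an assumption, and the function name `well_separated` is hypothetical.

```python
import math


def well_separated(center_a, radius_a, center_b, radius_b, theta=0.5):
    """Return True if two bounding spheres pass the assumed theta-criterion.

    Assumed form (not quoted from the abstract): R + theta * r <= theta * d,
    where R = max radius, r = min radius, d = distance between centers.
    """
    if not 0.0 < theta < 1.0:
        raise ValueError("theta must lie strictly between 0 and 1")
    d = math.dist(center_a, center_b)
    big, small = max(radius_a, radius_b), min(radius_a, radius_b)
    return big + theta * small <= theta * d


if __name__ == "__main__":
    print(well_separated((0.0, 0.0), 1.0, (4.0, 0.0), 1.0))
    print(well_separated((0.0, 0.0), 1.0, (2.5, 0.0), 1.0))
```

Because the two radii enter differently, boxes of unequal size can be accepted at a shorter distance than two boxes of the larger size, which is the kind of flexibility an asymmetric subdivision can exploit.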
Submitted 10 August, 2011; v1 submitted 11 June, 2010;
originally announced June 2010.