-
Simulation-Based Inference for Global Health Decisions
Authors:
Christian Schroeder de Witt,
Bradley Gram-Hansen,
Nantas Nardelli,
Andrew Gambardella,
Rob Zinkov,
Puneet Dokania,
N. Siddharth,
Ana Belen Espinosa-Gonzalez,
Ara Darzi,
Philip Torr,
Atılım Güneş Baydin
Abstract:
The COVID-19 pandemic has highlighted the importance of in-silico epidemiological modelling in predicting the dynamics of infectious diseases to inform health policy and decision makers about suitable prevention and containment strategies. Work in this setting involves solving challenging inference and control problems in individual-based models of ever increasing complexity. Here we discuss recent breakthroughs in machine learning, specifically in simulation-based inference, and explore its potential as a novel venue for model calibration to support the design and evaluation of public health interventions. To further stimulate research, we are developing software interfaces that turn two cornerstone COVID-19 and malaria epidemiology models, COVID-sim (https://github.com/mrc-ide/covid-sim/) and OpenMalaria (https://github.com/SwissTPH/openmalaria), into probabilistic programs, enabling efficient interpretable Bayesian inference within those simulators.
Submitted 14 May, 2020;
originally announced May 2020.
-
Amortized Rejection Sampling in Universal Probabilistic Programming
Authors:
Saeid Naderiparizi,
Adam Ścibior,
Andreas Munk,
Mehrdad Ghadiri,
Atılım Güneş Baydin,
Bradley Gram-Hansen,
Christian Schroeder de Witt,
Robert Zinkov,
Philip H. S. Torr,
Tom Rainforth,
Yee Whye Teh,
Frank Wood
Abstract:
Naive approaches to amortized inference in probabilistic programs with unbounded loops can produce estimators with infinite variance. This is particularly true of importance sampling inference in programs that explicitly include rejection sampling as part of the user-programmed generative procedure. In this paper we develop a new and efficient amortized importance sampling estimator. We prove finite variance of our estimator and empirically demonstrate our method's correctness and efficiency compared to existing alternatives on generative programs containing rejection sampling loops and discuss how to implement our method in a generic probabilistic programming framework.
Submitted 28 March, 2022; v1 submitted 20 October, 2019;
originally announced October 2019.
-
Etalumis: Bringing Probabilistic Programming to Scientific Simulators at Scale
Authors:
Atılım Güneş Baydin,
Lei Shao,
Wahid Bhimji,
Lukas Heinrich,
Lawrence Meadows,
Jialin Liu,
Andreas Munk,
Saeid Naderiparizi,
Bradley Gram-Hansen,
Gilles Louppe,
Mingfei Ma,
Xiaohui Zhao,
Philip Torr,
Victor Lee,
Kyle Cranmer,
Prabhat,
Frank Wood
Abstract:
Probabilistic programming languages (PPLs) are receiving widespread attention for performing Bayesian inference in complex generative models. However, applications to science remain limited because of the impracticability of rewriting complex scientific simulators in a PPL, the computational cost of inference, and the lack of scalable implementations. To address these, we present a novel PPL framework that couples directly to existing scientific simulators through a cross-platform probabilistic execution protocol and provides Markov chain Monte Carlo (MCMC) and deep-learning-based inference compilation (IC) engines for tractable inference. To guide IC inference, we perform distributed training of a dynamic 3DCNN--LSTM architecture with a PyTorch-MPI-based framework on 1,024 32-core CPU nodes of the Cori supercomputer with a global minibatch size of 128k: achieving a performance of 450 Tflop/s through enhancements to PyTorch. We demonstrate a Large Hadron Collider (LHC) use-case with the C++ Sherpa simulator and achieve the largest-scale posterior inference in a Turing-complete PPL.
Submitted 27 August, 2019; v1 submitted 7 July, 2019;
originally announced July 2019.
-
Hijacking Malaria Simulators with Probabilistic Programming
Authors:
Bradley Gram-Hansen,
Christian Schröder de Witt,
Tom Rainforth,
Philip H. S. Torr,
Yee Whye Teh,
Atılım Güneş Baydin
Abstract:
Epidemiology simulations have become a fundamental tool in the fight against the epidemics of various infectious diseases like AIDS and malaria. However, the complicated and stochastic nature of these simulators can mean their output is difficult to interpret, which reduces their usefulness to policymakers. In this paper, we introduce an approach that allows one to treat a large class of population-based epidemiology simulators as probabilistic generative models. This is achieved by hijacking the internal random number generator calls, through the use of a universal probabilistic programming system (PPS). In contrast to other methods, our approach can be easily retrofitted to simulators written in popular industrial programming frameworks. We demonstrate that our method can be used for interpretable introspection and inference, thus shedding light on black-box simulators. This reinstates much-needed trust between policymakers and evidence-based methods.
Submitted 29 May, 2019;
originally announced May 2019.
-
LF-PPL: A Low-Level First Order Probabilistic Programming Language for Non-Differentiable Models
Authors:
Yuan Zhou,
Bradley J. Gram-Hansen,
Tobias Kohn,
Tom Rainforth,
Hongseok Yang,
Frank Wood
Abstract:
We develop a new Low-level, First-order Probabilistic Programming Language (LF-PPL) suited for models containing a mix of continuous, discrete, and/or piecewise-continuous variables. The key success of this language and its compilation scheme is in its ability to automatically distinguish parameters the density function is discontinuous with respect to, while further providing runtime checks for boundary crossings. This enables the introduction of new inference engines that are able to exploit gradient information, while remaining efficient for models which are not everywhere differentiable. We demonstrate this ability by incorporating a discontinuous Hamiltonian Monte Carlo (DHMC) inference engine that is able to deliver automated and efficient inference for non-differentiable models. Our system is backed up by a mathematical formalism that ensures that any model expressed in this language has a density with measure zero discontinuities to maintain the validity of the inference engine.
Submitted 6 March, 2019;
originally announced March 2019.
-
Mapping Informal Settlements in Developing Countries using Machine Learning and Low Resolution Multi-spectral Data
Authors:
Bradley Gram-Hansen,
Patrick Helber,
Indhu Varatharajan,
Faiza Azam,
Alejandro Coca-Castro,
Veronika Kopackova,
Piotr Bilinski
Abstract:
Informal settlements are home to the most socially and economically vulnerable people on the planet. In order to deliver effective economic and social aid, non-government organizations (NGOs), such as the United Nations Children's Fund (UNICEF), require detailed maps of the locations of informal settlements. However, data regarding informal and formal settlements is primarily unavailable and if available is often incomplete. This is due, in part, to the cost and complexity of gathering data on a large scale. To address these challenges, we, in this work, provide three contributions. 1) A brand new machine learning data-set, purposely developed for informal settlement detection. 2) We show that it is possible to detect informal settlements using freely available low-resolution (LR) data, in contrast to previous studies that use very-high resolution (VHR) satellite and aerial imagery, something that is cost-prohibitive for NGOs. 3) We demonstrate two effective classification schemes on our curated data set, one that is cost-efficient for NGOs and another that is cost-prohibitive for NGOs, but has additional utility. We integrate these schemes into a semi-automated pipeline that converts either a LR or VHR satellite image into a binary map that encodes the locations of informal settlements.
Submitted 30 May, 2019; v1 submitted 3 January, 2019;
originally announced January 2019.
-
Mapping Informal Settlements in Developing Countries with Multi-resolution, Multi-spectral Data
Authors:
Patrick Helber,
Bradley Gram-Hansen,
Indhu Varatharajan,
Faiza Azam,
Alejandro Coca-Castro,
Veronika Kopackova,
Piotr Bilinski
Abstract:
Detecting and mapping informal settlements encompasses several of the United Nations sustainable development goals. This is because informal settlements are home to the most socially and economically vulnerable people on the planet. Thus, understanding where these settlements are is of paramount importance to both government and non-government organizations (NGOs), such as the United Nations Children's Fund (UNICEF), who can use this information to deliver effective social and economic aid. We propose two effective methods for detecting and mapping the locations of informal settlements. One uses only low-resolution (LR), freely available, Sentinel-2 multispectral satellite imagery with noisy annotations, whilst the other is a deep learning approach that uses only costly very-high-resolution (VHR) satellite imagery. To our knowledge, we are the first to map informal settlements successfully with low-resolution satellite imagery. We extensively evaluate and compare the proposed methods. Please find additional material at https://frontierdevelopmentlab.github.io/informal-settlements/.
Submitted 30 November, 2018;
originally announced December 2018.
-
Generating Material Maps to Map Informal Settlements
Authors:
Patrick Helber,
Bradley Gram-Hansen,
Indhu Varatharajan,
Faiza Azam,
Alejandro Coca-Castro,
Veronika Kopackova,
Piotr Bilinski
Abstract:
Detecting and mapping informal settlements encompasses several of the United Nations sustainable development goals. This is because informal settlements are home to the most socially and economically vulnerable people on the planet. Thus, understanding where these settlements are is of paramount importance to both government and non-government organizations (NGOs), such as the United Nations Children's Fund (UNICEF), who can use this information to deliver effective social and economic aid. We propose a method that detects and maps the locations of informal settlements using only freely available, Sentinel-2 low-resolution satellite spectral data and socio-economic data. This is in contrast to previous studies that only use costly very-high resolution (VHR) satellite and aerial imagery. We show how we can detect informal settlements by combining both domain knowledge and machine learning techniques, to build a classifier that looks for known roofing materials used in informal settlements. Please find additional material at https://frontierdevelopmentlab.github.io/informal-settlements/.
Submitted 30 May, 2019; v1 submitted 30 November, 2018;
originally announced December 2018.
-
Efficient Probabilistic Inference in the Quest for Physics Beyond the Standard Model
Authors:
Atılım Güneş Baydin,
Lukas Heinrich,
Wahid Bhimji,
Lei Shao,
Saeid Naderiparizi,
Andreas Munk,
Jialin Liu,
Bradley Gram-Hansen,
Gilles Louppe,
Lawrence Meadows,
Philip Torr,
Victor Lee,
Prabhat,
Kyle Cranmer,
Frank Wood
Abstract:
We present a novel probabilistic programming framework that couples directly to existing large-scale simulators through a cross-platform probabilistic execution protocol, which allows general-purpose inference engines to record and control random number draws within simulators in a language-agnostic way. The execution of existing simulators as probabilistic programs enables highly interpretable posterior inference in the structured model defined by the simulator code base. We demonstrate the technique in particle physics, on a scientifically accurate simulation of the tau lepton decay, which is a key ingredient in establishing the properties of the Higgs boson. Inference efficiency is achieved via inference compilation where a deep recurrent neural network is trained to parameterize proposal distributions and control the stochastic simulator in a sequential importance sampling scheme, at a fraction of the computational cost of a Markov chain Monte Carlo baseline.
Submitted 17 February, 2020; v1 submitted 20 July, 2018;
originally announced July 2018.
-
Hamiltonian Monte Carlo for Probabilistic Programs with Discontinuities
Authors:
Bradley Gram-Hansen,
Yuan Zhou,
Tobias Kohn,
Tom Rainforth,
Hongseok Yang,
Frank Wood
Abstract:
Hamiltonian Monte Carlo (HMC) is arguably the dominant statistical inference algorithm used in most popular "first-order differentiable" Probabilistic Programming Languages (PPLs). However, the fact that HMC uses derivative information causes complications when the target distribution is non-differentiable with respect to one or more of the latent variables. In this paper, we show how to use extensions to HMC to perform inference in probabilistic programs that contain discontinuities. To do this, we design a Simple first-order Probabilistic Programming Language (SPPL) that contains a sufficient set of language restrictions together with a compilation scheme. This enables us to preserve both the statistical and syntactic interpretation of if-else statements in the probabilistic program, within the scope of first-order PPLs. We also provide a corresponding mathematical formalism that ensures any joint density denoted in such a language has a suitably low measure of discontinuities.
Submitted 2 January, 2019; v1 submitted 7 April, 2018;
originally announced April 2018.