Search | arXiv e-print repository

Trans-Dimensional Generative Modeling via Jump Diffusion Models

Authors: Andrew Campbell, William Harvey, Christian Weilbach, Valentin De Bortoli, Tom Rainforth, Arnaud Doucet

Abstract: We propose a new class of generative models that naturally handle data of varying dimensionality by jointly modeling the state and dimension of each datapoint. The generative process is formulated as a jump diffusion process that makes jumps between different dimensional spaces. We first define a dimension destroying forward noising process, before deriving the dimension creating time-reversed generative process along with a novel evidence lower bound training objective for learning to approximate it. Simulating our learned approximation to the time-reversed generative process then provides an effective way of sampling data of varying dimensionality by jointly generating state values and dimensions. We demonstrate our approach on molecular and video datasets of varying dimensionality, reporting better compatibility with test-time diffusion guidance imputation tasks and improved interpolation capabilities versus fixed dimensional models that generate state values and dimensions separately.

Submitted 30 October, 2023; v1 submitted 25 May, 2023; originally announced May 2023.

Comments: 41 pages, 11 figures, 8 tables; NeurIPS 2023

arXiv:2010.01274 [pdf, other]

Assisting the Adversary to Improve GAN Training

Authors: Andreas Munk, William Harvey, Frank Wood

Abstract: Some of the most popular methods for improving the stability and performance of GANs involve constraining or regularizing the discriminator. In this paper we consider a largely overlooked regularization technique which we refer to as the Adversary's Assistant (AdvAs). We motivate this using a different perspective to that of prior work. Specifically, we consider a common mismatch between theoretical analysis and practice: analysis often assumes that the discriminator reaches its optimum on each iteration. In practice, this is essentially never true, often leading to poor gradient estimates for the generator. To address this, AdvAs is a theoretically motivated penalty imposed on the generator based on the norm of the gradients used to train the discriminator. This encourages the generator to move towards points where the discriminator is optimal. We demonstrate the effect of applying AdvAs to several GAN objectives, datasets and network architectures. The results indicate a reduction in the mismatch between theory and practice and that AdvAs can lead to improvement of GAN training, as measured by FID scores.

Submitted 8 December, 2020; v1 submitted 3 October, 2020; originally announced October 2020.

arXiv:2003.13221 [pdf, other]

doi 10.3389/frai.2021.550603

Planning as Inference in Epidemiological Models

Authors: Frank Wood, Andrew Warrington, Saeid Naderiparizi, Christian Weilbach, Vaden Masrani, William Harvey, Adam Scibior, Boyan Beronov, John Grefenstette, Duncan Campbell, Ali Nasseri

Abstract: In this work we demonstrate how to automate parts of the infectious disease-control policy-making process via performing inference in existing epidemiological models. The kind of inference tasks undertaken include computing the posterior distribution over controllable, via direct policy-making choices, simulation model parameters that give rise to acceptable disease progression outcomes. Among other things, we illustrate the use of a probabilistic programming language that automates inference in existing simulators. Neither the full capabilities of this tool for automating inference nor its utility for planning is widely disseminated at the current time. Timely gains in understanding about how such simulation-based models and inference automation tools applied in support of policymaking could lead to less economically damaging policy prescriptions, particularly during the current COVID-19 pandemic.

Submitted 15 September, 2021; v1 submitted 30 March, 2020; originally announced March 2020.

Comments: Revisions

Journal ref: Front Artif Intell. 2021; 4: 550603

arXiv:1910.11961 [pdf, other]

Attention for Inference Compilation

Authors: William Harvey, Andreas Munk, Atılım Güneş Baydin, Alexander Bergholm, Frank Wood

Abstract: We present a new approach to automatic amortized inference in universal probabilistic programs which improves performance compared to current methods. Our approach is a variation of inference compilation (IC) which leverages deep neural networks to approximate a posterior distribution over latent variables in a probabilistic program. A challenge with existing IC network architectures is that they can fail to model long-range dependencies between latent variables. To address this, we introduce an attention mechanism that attends to the most salient variables previously sampled in the execution of a probabilistic program. We demonstrate that the addition of attention allows the proposal distributions to better match the true posterior, enhancing inference about latent variables in simulators.

Submitted 25 October, 2019; originally announced October 2019.

arXiv:1906.05462 [pdf, other]

Near-Optimal Glimpse Sequences for Improved Hard Attention Neural Network Training

Authors: William Harvey, Michael Teng, Frank Wood

Abstract: Hard visual attention is a promising approach to reduce the computational burden of modern computer vision methodologies. Hard attention mechanisms are typically non-differentiable. They can be trained with reinforcement learning but the high-variance training this entails hinders more widespread application. We show how hard attention for image classification can be framed as a Bayesian optimal experimental design (BOED) problem. From this perspective, the optimal locations to attend to are those which provide the greatest expected reduction in the entropy of the classification distribution. We introduce methodology from the BOED literature to approximate this optimal behaviour, and use it to generate `near-optimal' sequences of attention locations. We then show how to use such sequences to partially supervise, and therefore speed up, the training of a hard attention mechanism. Although generating these sequences is computationally expensive, they can be reused by any other networks later trained on the same task.

Submitted 14 June, 2020; v1 submitted 12 June, 2019; originally announced June 2019.

Comments: 11 pages, 6 figures + appendix with 9 pages, 7 figures. Submitted to NeurIPS 2020

arXiv:1710.06366 [pdf, other]

doi 10.1111/rssc.12338

Improving the identification of antigenic sites in the H1N1 Influenza virus through accounting for the experimental structure in a sparse hierarchical Bayesian model

Authors: Vinny Davies, William T. Harvey, Richard Reeve, Dirk Husmeier

Abstract: Understanding how genetic changes allow emerging virus strains to escape the protection afforded by vaccination is vital for the maintenance of effective vaccines. In the current work, we use structural and phylogenetic differences between pairs of virus strains to identify important antigenic sites on the surface of the influenza A(H1N1) virus through the prediction of haemagglutination inhibition (HI) assay, pairwise measures of the antigenic similarity of virus strains. We propose a sparse hierarchical Bayesian model that can deal with the pairwise structure and inherent experimental variability in the H1N1 data through the introduction of latent variables. The latent variables represent the underlying HI assay measurement of any given pair of virus strains and help account for the fact that for any HI assay measurement between the same pair of virus strains, the difference in the viral sequence remains the same. Through accurately representing the structure of the H1N1 data, the model is able to select virus sites which are antigenic, while its latent structure achieves the computational efficiency required to deal with large virus sequence data, as typically available for the influenza virus. In addition to the latent variable model, we also propose a new method, block integrated Widely Applicable Information Criterion (biWAIC), for selecting between competing models. We show how this allows us to effectively select the random effects when used with the proposed model and apply both methods to an A(H1N1) dataset.

Submitted 6 October, 2017; originally announced October 2017.

Journal ref: J. R. Stat. Soc. C (2019)

Showing 1–6 of 6 results for author: Harvey, W