-
Generative Flows on Discrete State-Spaces: Enabling Multimodal Flows with Applications to Protein Co-Design
Authors:
Andrew Campbell,
Jason Yim,
Regina Barzilay,
Tom Rainforth,
Tommi Jaakkola
Abstract:
Combining discrete and continuous data is an important capability for generative models. We present Discrete Flow Models (DFMs), a new flow-based model of discrete data that provides the missing link in enabling flow-based generative models to be applied to multimodal continuous and discrete data problems. Our key insight is that the discrete equivalent of continuous space flow matching can be realized using Continuous Time Markov Chains. DFMs benefit from a simple derivation that includes discrete diffusion models as a specific instance while allowing improved performance over existing diffusion-based approaches. We utilize our DFMs method to build a multimodal flow-based modeling framework. We apply this capability to the task of protein co-design, wherein we learn a model for jointly generating protein structure and sequence. Our approach achieves state-of-the-art co-design performance while allowing the same multimodal model to be used for flexible generation of the sequence or structure.
Submitted 5 June, 2024; v1 submitted 7 February, 2024;
originally announced February 2024.
-
Improved motif-scaffolding with SE(3) flow matching
Authors:
Jason Yim,
Andrew Campbell,
Emile Mathieu,
Andrew Y. K. Foong,
Michael Gastegger,
José Jiménez-Luna,
Sarah Lewis,
Victor Garcia Satorras,
Bastiaan S. Veeling,
Frank Noé,
Regina Barzilay,
Tommi S. Jaakkola
Abstract:
Protein design often begins with knowledge of a desired function from a motif which motif-scaffolding aims to construct a functional protein around. Recently, generative models have achieved breakthrough success in designing scaffolds for a diverse range of motifs. However, the generated scaffolds tend to lack structural diversity, which can hinder success in wet-lab validation. In this work, we extend FrameFlow, an SE(3) flow matching model for protein backbone generation, to perform motif-scaffolding with two complementary approaches. The first is motif amortization, in which FrameFlow is trained with the motif as input using a data augmentation strategy. The second is motif guidance, which performs scaffolding using an estimate of the conditional score from FrameFlow, and requires no additional training. Both approaches achieve an equivalent or higher success rate than previous state-of-the-art methods, with 2.5 times more structurally diverse scaffolds. Code: https://github.com/microsoft/frame-flow.
Submitted 8 January, 2024;
originally announced January 2024.
-
Trans-Dimensional Generative Modeling via Jump Diffusion Models
Authors:
Andrew Campbell,
William Harvey,
Christian Weilbach,
Valentin De Bortoli,
Tom Rainforth,
Arnaud Doucet
Abstract:
We propose a new class of generative models that naturally handle data of varying dimensionality by jointly modeling the state and dimension of each datapoint. The generative process is formulated as a jump diffusion process that makes jumps between different dimensional spaces. We first define a dimension destroying forward noising process, before deriving the dimension creating time-reversed generative process along with a novel evidence lower bound training objective for learning to approximate it. Simulating our learned approximation to the time-reversed generative process then provides an effective way of sampling data of varying dimensionality by jointly generating state values and dimensions. We demonstrate our approach on molecular and video datasets of varying dimensionality, reporting better compatibility with test-time diffusion guidance imputation tasks and improved interpolation capabilities versus fixed dimensional models that generate state values and dimensions separately.
Submitted 30 October, 2023; v1 submitted 25 May, 2023;
originally announced May 2023.
-
Diffusion Schrödinger Bridge Matching
Authors:
Yuyang Shi,
Valentin De Bortoli,
Andrew Campbell,
Arnaud Doucet
Abstract:
Solving transport problems, i.e. finding a map transporting one given distribution to another, has numerous applications in machine learning. Novel mass transport methods motivated by generative modeling have recently been proposed, e.g. Denoising Diffusion Models (DDMs) and Flow Matching Models (FMMs) implement such a transport through a Stochastic Differential Equation (SDE) or an Ordinary Differential Equation (ODE). However, while it is desirable in many applications to approximate the deterministic dynamic Optimal Transport (OT) map which admits attractive properties, DDMs and FMMs are not guaranteed to provide transports close to the OT map. In contrast, Schrödinger bridges (SBs) compute stochastic dynamic mappings which recover entropy-regularized versions of OT. Unfortunately, existing numerical methods approximating SBs either scale poorly with dimension or accumulate errors across iterations. In this work, we introduce Iterative Markovian Fitting (IMF), a new methodology for solving SB problems, and Diffusion Schrödinger Bridge Matching (DSBM), a novel numerical algorithm for computing IMF iterates. DSBM significantly improves over previous SB numerics and recovers as special/limiting cases various recent transport methods. We demonstrate the performance of DSBM on a variety of problems.
Submitted 11 December, 2023; v1 submitted 29 March, 2023;
originally announced March 2023.
-
DBGDGM: Dynamic Brain Graph Deep Generative Model
Authors:
Alexander Campbell,
Simeon Spasov,
Nicola Toschi,
Pietro Lio
Abstract:
Graphs are a natural representation of brain activity derived from functional magnetic resonance imaging (fMRI) data. It is well known that clusters of anatomical brain regions, known as functional connectivity networks (FCNs), encode temporal relationships which can serve as useful biomarkers for understanding brain function and dysfunction. Previous works, however, ignore the temporal dynamics of the brain and focus on static graphs. In this paper, we propose a dynamic brain graph deep generative model (DBGDGM) which simultaneously clusters brain regions into temporally evolving communities and learns dynamic unsupervised node embeddings. Specifically, DBGDGM represents brain graph nodes as embeddings sampled from a distribution over communities that evolve over time. We parameterise this community distribution using neural networks that learn from subject and node embeddings as well as past community assignments. Experiments demonstrate that DBGDGM outperforms baselines in graph generation and dynamic link prediction, and is comparable for graph classification. Finally, an analysis of the learnt community distributions reveals overlap with known FCNs reported in neuroscience literature.
Submitted 26 January, 2023;
originally announced January 2023.
-
DynDepNet: Learning Time-Varying Dependency Structures from fMRI Data via Dynamic Graph Structure Learning
Authors:
Alexander Campbell,
Antonio Giuliano Zippo,
Luca Passamonti,
Nicola Toschi,
Pietro Lio
Abstract:
Graph neural networks (GNNs) have demonstrated success in learning representations of brain graphs derived from functional magnetic resonance imaging (fMRI) data. However, existing GNN methods assume brain graphs are static over time and the graph adjacency matrix is known prior to model training. These assumptions contradict evidence that brain graphs are time-varying with a connectivity structure that depends on the choice of functional connectivity measure. Incorrectly representing fMRI data with noisy brain graphs can adversely affect GNN performance. To address this, we propose DynDepNet, a novel method for learning the optimal time-varying dependency structure of fMRI data induced by downstream prediction tasks. Experiments on real-world fMRI datasets, for the task of sex classification, demonstrate that DynDepNet achieves state-of-the-art results, outperforming the best baseline in terms of accuracy by approximately 8 and 6 percentage points, respectively. Furthermore, analysis of the learned dynamic graphs reveals prediction-related brain regions consistent with existing neuroscience literature.
Submitted 9 July, 2023; v1 submitted 27 September, 2022;
originally announced September 2022.
-
A Continuous Time Framework for Discrete Denoising Models
Authors:
Andrew Campbell,
Joe Benton,
Valentin De Bortoli,
Tom Rainforth,
George Deligiannidis,
Arnaud Doucet
Abstract:
We provide the first complete continuous time framework for denoising diffusion models of discrete data. This is achieved by formulating the forward noising process and corresponding reverse time generative process as Continuous Time Markov Chains (CTMCs). The model can be efficiently trained using a continuous time version of the ELBO. We simulate the high dimensional CTMC using techniques developed in chemical physics and exploit our continuous time framework to derive high performance samplers that we show can outperform discrete time methods for discrete data. The continuous time treatment also enables us to derive a novel theoretical result bounding the error between the generated sample distribution and the true data distribution.
Submitted 14 October, 2022; v1 submitted 30 May, 2022;
originally announced May 2022.
-
Online Variational Filtering and Parameter Learning
Authors:
Andrew Campbell,
Yuyang Shi,
Tom Rainforth,
Arnaud Doucet
Abstract:
We present a variational method for online state estimation and parameter learning in state-space models (SSMs), a ubiquitous class of latent variable models for sequential data. As per standard batch variational techniques, we use stochastic gradients to simultaneously optimize a lower bound on the log evidence with respect to both model parameters and a variational approximation of the states' posterior distribution. However, unlike existing approaches, our method is able to operate in an entirely online manner, such that historic observations do not require revisitation after being incorporated and the cost of updates at each time step remains constant, despite the growing dimensionality of the joint posterior distribution of the states. This is achieved by utilizing backward decompositions of this joint posterior distribution and of its variational approximation, combined with Bellman-type recursions for the evidence lower bound and its gradients. We demonstrate the performance of this methodology across several examples, including high-dimensional SSMs and sequential Variational Auto-Encoders.
Submitted 14 June, 2022; v1 submitted 26 October, 2021;
originally announced October 2021.
-
Markov-Restricted Analysis of Randomized Trials with Non-Monotone Missing Binary Outcomes: Sensitivity Analysis and Identification Results
Authors:
Daniel O. Scharfstein,
Jaron J. R. Lee,
Aidan McDermott,
Aimee Campbell,
Edward Nunes,
Abigail G. Matthews,
Ilya Shpitser
Abstract:
Scharfstein et al. (2021) developed a sensitivity analysis model for analyzing randomized trials with repeatedly measured binary outcomes that are subject to nonmonotone missingness. Their approach becomes computationally intractable when the number of repeated measures is large (e.g., greater than 15). In this paper, we repair this problem by introducing an $m$th-order Markovian restriction. We establish identification by representing the model as a directed acyclic graph (DAG). We illustrate our methodology in the context of a randomized trial designed to evaluate a web-delivered psychosocial intervention to reduce substance use, assessed by testing urine samples twice weekly for 12 weeks, among patients entering outpatient addiction treatment. We evaluate the finite sample properties of our method in a realistic simulation study. Our methods have been integrated into the R package entitled slabm.
Submitted 18 May, 2021;
originally announced May 2021.
-
Transfer Learning for Activity Recognition in Mobile Health
Authors:
Yuchao Ma,
Andrew T. Campbell,
Diane J. Cook,
John Lach,
Shwetak N. Patel,
Thomas Ploetz,
Majid Sarrafzadeh,
Donna Spruijt-Metz,
Hassan Ghasemzadeh
Abstract:
While activity recognition from inertial sensors holds potential for mobile health, differences in sensing platforms and user movement patterns cause performance degradation. Aiming to address these challenges, we propose a transfer learning framework, TransFall, for sensor-based activity recognition. TransFall's design contains a two-tier data transformation, a label estimation layer, and a model generation layer to recognize activities for the new scenario. We validate TransFall analytically and empirically.
Submitted 12 July, 2020;
originally announced July 2020.
-
tvGP-VAE: Tensor-variate Gaussian Process Prior Variational Autoencoder
Authors:
Alex Campbell,
Pietro Liò
Abstract:
Variational autoencoders (VAEs) are a powerful class of deep generative latent variable model for unsupervised representation learning on high-dimensional data. To ensure computational tractability, VAEs are often implemented with a univariate standard Gaussian prior and a mean-field Gaussian variational posterior distribution. This results in vector-valued latent variables that are agnostic to the original data structure, which might be highly correlated across and within multiple dimensions. We propose a tensor-variate extension to the VAE framework, the tensor-variate Gaussian process prior variational autoencoder (tvGP-VAE), which replaces the standard univariate Gaussian prior and posterior distributions with tensor-variate Gaussian processes. The tvGP-VAE is able to explicitly model correlation structures via the use of kernel functions over the dimensions of tensor-valued latent variables. Using spatiotemporally correlated image time series as an example, we show that the choice of which correlation structures to explicitly represent in the latent space has a significant impact on model performance in terms of reconstruction.
Submitted 8 June, 2020;
originally announced June 2020.
-
Parallel Tempering via Simulated Tempering Without Normalizing Constants
Authors:
Biljana Jonoska Stojkova,
David A. Campbell
Abstract:
In this paper we develop a new general Bayesian methodology that simultaneously estimates parameters of interest and the marginal likelihood of the model. The proposed methodology builds on Simulated Tempering, which is a powerful algorithm that enables sampling from multi-modal distributions. However, Simulated Tempering comes with the practical limitation of needing to specify a prior for the temperature along a chosen discretization schedule that will allow calculation of normalizing constants at each temperature. Our proposed model defines the prior for the temperature so as to remove the need for calculating normalizing constants at each temperature and thereby enables a continuous temperature schedule, while preserving the sampling efficiency of the Simulated Tempering algorithm. The resulting algorithm simultaneously estimates parameters while estimating marginal likelihoods through thermodynamic integration. We illustrate the applicability of the new algorithm to different examples involving mixture models of Gaussian distributions and ordinary differential equation models.
Submitted 30 May, 2019;
originally announced May 2019.
-
Incremental Mixture Importance Sampling with Shotgun optimization
Authors:
Biljana Jonoska Stojkova,
David A. Campbell
Abstract:
This paper proposes a general optimization strategy, which combines results from different optimization or parameter estimation methods to overcome shortcomings of a single method. Shotgun optimization is developed as a framework which employs different optimization strategies, criteria, or conditional targets to enable wider likelihood exploration. The introduced Shotgun optimization approach is embedded into an incremental mixture importance sampling algorithm to produce improved posterior samples for multimodal densities and creates robustness in cases where the likelihood and prior are in disagreement. Despite using different optimization approaches, the samples are combined into samples from a single target posterior. The diversity of the framework is demonstrated on parameter estimation from differential equation models employing diverse strategies including numerical solutions and approximations thereof. Additionally the approach is demonstrated on mixtures of discrete and continuous parameters and is shown to ease estimation from synthetic likelihood models. R code of the implemented examples is stored in a zipped archive (codeSubmit.zip).
Submitted 13 November, 2017;
originally announced November 2017.
-
Sequentially Constrained Monte Carlo
Authors:
Shirin Golchi,
David A. Campbell
Abstract:
Constraints can be interpreted in a broad sense as any kind of explicit restriction over the parameters. While some constraints are defined directly on the parameter space, when they are instead defined by known behaviour on the model, transformation of constraints into features on the parameter space may not be possible. Difficulty in sampling from the posterior distribution as a result of incorporating constraints into the model is a common challenge, leading to truncations in the parameter space and inefficient sampling algorithms. We propose a variant of the sequential Monte Carlo algorithm for posterior sampling in the presence of constraints by defining a sequence of densities through the imposition of the constraint. Particles generated from an unconstrained or mildly constrained distribution are filtered and moved through sampling and resampling steps to obtain a sample from the fully constrained target distribution. General and model-specific forms of constraint-enforcing strategies are defined. The Sequentially Constrained Monte Carlo algorithm is demonstrated on constraints defined by monotonicity of a function, densities constrained to low dimensional manifolds, adherence to a theoretically derived model, and model feature matching.
Submitted 25 February, 2015; v1 submitted 29 October, 2014;
originally announced October 2014.
-
Transdimensional Approximate Bayesian Computation for Inference on Invasive Species Models with Latent Variables of Unknown Dimension
Authors:
Oksana A. Chkrebtii,
Erin K. Cameron,
David A. Campbell,
Erin M. Bayne
Abstract:
Accurate information on patterns of introduction and spread of non-native species is essential for making predictions and management decisions. In many cases, estimating unknown rates of introduction and spread from observed data requires evaluating intractable variable-dimensional integrals. In general, inference on the large class of models containing latent variables of large or variable dimension precludes exact sampling techniques. Approximate Bayesian computation (ABC) methods provide an alternative to exact sampling but rely on inefficient conditional simulation of the latent variables. To accomplish this task efficiently, a new transdimensional Monte Carlo sampler is developed for approximate Bayesian model inference and used to estimate rates of introduction and spread for the non-native earthworm species Dendrobaena octaedra (Savigny) along roads in the boreal forest of northern Alberta. Using low and high estimates of introduction and spread rates, the extent of earthworm invasions in northeastern Alberta was simulated to project the proportion of suitable habitat invaded in the year following data collection.
Submitted 30 December, 2014; v1 submitted 10 October, 2013;
originally announced October 2013.
-
Monotone Function Estimation for Computer Experiments
Authors:
Shirin Golchi,
Derek R. Bingham,
Hugh Chipman,
David A. Campbell
Abstract:
In statistical modeling of computer experiments sometimes prior information is available about the underlying function. For example, the physical system simulated by the computer code may be known to be monotone with respect to some or all inputs. We develop a Bayesian approach to Gaussian process modelling capable of incorporating monotonicity information for computer model emulation. Markov chain Monte Carlo methods are used to sample from the posterior distribution of the process given the simulator output and monotonicity information. The performance of the proposed approach in terms of predictive accuracy and uncertainty quantification is demonstrated in a number of simulated examples as well as a real queueing system application.
Submitted 14 June, 2014; v1 submitted 15 September, 2013;
originally announced September 2013.
-
Bayesian Solution Uncertainty Quantification for Differential Equations
Authors:
Oksana A. Chkrebtii,
David A. Campbell,
Ben Calderhead,
Mark A. Girolami
Abstract:
We explore probability modelling of discretization uncertainty for system states defined implicitly by ordinary or partial differential equations. Accounting for this uncertainty can avoid posterior under-coverage when likelihoods are constructed from a coarsely discretized approximation to system equations. A formalism is proposed for inferring a fixed but a priori unknown model trajectory through Bayesian updating of a prior process conditional on model information. A one-step-ahead sampling scheme for interrogating the model is described, its consistency and first order convergence properties are proved, and its computational complexity is shown to be proportional to that of numerical explicit one-step solvers. Examples illustrate the flexibility of this framework to deal with a wide variety of complex and large-scale systems. Within the calibration problem, discretization uncertainty defines a layer in the Bayesian hierarchy, and a Markov chain Monte Carlo algorithm that targets this posterior distribution is presented. This formalism is used for inference on the JAK-STAT delay differential equation model of protein dynamics from indirectly observed measurements. The discussion outlines implications for the new field of probabilistic numerics.
Submitted 23 October, 2016; v1 submitted 10 June, 2013;
originally announced June 2013.