-
VADA: a Data-Driven Simulator for Nanopore Sequencing
Authors:
Jonas Niederle,
Simon Koop,
Marc Pagès-Gallego,
Vlado Menkovski
Abstract:
Nanopore sequencing offers the ability for real-time analysis of long DNA sequences at a low cost, enabling new applications such as early detection of cancer. Due to the complex nature of nanopore measurements and the high cost of obtaining ground truth datasets, there is a need for nanopore simulators. Existing simulators rely on handcrafted rules and parameters and do not learn an internal representation that would allow for analysing underlying biological factors of interest. Instead, we propose VADA, a purely data-driven method for simulating nanopores based on an autoregressive latent variable model. We embed subsequences of DNA and introduce a conditional prior to address the challenge of a collapsing conditioning. We introduce an auxiliary regressor on the latent variable to encourage our model to learn an informative latent representation. We empirically demonstrate that our model achieves competitive simulation performance on experimental nanopore data. Moreover, we show we have learned an informative latent representation that is predictive of the DNA labels. We hypothesize that other biological factors of interest, beyond the DNA labels, can potentially be extracted from such a learned latent representation.
Submitted 26 June, 2024; v1 submitted 12 April, 2024;
originally announced April 2024.
-
Equivariant Neural Simulators for Stochastic Spatiotemporal Dynamics
Authors:
Koen Minartz,
Yoeri Poels,
Simon Koop,
Vlado Menkovski
Abstract:
Neural networks are emerging as a tool for scalable data-driven simulation of high-dimensional dynamical systems, especially in settings where numerical methods are infeasible or computationally expensive. Notably, it has been shown that incorporating domain symmetries in deterministic neural simulators can substantially improve their accuracy, sample efficiency, and parameter efficiency. However, to incorporate symmetries in probabilistic neural simulators that can simulate stochastic phenomena, we need a model that produces equivariant distributions over trajectories, rather than equivariant function approximations. In this paper, we propose Equivariant Probabilistic Neural Simulation (EPNS), a framework for autoregressive probabilistic modeling of equivariant distributions over system evolutions. We use EPNS to design models for a stochastic n-body system and stochastic cellular dynamics. Our results show that EPNS considerably outperforms existing neural network-based methods for probabilistic simulation. More specifically, we demonstrate that incorporating equivariance in EPNS improves simulation quality, data efficiency, rollout stability, and uncertainty quantification. We conclude that EPNS is a promising method for efficient and effective data-driven probabilistic simulation in a diverse range of domains.
Submitted 30 October, 2023; v1 submitted 23 May, 2023;
originally announced May 2023.
-
Neural Langevin Dynamics: towards interpretable Neural Stochastic Differential Equations
Authors:
Simon M. Koop,
Mark A. Peletier,
Jacobus W. Portegies,
Vlado Menkovski
Abstract:
Neural Stochastic Differential Equations (NSDE) have been trained as both Variational Autoencoders, and as GANs. However, the resulting Stochastic Differential Equations can be hard to interpret or analyse due to the generic nature of the drift and diffusion fields. By restricting our NSDE to be of the form of Langevin dynamics, and training it as a VAE, we obtain NSDEs that lend themselves to more elaborate analysis and to a wider range of visualisation techniques than a generic NSDE. More specifically, we obtain an energy landscape, the minima of which are in one-to-one correspondence with latent states underlying the used data. This not only allows us to detect states underlying the data dynamics in an unsupervised manner, but also to infer the distribution of time spent in each state according to the learned SDE. More in general, restricting an NSDE to Langevin dynamics enables the use of a large set of tools from computational molecular dynamics for the analysis of the obtained results.
Submitted 17 November, 2022;
originally announced November 2022.
-
New Young Star Candidates in BRC 27 and BRC 34
Authors:
L. M. Rebull,
C. H. Johnson,
J. C. Gibbs,
M. Linahan,
D. Sartore,
R. Laher,
M. Legassie,
J. D. Armstrong,
L. E. Allen,
P. McGehee,
D. L. Padgett,
S. Aryal,
K. S. Badura,
T. S. Canakapalli,
S. Carlson,
M. Clark,
N. Ezyk,
J. Fagan,
N. Killingstad,
S. Koop,
T. McCanna,
M. M. Nishida,
T. R. Nuthmann,
A. O'Bryan,
A. Pullinger,
et al. (4 additional authors not shown)
Abstract:
We used archival Spitzer Space Telescope mid-infrared data to search for young stellar objects (YSOs) in the immediate vicinity of two bright-rimmed clouds, BRC 27 (part of CMa R1) and BRC 34 (part of the IC 1396 complex). These regions both appear to be actively forming young stars, perhaps triggered by the proximate OB stars. In BRC 27, we find clear infrared excesses around 22 of the 26 YSOs or YSO candidates identified in the literature, and identify 16 new YSO candidates that appear to have IR excesses. In BRC 34, the one literature-identified YSO has an IR excess, and we suggest 13 new YSO candidates in this region, including a new Class I object. Considering the entire ensemble, both BRCs are likely of comparable ages, within the uncertainties of small number statistics and without spectroscopy to confirm or refute the YSO candidates. Similarly, no clear conclusions can yet be drawn about any possible age gradients that may be present across the BRCs.
Submitted 6 November, 2012;
originally announced November 2012.