Search | arXiv e-print repository

Transferable Boltzmann Generators

Abstract: The generation of equilibrium samples of molecular systems has been a long-standing problem in statistical physics. Boltzmann Generators are a generative machine learning method that addresses this issue by learning a transformation via a normalizing flow from a simple prior distribution to the target Boltzmann distribution of interest. Recently, flow matching has been employed to train Boltzmann… ▽ More The generation of equilibrium samples of molecular systems has been a long-standing problem in statistical physics. Boltzmann Generators are a generative machine learning method that addresses this issue by learning a transformation via a normalizing flow from a simple prior distribution to the target Boltzmann distribution of interest. Recently, flow matching has been employed to train Boltzmann Generators for small molecular systems in Cartesian coordinates. We extend this work and propose a first framework for Boltzmann Generators that are transferable across chemical space, such that they predict zero-shot Boltzmann distributions for test molecules without being retrained for these systems. These transferable Boltzmann Generators allow approximate sampling from the target distribution of unseen systems, as well as efficient reweighting to the target Boltzmann distribution. The transferability of the proposed framework is evaluated on dipeptides, where we show that it generalizes efficiently to unseen systems. Furthermore, we demonstrate that our proposed architecture enhances the efficiency of Boltzmann Generators trained on single molecular systems. △ Less

Submitted 20 June, 2024; originally announced June 2024.

arXiv:2401.04082 [pdf, other]

Improved motif-scaffolding with SE(3) flow matching

Authors: Jason Yim, Andrew Campbell, Emile Mathieu, Andrew Y. K. Foong, Michael Gastegger, José Jiménez-Luna, Sarah Lewis, Victor Garcia Satorras, Bastiaan S. Veeling, Frank Noé, Regina Barzilay, Tommi S. Jaakkola

Abstract: Protein design often begins with knowledge of a desired function from a motif which motif-scaffolding aims to construct a functional protein around. Recently, generative models have achieved breakthrough success in designing scaffolds for a diverse range of motifs. However, the generated scaffolds tend to lack structural diversity, which can hinder success in wet-lab validation. In this work, we e… ▽ More Protein design often begins with knowledge of a desired function from a motif which motif-scaffolding aims to construct a functional protein around. Recently, generative models have achieved breakthrough success in designing scaffolds for a diverse range of motifs. However, the generated scaffolds tend to lack structural diversity, which can hinder success in wet-lab validation. In this work, we extend FrameFlow, an SE(3) flow matching model for protein backbone generation, to perform motif-scaffolding with two complementary approaches. The first is motif amortization, in which FrameFlow is trained with the motif as input using a data augmentation strategy. The second is motif guidance, which performs scaffolding using an estimate of the conditional score from FrameFlow, and requires no additional training. Both approaches achieve an equivalent or higher success rate than previous state-of-the-art methods, with 2.5 times more structurally diverse scaffolds. Code: https://github.com/ microsoft/frame-flow. △ Less

Submitted 8 January, 2024; originally announced January 2024.

Comments: Preprint. Code: https://github.com/ microsoft/frame-flow

arXiv:2310.18278 [pdf, other]

Navigating protein landscapes with a machine-learned transferable coarse-grained model

Authors: Nicholas E. Charron, Felix Musil, Andrea Guljas, Yaoyi Chen, Klara Bonneau, Aldo S. Pasos-Trejo, Jacopo Venturin, Daria Gusew, Iryna Zaporozhets, Andreas Krämer, Clark Templeton, Atharva Kelkar, Aleksander E. P. Durumeric, Simon Olsson, Adrià Pérez, Maciej Majewski, Brooke E. Husic, Ankit Patel, Gianni De Fabritiis, Frank Noé, Cecilia Clementi

Abstract: The most popular and universally predictive protein simulation models employ all-atom molecular dynamics (MD), but they come at extreme computational cost. The development of a universal, computationally efficient coarse-grained (CG) model with similar prediction performance has been a long-standing challenge. By combining recent deep learning methods with a large and diverse training set of all-a… ▽ More The most popular and universally predictive protein simulation models employ all-atom molecular dynamics (MD), but they come at extreme computational cost. The development of a universal, computationally efficient coarse-grained (CG) model with similar prediction performance has been a long-standing challenge. By combining recent deep learning methods with a large and diverse training set of all-atom protein simulations, we here develop a bottom-up CG force field with chemical transferability, which can be used for extrapolative molecular dynamics on new sequences not used during model parametrization. We demonstrate that the model successfully predicts folded structures, intermediates, metastable folded and unfolded basins, and the fluctuations of intrinsically disordered proteins while it is several orders of magnitude faster than an all-atom model. This showcases the feasibility of a universal and computationally efficient machine-learned CG model for proteins. △ Less

Submitted 27 October, 2023; originally announced October 2023.

arXiv:2309.05878 [pdf, other]

Reaction coordinate flows for model reduction of molecular kinetics

Authors: Hao Wu, Frank Noé

Abstract: In this work, we introduce a flow based machine learning approach, called reaction coordinate (RC) flow, for discovery of low-dimensional kinetic models of molecular systems. The RC flow utilizes a normalizing flow to design the coordinate transformation and a Brownian dynamics model to approximate the kinetics of RC, where all model parameters can be estimated in a data-driven manner. In contrast… ▽ More In this work, we introduce a flow based machine learning approach, called reaction coordinate (RC) flow, for discovery of low-dimensional kinetic models of molecular systems. The RC flow utilizes a normalizing flow to design the coordinate transformation and a Brownian dynamics model to approximate the kinetics of RC, where all model parameters can be estimated in a data-driven manner. In contrast to existing model reduction methods for molecular kinetics, RC flow offers a trainable and tractable model of reduced kinetics in continuous time and space due to the invertibility of the normalizing flow. Furthermore, the Brownian dynamics-based reduced kinetic model investigated in this work yields a readily discernible representation of metastable states within the phase space of the molecular system. Numerical experiments demonstrate how effectively the proposed method discovers interpretable and accurate low-dimensional representations of given full-state kinetics from simulations. △ Less

Submitted 11 September, 2023; originally announced September 2023.

arXiv:2306.15030 [pdf, other]

Equivariant flow matching

Authors: Leon Klein, Andreas Krämer, Frank Noé

Abstract: Normalizing flows are a class of deep generative models that are especially interesting for modeling probability distributions in physics, where the exact likelihood of flows allows reweighting to known target energy functions and computing unbiased observables. For instance, Boltzmann generators tackle the long-standing sampling problem in statistical physics by training flows to produce equilibr… ▽ More Normalizing flows are a class of deep generative models that are especially interesting for modeling probability distributions in physics, where the exact likelihood of flows allows reweighting to known target energy functions and computing unbiased observables. For instance, Boltzmann generators tackle the long-standing sampling problem in statistical physics by training flows to produce equilibrium samples of many-body systems such as small molecules and proteins. To build effective models for such systems, it is crucial to incorporate the symmetries of the target energy into the model, which can be achieved by equivariant continuous normalizing flows (CNFs). However, CNFs can be computationally expensive to train and generate samples from, which has hampered their scalability and practical application. In this paper, we introduce equivariant flow matching, a new training objective for equivariant CNFs that is based on the recently proposed optimal transport flow matching. Equivariant flow matching exploits the physical symmetries of the target energy for efficient, simulation-free training of equivariant CNFs. We demonstrate the effectiveness of flow matching on rotation and permutation invariant many-particle systems and a small molecule, alanine dipeptide, where for the first time we obtain a Boltzmann generator with significant sampling efficiency without relying on tailored internal coordinate featurization. Our results show that the equivariant flow matching objective yields flows with shorter integration paths, improved sampling efficiency, and higher scalability compared to existing methods. △ Less

Submitted 23 November, 2023; v1 submitted 26 June, 2023; originally announced June 2023.

arXiv:2302.07071 [pdf, other]

Statistically Optimal Force Aggregation for Coarse-Graining Molecular Dynamics

Authors: Andreas Krämer, Aleksander P. Durumeric, Nicholas E. Charron, Yaoyi Chen, Cecilia Clementi, Frank Noé

Abstract: Machine-learned coarse-grained (CG) models have the potential for simulating large molecular complexes beyond what is possible with atomistic molecular dynamics. However, training accurate CG models remains a challenge. A widely used methodology for learning CG force-fields maps forces from all-atom molecular dynamics to the CG representation and matches them with a CG force-field on average. We s… ▽ More Machine-learned coarse-grained (CG) models have the potential for simulating large molecular complexes beyond what is possible with atomistic molecular dynamics. However, training accurate CG models remains a challenge. A widely used methodology for learning CG force-fields maps forces from all-atom molecular dynamics to the CG representation and matches them with a CG force-field on average. We show that there is flexibility in how to map all-atom forces to the CG representation, and that the most commonly used map** methods are statistically inefficient and potentially even incorrect in the presence of constraints in the all-atom simulation. We define an optimization statement for force map**s and demonstrate that substantially improved CG force-fields can be learned from the same simulation data when using optimized force maps. The method is demonstrated on the miniproteins Chignolin and Tryptophan Cage and published as open-source code. △ Less

Submitted 14 February, 2023; originally announced February 2023.

Comments: 44 pages, 19 figures

arXiv:2302.01170 [pdf, other]

Timewarp: Transferable Acceleration of Molecular Dynamics by Learning Time-Coarsened Dynamics

Authors: Leon Klein, Andrew Y. K. Foong, Tor Erlend Fjelde, Bruno Mlodozeniec, Marc Brockschmidt, Sebastian Nowozin, Frank Noé, Ryota Tomioka

Abstract: Molecular dynamics (MD) simulation is a widely used technique to simulate molecular systems, most commonly at the all-atom resolution where equations of motion are integrated with timesteps on the order of femtoseconds ($1\textrm{fs}=10^{-15}\textrm{s}$). MD is often used to compute equilibrium properties, which requires sampling from an equilibrium distribution such as the Boltzmann distribution.… ▽ More Molecular dynamics (MD) simulation is a widely used technique to simulate molecular systems, most commonly at the all-atom resolution where equations of motion are integrated with timesteps on the order of femtoseconds ($1\textrm{fs}=10^{-15}\textrm{s}$). MD is often used to compute equilibrium properties, which requires sampling from an equilibrium distribution such as the Boltzmann distribution. However, many important processes, such as binding and folding, occur over timescales of milliseconds or beyond, and cannot be efficiently sampled with conventional MD. Furthermore, new MD simulations need to be performed for each molecular system studied. We present Timewarp, an enhanced sampling method which uses a normalising flow as a proposal distribution in a Markov chain Monte Carlo method targeting the Boltzmann distribution. The flow is trained offline on MD trajectories and learns to make large steps in time, simulating the molecular dynamics of $10^{5} - 10^{6}\:\textrm{fs}$. Crucially, Timewarp is transferable between molecular systems: once trained, we show that it generalises to unseen small peptides (2-4 amino acids) at all-atom resolution, exploring their metastable states and providing wall-clock acceleration of sampling compared to standard MD. Our method constitutes an important step towards general, transferable algorithms for accelerating MD. △ Less

Submitted 1 December, 2023; v1 submitted 2 February, 2023; originally announced February 2023.

arXiv:2301.11355 [pdf, other]

Rigid Body Flows for Sampling Molecular Crystal Structures

Authors: Jonas Köhler, Michele Invernizzi, Pim de Haan, Frank Noé

Abstract: Normalizing flows (NF) are a class of powerful generative models that have gained popularity in recent years due to their ability to model complex distributions with high flexibility and expressiveness. In this work, we introduce a new type of normalizing flow that is tailored for modeling positions and orientations of multiple objects in three-dimensional space, such as molecules in a crystal. Ou… ▽ More Normalizing flows (NF) are a class of powerful generative models that have gained popularity in recent years due to their ability to model complex distributions with high flexibility and expressiveness. In this work, we introduce a new type of normalizing flow that is tailored for modeling positions and orientations of multiple objects in three-dimensional space, such as molecules in a crystal. Our approach is based on two key ideas: first, we define smooth and expressive flows on the group of unit quaternions, which allows us to capture the continuous rotational motion of rigid bodies; second, we use the double cover property of unit quaternions to define a proper density on the rotation group. This ensures that our model can be trained using standard likelihood-based methods or variational inference with respect to a thermodynamic target density. We evaluate the method by training Boltzmann generators for two molecular examples, namely the multi-modal density of a tetrahedral system in an external field and the ice XI phase in the TIP4P water model. Our flows can be combined with flows operating on the internal degrees of freedom of molecules and constitute an important step towards the modeling of distributions of many interacting molecules. △ Less

Submitted 7 June, 2023; v1 submitted 26 January, 2023; originally announced January 2023.

Comments: International Conference on Machine Learning, 2023

arXiv:2208.12590 [pdf, other]

doi 10.1038/s41570-023-00516-8

Ab-initio quantum chemistry with neural-network wavefunctions

Authors: Jan Hermann, James Spencer, Kenny Choo, Antonio Mezzacapo, W. M. C. Foulkes, David Pfau, Giuseppe Carleo, Frank Noé

Abstract: Machine learning and specifically deep-learning methods have outperformed human capabilities in many pattern recognition and data processing problems, in game playing, and now also play an increasingly important role in scientific discovery. A key application of machine learning in the molecular sciences is to learn potential energy surfaces or force fields from ab-initio solutions of the electron… ▽ More Machine learning and specifically deep-learning methods have outperformed human capabilities in many pattern recognition and data processing problems, in game playing, and now also play an increasingly important role in scientific discovery. A key application of machine learning in the molecular sciences is to learn potential energy surfaces or force fields from ab-initio solutions of the electronic Schrödinger equation using datasets obtained with density functional theory, coupled cluster, or other quantum chemistry methods. Here we review a recent and complementary approach: using machine learning to aid the direct solution of quantum chemistry problems from first principles. Specifically, we focus on quantum Monte Carlo (QMC) methods that use neural network ansatz functions in order to solve the electronic Schrödinger equation, both in first and second quantization, computing ground and excited states, and generalizing over multiple nuclear configurations. Compared to existing quantum chemistry methods, these new deep QMC methods have the potential to generate highly accurate solutions of the Schrödinger equation at relatively modest computational cost. △ Less

Submitted 26 August, 2022; originally announced August 2022.

Comments: review, 17 pages, 6 figures

Journal ref: Nat Rev Chem 7, 692-709 (2023)

arXiv:2203.09472 [pdf, other]

doi 10.1038/s41467-022-35534-5

Electronic excited states in deep variational Monte Carlo

Authors: Mike Entwistle, Zeno Schätzle, Paolo A. Erdman, Jan Hermann, Frank Noé

Abstract: Obtaining accurate ground and low-lying excited states of electronic systems is crucial in a multitude of important applications. One ab initio method for solving the Schrödinger equation that scales favorably for large systems is variational quantum Monte Carlo (QMC). The recently introduced deep QMC approach uses ansatzes represented by deep neural networks and generates nearly exact ground-stat… ▽ More Obtaining accurate ground and low-lying excited states of electronic systems is crucial in a multitude of important applications. One ab initio method for solving the Schrödinger equation that scales favorably for large systems is variational quantum Monte Carlo (QMC). The recently introduced deep QMC approach uses ansatzes represented by deep neural networks and generates nearly exact ground-state solutions for molecules containing up to a few dozen electrons, with the potential to scale to much larger systems where other highly accurate methods are not feasible. In this paper, we extend one such ansatz (PauliNet) to compute electronic excited states. We demonstrate our method on various small atoms and molecules and consistently achieve high accuracy for low-lying states. To highlight the method's potential, we compute the first excited state of the much larger benzene molecule, as well as the conical intersection of ethylene, with PauliNet matching results of more expensive high-level methods. △ Less

Submitted 18 January, 2023; v1 submitted 17 March, 2022; originally announced March 2022.

arXiv:2110.15013 [pdf, other]

doi 10.1088/2632-2153/ac3de0

Deeptime: a Python library for machine learning dynamical models from time series data

Authors: Moritz Hoffmann, Martin Scherer, Tim Hempel, Andreas Mardt, Brian de Silva, Brooke E. Husic, Stefan Klus, Hao Wu, Nathan Kutz, Steven L. Brunton, Frank Noé

Abstract: Generation and analysis of time-series data is relevant to many quantitative fields ranging from economics to fluid mechanics. In the physical sciences, structures such as metastable and coherent sets, slow relaxation processes, collective variables dominant transition pathways or manifolds and channels of probability flow can be of great importance for understanding and characterizing the kinetic… ▽ More Generation and analysis of time-series data is relevant to many quantitative fields ranging from economics to fluid mechanics. In the physical sciences, structures such as metastable and coherent sets, slow relaxation processes, collective variables dominant transition pathways or manifolds and channels of probability flow can be of great importance for understanding and characterizing the kinetic, thermodynamic and mechanistic properties of the system. Deeptime is a general purpose Python library offering various tools to estimate dynamical models based on time-series data including conventional linear learning methods, such as Markov state models (MSMs), Hidden Markov Models and Koopman models, as well as kernel and deep learning approaches such as VAMPnets and deep MSMs. The library is largely compatible with scikit-learn, having a range of Estimator classes for these different models, but in contrast to scikit-learn also provides deep Model classes, e.g. in the case of an MSM, which provide a multitude of analysis methods to compute interesting thermodynamic, kinetic and dynamical quantities, such as free energies, relaxation times and transition paths. The library is designed for ease of use but also easily maintainable and extensible code. In this paper we introduce the main features and structure of the deeptime software. △ Less

Submitted 11 December, 2021; v1 submitted 28 October, 2021; originally announced October 2021.

Journal ref: Machine Learning: Science and Technology, Volume 3, Number 1, 2021

arXiv:2110.00351 [pdf, other]

Smooth Normalizing Flows

Authors: Jonas Köhler, Andreas Krämer, Frank Noé

Abstract: Normalizing flows are a promising tool for modeling probability distributions in physical systems. While state-of-the-art flows accurately approximate distributions and energies, applications in physics additionally require smooth energies to compute forces and higher-order derivatives. Furthermore, such densities are often defined on non-trivial topologies. A recent example are Boltzmann Generato… ▽ More Normalizing flows are a promising tool for modeling probability distributions in physical systems. While state-of-the-art flows accurately approximate distributions and energies, applications in physics additionally require smooth energies to compute forces and higher-order derivatives. Furthermore, such densities are often defined on non-trivial topologies. A recent example are Boltzmann Generators for generating 3D-structures of peptides and small proteins. These generative models leverage the space of internal coordinates (dihedrals, angles, and bonds), which is a product of hypertori and compact intervals. In this work, we introduce a class of smooth mixture transformations working on both compact intervals and hypertori. Mixture transformations employ root-finding methods to invert them in practice, which has so far prevented bi-directional flow training. To this end, we show that parameter gradients and forces of such inverses can be computed from forward evaluations via the inverse function theorem. We demonstrate two advantages of such smooth flows: they allow training by force matching to simulation data and can be used as potentials in molecular dynamics simulations. △ Less

Submitted 30 November, 2021; v1 submitted 1 October, 2021; originally announced October 2021.

Comments: Neural Information Proceessing Systems (NeurIPS) 2021

arXiv:2106.07492 [pdf, other]

doi 10.1063/5.0059915

Machine Learning Implicit Solvation for Molecular Dynamics

Authors: Yaoyi Chen, Andreas Krämer, Nicholas E. Charron, Brooke E. Husic, Cecilia Clementi, Frank Noé

Abstract: Accurate modeling of the solvent environment for biological molecules is crucial for computational biology and drug design. A popular approach to achieve long simulation time scales for large system sizes is to incorporate the effect of the solvent in a mean-field fashion with implicit solvent models. However, a challenge with existing implicit solvent models is that they often lack accuracy or ce… ▽ More Accurate modeling of the solvent environment for biological molecules is crucial for computational biology and drug design. A popular approach to achieve long simulation time scales for large system sizes is to incorporate the effect of the solvent in a mean-field fashion with implicit solvent models. However, a challenge with existing implicit solvent models is that they often lack accuracy or certain physical properties compared to explicit solvent models, as the many-body effects of the neglected solvent molecules is difficult to model as a mean field. Here, we leverage machine learning (ML) and multi-scale coarse graining (CG) in order to learn implicit solvent models that can approximate the energetic and thermodynamic properties of a given explicit solvent model with arbitrary accuracy, given enough training data. Following the previous ML--CG models CGnet and CGSchnet, we introduce ISSNet, a graph neural network, to model the implicit solvent potential of mean force. ISSNet can learn from explicit solvent simulation data and be readily applied to MD simulations. We compare the solute conformational distributions under different solvation treatments for two peptide systems. The results indicate that ISSNet models can outperform widely-used generalized Born and surface area models in reproducing the thermodynamics of small protein systems with respect to explicit solvent. The success of this novel method demonstrates the potential benefit of applying machine learning methods in accurate modeling of solvent effects for in silico research and biomedical applications. △ Less

Submitted 26 August, 2021; v1 submitted 14 June, 2021; originally announced June 2021.

Comments: 16 pages, 6 figures

Journal ref: J. Chem. Phys. 155, 084101 (2021)

arXiv:2103.17233 [pdf, other]

doi 10.1088/2632-2153/ac14ad

Symmetric and antisymmetric kernels for machine learning problems in quantum physics and chemistry

Authors: Stefan Klus, Patrick Gelß, Feliks Nüske, Frank Noé

Abstract: We derive symmetric and antisymmetric kernels by symmetrizing and antisymmetrizing conventional kernels and analyze their properties. In particular, we compute the feature space dimensions of the resulting polynomial kernels, prove that the reproducing kernel Hilbert spaces induced by symmetric and antisymmetric Gaussian kernels are dense in the space of symmetric and antisymmetric functions, and… ▽ More We derive symmetric and antisymmetric kernels by symmetrizing and antisymmetrizing conventional kernels and analyze their properties. In particular, we compute the feature space dimensions of the resulting polynomial kernels, prove that the reproducing kernel Hilbert spaces induced by symmetric and antisymmetric Gaussian kernels are dense in the space of symmetric and antisymmetric functions, and propose a Slater determinant representation of the antisymmetric Gaussian kernel, which allows for an efficient evaluation even if the state space is high-dimensional. Furthermore, we show that by exploiting symmetries or antisymmetries the size of the training data set can be significantly reduced. The results are illustrated with guiding examples and simple quantum physics and chemistry applications. △ Less

Submitted 26 June, 2021; v1 submitted 31 March, 2021; originally announced March 2021.

Journal ref: Machine Learning: Science and Technology, Volume 2, Number 4, 2021

arXiv:2010.07033 [pdf, other]

Training Invertible Linear Layers through Rank-One Perturbations

Authors: Andreas Krämer, Jonas Köhler, Frank Noé

Abstract: Many types of neural network layers rely on matrix properties such as invertibility or orthogonality. Retaining such properties during optimization with gradient-based stochastic optimizers is a challenging task, which is usually addressed by either reparameterization of the affected parameters or by directly optimizing on the manifold. This work presents a novel approach for training invertible l… ▽ More Many types of neural network layers rely on matrix properties such as invertibility or orthogonality. Retaining such properties during optimization with gradient-based stochastic optimizers is a challenging task, which is usually addressed by either reparameterization of the affected parameters or by directly optimizing on the manifold. This work presents a novel approach for training invertible linear layers. In lieu of directly optimizing the network parameters, we train rank-one perturbations and add them to the actual weight matrices infrequently. This P$^{4}$Inv update allows kee** track of inverses and determinants without ever explicitly computing them. We show how such invertible blocks improve the mixing and thus the mode separation of the resulting normalizing flows. Furthermore, we outline how the P$^4$ concept can be utilized to retain properties other than invertibility. △ Less

Submitted 30 November, 2020; v1 submitted 14 October, 2020; originally announced October 2020.

Comments: 17 pages, 10 figures

MSC Class: 68T07; 82-10

arXiv:2010.05316 [pdf, other]

doi 10.1063/5.0032836

Convergence to the fixed-node limit in deep variational Monte Carlo

Authors: Zeno Schätzle, Jan Hermann, Frank Noé

Abstract: Variational quantum Monte Carlo (QMC) is an ab-initio method for solving the electronic Schrödinger equation that is exact in principle, but limited by the flexibility of the available ansatzes in practice. The recently introduced deep QMC approach, specifically two deep-neural-network ansatzes PauliNet and FermiNet, allows variational QMC to reach the accuracy of diffusion QMC, but little is unde… ▽ More Variational quantum Monte Carlo (QMC) is an ab-initio method for solving the electronic Schrödinger equation that is exact in principle, but limited by the flexibility of the available ansatzes in practice. The recently introduced deep QMC approach, specifically two deep-neural-network ansatzes PauliNet and FermiNet, allows variational QMC to reach the accuracy of diffusion QMC, but little is understood about the convergence behavior of such ansatzes. Here, we analyze how deep variational QMC approaches the fixed-node limit with increasing network size. First, we demonstrate that a deep neural network can overcome the limitations of a small basis set and reach the mean-field complete-basis-set limit. Moving to electron correlation, we then perform an extensive hyperparameter scan of a deep Jastrow factor for LiH and H$_4$ and find that variational energies at the fixed-node limit can be obtained with a sufficiently large network. Finally, we benchmark mean-field and many-body ansatzes on H$_2$O, increasing the fraction of recovered fixed-node correlation energy of single-determinant Slater--Jastrow-type ansatzes by half an order of magnitude compared to previous variational QMC results and demonstrate that a single-determinant Slater--Jastrow--backflow version of the ansatz overcomes the fixed-node limitations. This analysis helps understanding the superb accuracy of deep variational ansatzes in comparison to the traditional trial wavefunctions at the respective level of theory, and will guide future improvements of the neural network architectures in deep QMC. △ Less

Submitted 25 March, 2021; v1 submitted 11 October, 2020; originally announced October 2020.

Comments: This article may be downloaded for personal use only. Any other use requires prior permission of the author and AIP Publishing. This article appeared in J. Chem. Phys., vol. 154, no. 12, p. 124108, Mar. 2021 and may be found at https://doi.org/10.1063/5.0032836

arXiv:2008.08461 [pdf, other]

Relevance of Rotationally Equivariant Convolutions for Predicting Molecular Properties

Authors: Benjamin Kurt Miller, Mario Geiger, Tess E. Smidt, Frank Noé

Abstract: Equivariant neural networks (ENNs) are graph neural networks embedded in $\mathbb{R}^3$ and are well suited for predicting molecular properties. The ENN library e3nn has customizable convolutions, which can be designed to depend only on distances between points, or also on angular features, making them rotationally invariant, or equivariant, respectively. This paper studies the practical value of… ▽ More Equivariant neural networks (ENNs) are graph neural networks embedded in $\mathbb{R}^3$ and are well suited for predicting molecular properties. The ENN library e3nn has customizable convolutions, which can be designed to depend only on distances between points, or also on angular features, making them rotationally invariant, or equivariant, respectively. This paper studies the practical value of including angular dependencies for molecular property prediction directly via an ablation study with \texttt{e3nn} and the QM9 data set. We find that, for fixed network depth and parameter count, adding angular features decreased test error by an average of 23%. Meanwhile, increasing network depth decreased test error by only 4% on average, implying that rotationally equivariant layers are comparatively parameter efficient. We present an explanation of the accuracy improvement on the dipole moment, the target which benefited most from the introduction of angular features. △ Less

Submitted 24 November, 2020; v1 submitted 19 August, 2020; originally announced August 2020.

Comments: Machine Learning for Molecules Workshop at NeurIPS 2020, NeurIPS workshop on Interpretable Inductive Biases and Physically Structured Learning

arXiv:2007.11412 [pdf, other]

doi 10.1063/5.0026133

Coarse Graining Molecular Dynamics with Graph Neural Networks

Authors: Brooke E. Husic, Nicholas E. Charron, Dominik Lemm, Jiang Wang, Adrià Pérez, Maciej Majewski, Andreas Krämer, Yaoyi Chen, Simon Olsson, Gianni de Fabritiis, Frank Noé, Cecilia Clementi

Abstract: Coarse graining enables the investigation of molecular dynamics for larger systems and at longer timescales than is possible at atomic resolution. However, a coarse graining model must be formulated such that the conclusions we draw from it are consistent with the conclusions we would draw from a model at a finer level of detail. It has been proven that a force matching scheme defines a thermodyna… ▽ More Coarse graining enables the investigation of molecular dynamics for larger systems and at longer timescales than is possible at atomic resolution. However, a coarse graining model must be formulated such that the conclusions we draw from it are consistent with the conclusions we would draw from a model at a finer level of detail. It has been proven that a force matching scheme defines a thermodynamically consistent coarse-grained model for an atomistic system in the variational limit. Wang et al. [ACS Cent. Sci. 5, 755 (2019)] demonstrated that the existence of such a variational limit enables the use of a supervised machine learning framework to generate a coarse-grained force field, which can then be used for simulation in the coarse-grained space. Their framework, however, requires the manual input of molecular features upon which to machine learn the force field. In the present contribution, we build upon the advance of Wang et al.and introduce a hybrid architecture for the machine learning of coarse-grained force fields that learns their own features via a subnetwork that leverages continuous filter convolutions on a graph neural network architecture. We demonstrate that this framework succeeds at reproducing the thermodynamics for small biomolecular systems. Since the learned molecular representations are inherently transferable, the architecture presented here sets the stage for the development of machine-learned, coarse-grained force fields that are transferable across molecular systems. △ Less

Submitted 6 November, 2020; v1 submitted 22 July, 2020; originally announced July 2020.

Comments: 17 pages, 9 figures

arXiv:2006.02425 [pdf, other]

Equivariant Flows: Exact Likelihood Generative Learning for Symmetric Densities

Authors: Jonas Köhler, Leon Klein, Frank Noé

Abstract: Normalizing flows are exact-likelihood generative neural networks which approximately transform samples from a simple prior distribution to samples of the probability distribution of interest. Recent work showed that such generative models can be utilized in statistical mechanics to sample equilibrium states of many-body systems in physics and chemistry. To scale and generalize these results, it i… ▽ More Normalizing flows are exact-likelihood generative neural networks which approximately transform samples from a simple prior distribution to samples of the probability distribution of interest. Recent work showed that such generative models can be utilized in statistical mechanics to sample equilibrium states of many-body systems in physics and chemistry. To scale and generalize these results, it is essential that the natural symmetries in the probability density -- in physics defined by the invariances of the target potential -- are built into the flow. We provide a theoretical sufficient criterion showing that the distribution generated by \textit{equivariant} normalizing flows is invariant with respect to these symmetries by design. Furthermore, we propose building blocks for flows which preserve symmetries which are usually found in physical/chemical many-body particle systems. Using benchmark systems motivated from molecular physics, we demonstrate that those symmetry preserving flows can provide better generalization capabilities and sampling efficiency. △ Less

Submitted 26 October, 2020; v1 submitted 3 June, 2020; originally announced June 2020.

arXiv:2005.01851 [pdf, other]

doi 10.1063/5.0007276

Ensemble Learning of Coarse-Grained Molecular Dynamics Force Fields with a Kernel Approach

Authors: Jiang Wang, Stefan Chmiela, Klaus-Robert Müller, Frank Noè, Cecilia Clementi

Abstract: Gradient-domain machine learning (GDML) is an accurate and efficient approach to learn a molecular potential and associated force field based on the kernel ridge regression algorithm. Here, we demonstrate its application to learn an effective coarse-grained (CG) model from all-atom simulation data in a sample efficient manner. The coarse-grained force field is learned by following the thermodynami… ▽ More Gradient-domain machine learning (GDML) is an accurate and efficient approach to learn a molecular potential and associated force field based on the kernel ridge regression algorithm. Here, we demonstrate its application to learn an effective coarse-grained (CG) model from all-atom simulation data in a sample efficient manner. The coarse-grained force field is learned by following the thermodynamic consistency principle, here by minimizing the error between the predicted coarse-grained force and the all-atom mean force in the coarse-grained coordinates. Solving this problem by GDML directly is impossible because coarse-graining requires averaging over many training data points, resulting in impractical memory requirements for storing the kernel matrices. In this work, we propose a data-efficient and memory-saving alternative. Using ensemble learning and stratified sampling, we propose a 2-layer training scheme that enables GDML to learn an effective coarse-grained model. We illustrate our method on a simple biomolecular system, alanine dipeptide, by reconstructing the free energy landscape of a coarse-grained variant of this molecule. Our novel GDML training scheme yields a smaller free energy error than neural networks when the training set is small, and a comparably high accuracy when the training set is sufficiently large. △ Less

Submitted 4 May, 2020; originally announced May 2020.

Comments: 14 pages, 6 figures

arXiv:2002.06707 [pdf, other]

Stochastic Normalizing Flows

Authors: Hao Wu, Jonas Köhler, Frank Noé

Abstract: The sampling of probability distributions specified up to a normalization constant is an important problem in both machine learning and statistical mechanics. While classical stochastic sampling methods such as Markov Chain Monte Carlo (MCMC) or Langevin Dynamics (LD) can suffer from slow mixing times there is a growing interest in using normalizing flows in order to learn the transformation of a… ▽ More The sampling of probability distributions specified up to a normalization constant is an important problem in both machine learning and statistical mechanics. While classical stochastic sampling methods such as Markov Chain Monte Carlo (MCMC) or Langevin Dynamics (LD) can suffer from slow mixing times there is a growing interest in using normalizing flows in order to learn the transformation of a simple prior distribution to the given target distribution. Here we propose a generalized and combined approach to sample target densities: Stochastic Normalizing Flows (SNF) -- an arbitrary sequence of deterministic invertible functions and stochastic sampling blocks. We show that stochasticity overcomes expressivity limitations of normalizing flows resulting from the invertibility constraint, whereas trainable transformations between sampling steps improve efficiency of pure MCMC/LD along the flow. By invoking ideas from non-equilibrium statistical mechanics we derive an efficient training procedure by which both the sampler's and the flow's parameters can be optimized end-to-end, and by which we can compute exact importance weights without having to marginalize out the randomness of the stochastic blocks. We illustrate the representational power, sampling efficiency and asymptotic correctness of SNFs on several benchmarks including applications to sampling molecular systems in equilibrium. △ Less

Submitted 26 October, 2020; v1 submitted 16 February, 2020; originally announced February 2020.

arXiv:1911.09811 [pdf, other]

Machine learning for protein folding and dynamics

Authors: Frank Noé, Gianni De Fabritiis, Cecilia Clementi

Abstract: Many aspects of the study of protein folding and dynamics have been affected by the recent advances in machine learning. Methods for the prediction of protein structures from their sequences are now heavily based on machine learning tools. The way simulations are performed to explore the energy landscape of protein systems is also changing as force-fields are started to be designed by means of mac… ▽ More Many aspects of the study of protein folding and dynamics have been affected by the recent advances in machine learning. Methods for the prediction of protein structures from their sequences are now heavily based on machine learning tools. The way simulations are performed to explore the energy landscape of protein systems is also changing as force-fields are started to be designed by means of machine learning methods. These methods are also used to extract the essential information from large simulation datasets and to enhance the sampling of rare events such as folding/unfolding transitions. While significant challenges still need to be tackled, we expect these methods to play an important role on the study of protein folding and dynamics in the near future. We discuss here the recent advances on all these fronts and the questions that need to be addressed for machine learning approaches to become mainstream in protein simulation. △ Less

Submitted 21 November, 2019; originally announced November 2019.

arXiv:1910.03131 [pdf, other]

Generating valid Euclidean distance matrices

Authors: Moritz Hoffmann, Frank Noé

Abstract: Generating point clouds, e.g., molecular structures, in arbitrary rotations, translations, and enumerations remains a challenging task. Meanwhile, neural networks utilizing symmetry invariant layers have been shown to be able to optimize their training objective in a data-efficient way. In this spirit, we present an architecture which allows to produce valid Euclidean distance matrices, which by c… ▽ More Generating point clouds, e.g., molecular structures, in arbitrary rotations, translations, and enumerations remains a challenging task. Meanwhile, neural networks utilizing symmetry invariant layers have been shown to be able to optimize their training objective in a data-efficient way. In this spirit, we present an architecture which allows to produce valid Euclidean distance matrices, which by construction are already invariant under rotation and translation of the described object. Motivated by the goal to generate molecular structures in Cartesian space, we use this architecture to construct a Wasserstein GAN utilizing a permutation invariant critic network. This makes it possible to generate molecular structures in a one-shot fashion by producing Euclidean distance matrices which have a three-dimensional embedding. △ Less

Submitted 14 November, 2019; v1 submitted 7 October, 2019; originally announced October 2019.

Comments: 9 pages, 5 figures

arXiv:1910.00753 [pdf, other]

Equivariant Flows: sampling configurations for multi-body systems with symmetric energies

Authors: Jonas Köhler, Leon Klein, Frank Noé

Abstract: Flows are exact-likelihood generative neural networks that transform samples from a simple prior distribution to the samples of the probability distribution of interest. Boltzmann Generators (BG) combine flows and statistical mechanics to sample equilibrium states of strongly interacting many-body systems such as proteins with 1000 atoms. In order to scale and generalize these results, it is essen… ▽ More Flows are exact-likelihood generative neural networks that transform samples from a simple prior distribution to the samples of the probability distribution of interest. Boltzmann Generators (BG) combine flows and statistical mechanics to sample equilibrium states of strongly interacting many-body systems such as proteins with 1000 atoms. In order to scale and generalize these results, it is essential that the natural symmetries of the probability density - in physics defined by the invariances of the energy function - are built into the flow. Here we develop theoretical tools for constructing such equivariant flows and demonstrate that a BG that is equivariant with respect to rotations and particle permutations can generalize to sampling nontrivially new configurations where a nonequivariant BG cannot. △ Less

Submitted 1 October, 2019; originally announced October 2019.

arXiv:1909.08423 [pdf, other]

doi 10.1038/s41557-020-0544-y

Deep neural network solution of the electronic Schrödinger equation

Authors: Jan Hermann, Zeno Schätzle, Frank Noé

Abstract: [New and updated results were published in Nature Chemistry, doi:10.1038/s41557-020-0544-y.] The electronic Schrödinger equation describes fundamental properties of molecules and materials, but can only be solved analytically for the hydrogen atom. The numerically exact full configuration-interaction method is exponentially expensive in the number of electrons. Quantum Monte Carlo is a possible wa… ▽ More [New and updated results were published in Nature Chemistry, doi:10.1038/s41557-020-0544-y.] The electronic Schrödinger equation describes fundamental properties of molecules and materials, but can only be solved analytically for the hydrogen atom. The numerically exact full configuration-interaction method is exponentially expensive in the number of electrons. Quantum Monte Carlo is a possible way out: it scales well to large molecules, can be parallelized, and its accuracy has, as yet, only been limited by the flexibility of the used wave function ansatz. Here we propose PauliNet, a deep-learning wave function ansatz that achieves nearly exact solutions of the electronic Schrödinger equation. PauliNet has a multireference Hartree-Fock solution built in as a baseline, incorporates the physics of valid wave functions, and is trained using variational quantum Monte Carlo (VMC). PauliNet outperforms comparable state-of-the-art VMC ansatzes for atoms, diatomic molecules and a strongly-correlated hydrogen chain by a margin and is yet computationally efficient. We anticipate that thanks to the favourable scaling with system size, this method may become a new leading method for highly accurate electronic-strucutre calculations on medium-sized molecular systems. △ Less

Submitted 23 September, 2020; v1 submitted 16 September, 2019; originally announced September 2019.

Comments: Add notice about new results in journal-published version

arXiv:1904.07752 [pdf, other]

doi 10.1063/1.5100267

Kernel methods for detecting coherent structures in dynamical data

Authors: Stefan Klus, Brooke E. Husic, Mattes Mollenhauer, Frank Noé

Abstract: We illustrate relationships between classical kernel-based dimensionality reduction techniques and eigendecompositions of empirical estimates of reproducing kernel Hilbert space (RKHS) operators associated with dynamical systems. In particular, we show that kernel canonical correlation analysis (CCA) can be interpreted in terms of kernel transfer operators and that it can be obtained by optimizing… ▽ More We illustrate relationships between classical kernel-based dimensionality reduction techniques and eigendecompositions of empirical estimates of reproducing kernel Hilbert space (RKHS) operators associated with dynamical systems. In particular, we show that kernel canonical correlation analysis (CCA) can be interpreted in terms of kernel transfer operators and that it can be obtained by optimizing the variational approach for Markov processes (VAMP) score. As a result, we show that coherent sets of particle trajectories can be computed by kernel CCA. We demonstrate the efficiency of this approach with several examples, namely the well-known Bickley jet, ocean drifter data, and a molecular dynamics problem with a time-dependent potential. Finally, we propose a straightforward generalization of dynamic mode decomposition (DMD) called coherent mode decomposition (CMD). Our results provide a generic machine learning approach to the computation of coherent sets with an objective score that can be used for cross-validation and the comparison of different methods. △ Less

Submitted 7 October, 2019; v1 submitted 16 April, 2019; originally announced April 2019.

arXiv:1812.07669 [pdf, other]

Machine Learning for Molecular Dynamics on Long Timescales

Authors: Frank Noé

Abstract: Molecular Dynamics (MD) simulation is widely used to analyze the properties of molecules and materials. Most practical applications, such as comparison with experimental measurements, designing drug molecules, or optimizing materials, rely on statistical quantities, which may be prohibitively expensive to compute from direct long-time MD simulations. Classical Machine Learning (ML) techniques have… ▽ More Molecular Dynamics (MD) simulation is widely used to analyze the properties of molecules and materials. Most practical applications, such as comparison with experimental measurements, designing drug molecules, or optimizing materials, rely on statistical quantities, which may be prohibitively expensive to compute from direct long-time MD simulations. Classical Machine Learning (ML) techniques have already had a profound impact on the field, especially for learning low-dimensional models of the long-time dynamics and for devising more efficient sampling schemes for computing long-time statistics. Novel ML methods have the potential to revolutionize long-timescale MD and to obtain interpretable models. ML concepts such as statistical estimator theory, end-to-end learning, representation learning and active learning are highly interesting for the MD researcher and will help to develop new solutions to hard MD problems. With the aim of better connecting the MD and ML research areas and spawning new research on this interface, we define the learning problems in long-timescale MD, present successful approaches and outline some of the unsolved ML problems in this application field. △ Less

Submitted 18 December, 2018; originally announced December 2018.

arXiv:1812.01736 [pdf, other]

Machine Learning of coarse-grained Molecular Dynamics Force Fields

Authors: Jiang Wang, Simon Olsson, Christoph Wehmeyer, Adria Perez, Nicholas E. Charron, Gianni de Fabritiis, Frank Noe, Cecilia Clementi

Abstract: Atomistic or ab-initio molecular dynamics simulations are widely used to predict thermodynamics and kinetics and relate them to molecular structure. A common approach to go beyond the time- and length-scales accessible with such computationally expensive simulations is the definition of coarse-grained molecular models. Existing coarse-graining approaches define an effective interaction potential t… ▽ More Atomistic or ab-initio molecular dynamics simulations are widely used to predict thermodynamics and kinetics and relate them to molecular structure. A common approach to go beyond the time- and length-scales accessible with such computationally expensive simulations is the definition of coarse-grained molecular models. Existing coarse-graining approaches define an effective interaction potential to match defined properties of high-resolution models or experimental data. In this paper, we reformulate coarse-graining as a supervised machine learning problem. We use statistical learning theory to decompose the coarse-graining error and cross-validation to select and compare the performance of different models. We introduce CGnets, a deep learning approach, that learns coarse-grained free energy functions and can be trained by a force matching scheme. CGnets maintain all physically relevant invariances and allow one to incorporate prior physics knowledge to avoid sampling of unphysical structures. We show that CGnets can capture all-atom explicit-solvent free energy surfaces with models using only a few coarse-grained beads and no solvent, while classical coarse-graining methods fail to capture crucial features of the free energy surface. Thus, CGnets are able to capture multi-body terms that emerge from the dimensionality reduction. △ Less

Submitted 3 April, 2019; v1 submitted 4 December, 2018; originally announced December 2018.

arXiv:1812.01729 [pdf, other]

Boltzmann Generators -- Sampling Equilibrium States of Many-Body Systems with Deep Learning

Authors: Frank Noé, Simon Olsson, Jonas Köhler, Hao Wu

Abstract: Computing equilibrium states in condensed-matter many-body systems, such as solvated proteins, is a long-standing challenge. Lacking methods for generating statistically independent equilibrium samples in "one shot", vast computational effort is invested for simulating these system in small steps, e.g., using Molecular Dynamics. Combining deep learning and statistical mechanics, we here develop Bo… ▽ More Computing equilibrium states in condensed-matter many-body systems, such as solvated proteins, is a long-standing challenge. Lacking methods for generating statistically independent equilibrium samples in "one shot", vast computational effort is invested for simulating these system in small steps, e.g., using Molecular Dynamics. Combining deep learning and statistical mechanics, we here develop Boltzmann Generators, that are shown to generate unbiased one-shot equilibrium samples of representative condensed matter systems and proteins. Boltzmann Generators use neural networks to learn a coordinate transformation of the complex configurational equilibrium distribution to a distribution that can be easily sampled. Accurate computation of free energy differences and discovery of new configurations are demonstrated, providing a statistical mechanics tool that can avoid rare events during sampling without prior knowledge of reaction coordinates. △ Less

Submitted 12 July, 2019; v1 submitted 4 December, 2018; originally announced December 2018.

arXiv:1811.11714 [pdf, other]

doi 10.1063/1.5083040

Variational Selection of Features for Molecular Kinetics

Authors: Martin K. Scherer, Brooke E. Husic, Moritz Hoffmann, Fabian Paul, Hao Wu, Frank Noé

Abstract: The modeling of atomistic biomolecular simulations using kinetic models such as Markov state models (MSMs) has had many notable algorithmic advances in recent years. The variational principle has opened the door for a nearly fully automated toolkit for selecting models that predict the long-time kinetics from molecular dynamics simulations. However, one yet-unoptimized step of the pipeline involve… ▽ More The modeling of atomistic biomolecular simulations using kinetic models such as Markov state models (MSMs) has had many notable algorithmic advances in recent years. The variational principle has opened the door for a nearly fully automated toolkit for selecting models that predict the long-time kinetics from molecular dynamics simulations. However, one yet-unoptimized step of the pipeline involves choosing the features, or collective variables, from which the model should be constructed. In order to build intuitive models, these collective variables are often sought to be interpretable and familiar features, such as torsional angles or contact distances in a protein structure. However, previous approaches for evaluating the chosen features rely on constructing a full MSM, which in turn requires additional hyperparameters to be chosen, and hence leads to a computationally expensive framework. Here, we present a method to optimize the feature choice directly, without requiring the construction of the final kinetic model. We demonstrate our rigorous preprocessing algorithm on a canonical set of twelve fast-folding protein simulations, and show that our procedure leads to more efficient model selection. △ Less

Submitted 25 April, 2019; v1 submitted 28 November, 2018; originally announced November 2018.

Comments: 13 pages, 8 figures

Journal ref: J. Chem. Phys. 2019, 150, 194108

arXiv:1805.07601 [pdf, other]

Deep Generative Markov State Models

Authors: Hao Wu, Andreas Mardt, Luca Pasquali, Frank Noe

Abstract: We propose a deep generative Markov State Model (DeepGenMSM) learning framework for inference of metastable dynamical systems and prediction of trajectories. After unsupervised training on time series data, the model contains (i) a probabilistic encoder that maps from high-dimensional configuration space to a small-sized vector indicating the membership to metastable (long-lived) states, (ii) a Ma… ▽ More We propose a deep generative Markov State Model (DeepGenMSM) learning framework for inference of metastable dynamical systems and prediction of trajectories. After unsupervised training on time series data, the model contains (i) a probabilistic encoder that maps from high-dimensional configuration space to a small-sized vector indicating the membership to metastable (long-lived) states, (ii) a Markov chain that governs the transitions between metastable states and facilitates analysis of the long-time dynamics, and (iii) a generative part that samples the conditional distribution of configurations in the next time step. The model can be operated in a recursive fashion to generate trajectories to predict the system evolution from a defined starting state and propose new configurations. The DeepGenMSM is demonstrated to provide accurate estimates of the long-time kinetics and generate valid distributions for molecular dynamics (MD) benchmark systems. Remarkably, we show that DeepGenMSMs are able to make long time-steps in molecular configuration space and generate physically realistic structures in regions that were not seen in training data. △ Less

Submitted 11 January, 2019; v1 submitted 19 May, 2018; originally announced May 2018.

Report number: wu01

arXiv:1710.11239 [pdf, other]

doi 10.1063/1.5011399

Time-lagged autoencoders: Deep learning of slow collective variables for molecular kinetics

Authors: Christoph Wehmeyer, Frank Noé

Abstract: Inspired by the success of deep learning techniques in the physical and chemical sciences, we apply a modification of an autoencoder type deep neural network to the task of dimension reduction of molecular dynamics data. We can show that our time-lagged autoencoder reliably finds low-dimensional embeddings for high-dimensional feature spaces which capture the slow dynamics of the underlying stocha… ▽ More Inspired by the success of deep learning techniques in the physical and chemical sciences, we apply a modification of an autoencoder type deep neural network to the task of dimension reduction of molecular dynamics data. We can show that our time-lagged autoencoder reliably finds low-dimensional embeddings for high-dimensional feature spaces which capture the slow dynamics of the underlying stochastic processes - beyond the capabilities of linear dimension reduction techniques. △ Less

Submitted 30 October, 2017; originally announced October 2017.

arXiv:1710.06012 [pdf, other]

doi 10.1038/s41467-017-02388-1

VAMPnets: Deep learning of molecular kinetics

Authors: Andreas Mardt, Luca Pasquali, Hao Wu, Frank Noé

Abstract: There is an increasing demand for computing the relevant structures, equilibria and long-timescale kinetics of biomolecular processes, such as protein-drug binding, from high-throughput molecular dynamics simulations. Current methods employ transformation of simulated coordinates into structural features, dimension reduction, clustering the dimension-reduced data, and estimation of a Markov state… ▽ More There is an increasing demand for computing the relevant structures, equilibria and long-timescale kinetics of biomolecular processes, such as protein-drug binding, from high-throughput molecular dynamics simulations. Current methods employ transformation of simulated coordinates into structural features, dimension reduction, clustering the dimension-reduced data, and estimation of a Markov state model or related model of the interconversion rates between molecular structures. This handcrafted approach demands a substantial amount of modeling expertise, as poor decisions at any step will lead to large modeling errors. Here we employ the variational approach for Markov processes (VAMP) to develop a deep learning framework for molecular kinetics using neural networks, dubbed VAMPnets. A VAMPnet encodes the entire map** from molecular coordinates to Markov states, thus combining the whole data processing pipeline in a single end-to-end framework. Our method performs equally or better than state-of-the art Markov modeling methods and provides easily interpretable few-state kinetic models. △ Less

Submitted 20 December, 2017; v1 submitted 16 October, 2017; originally announced October 2017.

arXiv:1707.04659 [pdf, other]

Variational approach for learning Markov processes from time series data

Authors: Hao Wu, Frank Noé

Abstract: Inference, prediction and control of complex dynamical systems from time series is important in many areas, including financial markets, power grid management, climate and weather modeling, or molecular dynamics. The analysis of such highly nonlinear dynamical systems is facilitated by the fact that we can often find a (generally nonlinear) transformation of the system coordinates to features in w… ▽ More Inference, prediction and control of complex dynamical systems from time series is important in many areas, including financial markets, power grid management, climate and weather modeling, or molecular dynamics. The analysis of such highly nonlinear dynamical systems is facilitated by the fact that we can often find a (generally nonlinear) transformation of the system coordinates to features in which the dynamics can be excellently approximated by a linear Markovian model. Moreover, the large number of system variables often change collectively on large time- and length-scales, facilitating a low-dimensional analysis in feature space. In this paper, we introduce a variational approach for Markov processes (VAMP) that allows us to find optimal feature map**s and optimal Markovian models of the dynamics from given time series data. The key insight is that the best linear model can be obtained from the top singular components of the Koopman operator. This leads to the definition of a family of score functions called VAMP-r which can be calculated from data, and can be employed to optimize a Markovian model. In addition, based on the relationship between the variational scores and approximation errors of Koopman operators, we propose a new VAMP-E score, which can be applied to cross-validation for hyper-parameter optimization and model selection in VAMP. VAMP is valid for both reversible and nonreversible processes and for stationary and non-stationary processes or realizations. △ Less

Submitted 15 August, 2019; v1 submitted 14 July, 2017; originally announced July 2017.

arXiv:1610.06773 [pdf, other]

doi 10.1063/1.4979344

Variational Koopman models: slow collective variables and molecular kinetics from short off-equilibrium simulations

Authors: Hao Wu, Feliks Nüske, Fabian Paul, Stefan Klus, Peter Koltai, Frank Noé

Abstract: Markov state models (MSMs) and Master equation models are popular approaches to approximate molecular kinetics, equilibria, metastable states, and reaction coordinates in terms of a state space discretization usually obtained by clustering. Recently, a powerful generalization of MSMs has been introduced, the variational approach (VA) of molecular kinetics and its special case the time-lagged indep… ▽ More Markov state models (MSMs) and Master equation models are popular approaches to approximate molecular kinetics, equilibria, metastable states, and reaction coordinates in terms of a state space discretization usually obtained by clustering. Recently, a powerful generalization of MSMs has been introduced, the variational approach (VA) of molecular kinetics and its special case the time-lagged independent component analysis (TICA), which allow us to approximate slow collective variables and molecular kinetics by linear combinations of smooth basis functions or order parameters. While it is known how to estimate MSMs from trajectories whose starting points are not sampled from an equilibrium ensemble, this has not yet been the case for TICA and the VA. Previous estimates from short trajectories, have been strongly biased and thus not variationally optimal. Here, we employ Koopman operator theory and ideas from dynamic mode decomposition (DMD) to extend the VA and TICA to non-equilibrium data. The main insight is that the VA and TICA provide a coefficient matrix that we call Koopman model, as it approximates the underlying dynamical (Koopman) operator in conjunction with the basis set used. This Koopman model can be used to compute a stationary vector to reweight the data to equilibrium. From such a Koopman-reweighted sample, equilibrium expectation values and variationally optimal reversible Koopman models can be constructed even with short simulations. The Koopman model can be used to propagate densities, and its eigenvalue decomposition provide estimates of relaxation timescales and slow collective variables for dimension reduction. Koopman models are generalizations of Markov state models, TICA and the linear VA and allow molecular kinetics to be described without a cluster discretization. △ Less

Submitted 22 January, 2017; v1 submitted 20 October, 2016; originally announced October 2016.

arXiv:1603.01640 [pdf, ps, other]

Reversible Markov chain estimation using convex-concave programming

Authors: Benjamin Trendelkamp-Schroer, Hao Wu, Frank Noe

Abstract: We present a convex-concave reformulation of the reversible Markov chain estimation problem and outline an efficient numerical scheme for the solution of the resulting problem based on a primal-dual interior point method for monotone variational inequalities. Extensions to situations in which information about the stationary vector is available can also be solved via the convex- concave reformulat… ▽ More We present a convex-concave reformulation of the reversible Markov chain estimation problem and outline an efficient numerical scheme for the solution of the resulting problem based on a primal-dual interior point method for monotone variational inequalities. Extensions to situations in which information about the stationary vector is available can also be solved via the convex- concave reformulation. The method can be generalized and applied to the discrete transition matrix reweighting analysis method to perform inference from independent chains with specified couplings between the stationary probabilities. The proposed approach offers a significant speed-up compared to a fixed-point iteration for a number of relevant applications. △ Less

Submitted 4 March, 2016; originally announced March 2016.

Comments: 17pages, 2 figures

MSC Class: 62M05; 65K15; 62F30; 62P10

arXiv:1507.05990 [pdf, other]

Estimation and uncertainty of reversible Markov models

Authors: Benjamin Trendelkamp-Schroer, Hao Wu, Fabian Paul, Frank Noé

Abstract: Reversibility is a key concept in Markov models and Master-equation models of molecular kinetics. The analysis and interpretation of the transition matrix encoding the kinetic properties of the model relies heavily on the reversibility property. The estimation of a reversible transition matrix from simulation data is therefore crucial to the successful application of the previously developed theor… ▽ More Reversibility is a key concept in Markov models and Master-equation models of molecular kinetics. The analysis and interpretation of the transition matrix encoding the kinetic properties of the model relies heavily on the reversibility property. The estimation of a reversible transition matrix from simulation data is therefore crucial to the successful application of the previously developed theory. In this work we discuss methods for the maximum likelihood estimation of transition matrices from finite simulation data and present a new algorithm for the estimation if reversibility with respect to a given stationary vector is desired. We also develop new methods for the Bayesian posterior inference of reversible transition matrices with and without given stationary vector taking into account the need for a suitable prior distribution preserving the meta- stable features of the observed process during posterior inference. All algorithms here are implemented in the PyEMMA software - http://pyemma.org - as of version 2.0. △ Less

Submitted 21 September, 2015; v1 submitted 19 July, 2015; originally announced July 2015.

arXiv:1409.6439 [pdf, ps, other]

doi 10.1103/PhysRevX.6.011009

Efficient estimation of rare-event kinetics

Authors: Benjamin Trendelkamp-Schroer, Frank Noe

Abstract: The efficient calculation of rare-event kinetics in complex dynamical systems, such as the rate and pathways of ligand dissociation from a protein, is a generally unsolved problem. Markov state models can systematically integrate ensembles of short simulations and thus effectively parallelize the computational effort, but the rare events of interest still need to be spontaneously sampled in the da… ▽ More The efficient calculation of rare-event kinetics in complex dynamical systems, such as the rate and pathways of ligand dissociation from a protein, is a generally unsolved problem. Markov state models can systematically integrate ensembles of short simulations and thus effectively parallelize the computational effort, but the rare events of interest still need to be spontaneously sampled in the data. Enhanced sampling approaches, such as parallel tempering or umbrella sampling, can accelerate the computation of equilibrium expectations massively - but sacrifice the ability to compute dynamical expectations. In this work we establish a principle to combine knowledge of the equilibrium distribution with kinetics from fast "downhill" relaxation trajectories using reversible Markov models. This approach is general as it does not invoke any specific dynamical model, and can provide accurate estimates of the rare event kinetics. Large gains in sampling efficiency can be achieved whenever one direction of the process occurs more rapid than its reverse, making the approach especially attractive for downhill processes such as folding and binding in biomolecules. △ Less

Submitted 22 July, 2015; v1 submitted 23 September, 2014; originally announced September 2014.

Comments: 16 pages, 14 figures

Journal ref: Phys. Rev. X 6, 011009 (2016)

Showing 1–38 of 38 results for author: Noe, F