-
BAD-NEUS: Rapidly converging trajectory stratification
Authors:
John Strahan,
Chatipat Lorpaiboon,
Jonathan Weare,
Aaron R. Dinner
Abstract:
An issue for molecular dynamics simulations is that events of interest often involve timescales that are much longer than the simulation time step, which is set by the fastest timescales of the model. Because of this timescale separation, direct simulation of many events is prohibitively computationally costly. This issue can be overcome by aggregating information from many relatively short simulations that sample segments of trajectories involving events of interest. This is the strategy of Markov state models (MSMs) and related approaches, but such methods suffer from approximation error because the variables defining the states generally do not capture the dynamics fully. By contrast, once converged, the weighted ensemble (WE) method aggregates information from trajectory segments so as to yield unbiased estimates of both thermodynamic and kinetic statistics. Unfortunately, errors decay no faster than unbiased simulation in WE. Here we introduce a theoretical framework for describing WE that shows that introduction of an element of stratification, as in nonequilibrium umbrella sampling (NEUS), accelerates convergence. Then, building on ideas from MSMs and related methods, we propose an improved stratification that allows approximation error to be reduced systematically. We show that the improved stratification can decrease simulation times required to achieve a desired precision by orders of magnitude.
Submitted 30 April, 2024;
originally announced April 2024.
-
Accurate estimates of dynamical statistics using memory
Authors:
Chatipat Lorpaiboon,
Spencer C. Guo,
John Strahan,
Jonathan Weare,
Aaron R. Dinner
Abstract:
Many chemical reactions and molecular processes occur on timescales that are significantly longer than those accessible by direct simulation. One successful approach to estimating dynamical statistics for such processes is to use many short time series observations of the system to construct a Markov state model (MSM), which approximates the dynamics of the system as memoryless transitions between a set of discrete states. The dynamical Galerkin approximation (DGA) generalizes MSMs for the problem of calculating dynamical statistics, such as committors and mean first passage times, by replacing the set of discrete states with a projection onto a basis. Because the projected dynamics are generally not memoryless, the Markov approximation can result in significant systematic error. Inspired by quasi-Markov state models, which employ the generalized master equation to encode memory resulting from the projection, we reformulate DGA to account for memory and analyze its performance on two systems: a two-dimensional triple well and helix-to-helix transitions of the AIB$_9$ peptide. We demonstrate that our method is robust to the choice of basis and can decrease the time series length required to obtain accurate kinetics by an order of magnitude.
Submitted 13 November, 2023;
originally announced November 2023.
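As a point of reference for the abstract above, the memoryless baseline it describes (an MSM built from short discrete trajectories, then used to compute a committor) can be sketched as follows. This is a minimal illustration of the standard Markov approximation, not the memory-corrected DGA method of the paper; the function names `msm_transition_matrix` and `committor`, and the simple row-normalized count estimator, are assumptions made for this example.

```python
import numpy as np

def msm_transition_matrix(dtraj, n_states, lag):
    """Estimate a transition matrix by row-normalizing transition counts at `lag`.

    Assumption: `dtraj` is a single trajectory of integer state labels and every
    state is visited at least once as a starting point.
    """
    dtraj = np.asarray(dtraj)
    counts = np.zeros((n_states, n_states))
    # Accumulate counts for every pair (state at t, state at t + lag).
    np.add.at(counts, (dtraj[:-lag], dtraj[lag:]), 1.0)
    row_sums = counts.sum(axis=1, keepdims=True)
    if np.any(row_sums == 0):
        raise ValueError("some state has no outgoing transitions")
    return counts / row_sums

def committor(T, A, B):
    """Probability of reaching set B before set A under the Markov chain T.

    Solves (I - T_II) q_I = T_IB 1 on the states outside A and B,
    with q = 0 on A and q = 1 on B.
    """
    n = T.shape[0]
    q = np.zeros(n)
    q[B] = 1.0
    interior = np.setdiff1d(np.arange(n), np.concatenate([A, B]))
    lhs = np.eye(len(interior)) - T[np.ix_(interior, interior)]
    rhs = T[np.ix_(interior, B)].sum(axis=1)
    q[interior] = np.linalg.solve(lhs, rhs)
    return q
```

Because the projected dynamics are generally not memoryless, a committor computed this way depends on the chosen states and lag time, which is the systematic error the abstract refers to.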
-
Inexact iterative numerical linear algebra for neural network-based spectral estimation and rare-event prediction
Authors:
John Strahan,
Spencer C. Guo,
Chatipat Lorpaiboon,
Aaron R. Dinner,
Jonathan Weare
Abstract:
Understanding dynamics in complex systems is challenging because there are many degrees of freedom, and those that are most important for describing events of interest are often not obvious. The leading eigenfunctions of the transition operator are useful for visualization, and they can provide an efficient basis for computing statistics such as the likelihood and average time of events (predictions). Here we develop inexact iterative linear algebra methods for computing these eigenfunctions (spectral estimation) and making predictions from a data set of short trajectories sampled at finite intervals. We demonstrate the methods on a low-dimensional model that facilitates visualization and a high-dimensional model of a biomolecular system. Implications for the prediction problem in reinforcement learning are discussed.
Submitted 20 July, 2023; v1 submitted 22 March, 2023;
originally announced March 2023.
-
Augmented Transition Path Theory for Sequences of Events
Authors:
Chatipat Lorpaiboon,
Jonathan Weare,
Aaron R. Dinner
Abstract:
Transition path theory provides a statistical description of the dynamics of a reaction in terms of local spatial quantities. In its original formulation, it is limited to reactions that consist of trajectories flowing from a reactant set A to a product set B. We extend the basic concepts and principles of transition path theory to reactions in which trajectories exhibit a specified sequence of events and illustrate the utility of this generalization on examples.
Submitted 29 July, 2022; v1 submitted 10 May, 2022;
originally announced May 2022.
-
Long-timescale predictions from short-trajectory data: A benchmark analysis of the trp-cage miniprotein
Authors:
John Strahan,
Adam Antoszewski,
Chatipat Lorpaiboon,
Bodhi P. Vani,
Jonathan Weare,
Aaron R. Dinner
Abstract:
Elucidating physical mechanisms with statistical confidence from molecular dynamics simulations can be challenging owing to the many degrees of freedom that contribute to collective motions. To address this issue, we recently introduced a dynamical Galerkin approximation (DGA) [Thiede et al. J. Chem. Phys. 150, 244111 (2019)], in which chemical kinetic statistics that satisfy equations of dynamical operators are represented by a basis expansion. Here, we reformulate this approach, clarifying (and reducing) the dependence on the choice of lag time. We present a new projection of the reactive current onto collective variables and provide improved estimators for rates and committors. We also present simple procedures for constructing suitable smoothly varying basis functions from arbitrary molecular features. To evaluate estimators and basis sets numerically, we generate and carefully validate a dataset of short trajectories for the unfolding and folding of the trp-cage miniprotein, a well-studied system. Our analysis demonstrates a comprehensive strategy for characterizing reaction pathways quantitatively.
Submitted 8 September, 2020;
originally announced September 2020.
-
Integrated VAC: A robust strategy for identifying eigenfunctions of dynamical operators
Authors:
Chatipat Lorpaiboon,
Erik Henning Thiede,
Robert J. Webber,
Jonathan Weare,
Aaron R. Dinner
Abstract:
One approach to analyzing the dynamics of a physical system is to search for long-lived patterns in its motions. This approach has been particularly successful for molecular dynamics data, where slowly decorrelating patterns can indicate large-scale conformational changes. Detecting such patterns is the central objective of the variational approach to conformational dynamics (VAC), as well as the related methods of time-lagged independent component analysis and Markov state modeling. In VAC, the search for slowly decorrelating patterns is formalized as a variational problem solved by the eigenfunctions of the system's transition operator. VAC computes solutions to this variational problem by optimizing a linear or nonlinear model of the eigenfunctions using time series data. Here, we build on VAC's success by addressing two practical limitations. First, VAC can give poor eigenfunction estimates when the lag time parameter is chosen poorly. Second, VAC can overfit when using flexible parameterizations such as artificial neural networks with insufficient regularization. To address these issues, we propose an extension that we call integrated VAC (IVAC). IVAC integrates over multiple lag times before solving the variational problem, making its results more robust and reproducible than VAC's.
Submitted 9 September, 2020; v1 submitted 15 July, 2020;
originally announced July 2020.
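The idea stated in the abstract above, integrating over multiple lag times before solving the variational problem, can be sketched for a linear model as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: the function name `ivac_linear` is hypothetical, the features are mean-centered so the trivial constant mode is removed, time-lagged correlation matrices are symmetrized, and the integral over lag times is replaced by a plain sum over a user-supplied list of lags.

```python
import numpy as np

def ivac_linear(features, lags, n_modes):
    """Linear VAC with correlation matrices summed over several lag times.

    features: array of shape (n_frames, n_features) from one trajectory.
    lags: iterable of positive integer lag times (in frames).
    Returns (eigenvalues, coefficients); projecting the centered features
    onto a coefficient column gives one estimated slow mode.
    """
    x = features - features.mean(axis=0)
    n = len(x)
    c0 = x.T @ x / n
    ct = np.zeros_like(c0)
    for lag in lags:
        c = x[:-lag].T @ x[lag:] / (n - lag)
        ct += 0.5 * (c + c.T)  # symmetrize, assuming reversible dynamics
    # Whiten with C(0), dropping near-singular directions, so that the
    # generalized eigenproblem becomes an ordinary symmetric one.
    s, u = np.linalg.eigh(c0)
    keep = s > 1e-10 * s.max()
    w = u[:, keep] / np.sqrt(s[keep])
    evals, evecs = np.linalg.eigh(w.T @ ct @ w)
    order = np.argsort(evals)[::-1][:n_modes]
    return evals[order], w @ evecs[:, order]
```

Passing a single lag, for example `lags=[10]`, reduces this sketch to ordinary linear VAC at that lag; passing a range such as `range(1, 11)` sums the correlation matrices first, which is the step that makes the result less sensitive to any one choice of lag time.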