-
The surprising efficiency of temporal difference learning for rare event prediction
Authors:
Xiaoou Cheng,
Jonathan Weare
Abstract:
We quantify the efficiency of temporal difference (TD) learning over the direct, or Monte Carlo (MC), estimator for policy evaluation in reinforcement learning, with an emphasis on estimation of quantities related to rare events. Policy evaluation is complicated in the rare event setting by the long timescale of the event and by the need for \emph{relative accuracy} in estimates of very small values. Specifically, we focus on least-squares TD (LSTD) prediction for finite state Markov chains, and show that LSTD can achieve relative accuracy far more efficiently than MC. We prove a central limit theorem for the LSTD estimator and upper bound the \emph{relative asymptotic variance} by simple quantities characterizing the connectivity of states relative to the transition probabilities between them. Using this bound, we show that, even when both the timescale of the rare event and the relative accuracy of the MC estimator are exponentially large in the number of states, LSTD maintains a fixed level of relative accuracy with a total number of observed transitions of the Markov chain that is only \emph{polynomially} large in the number of states.
Submitted 27 May, 2024;
originally announced May 2024.
-
BAD-NEUS: Rapidly converging trajectory stratification
Authors:
John Strahan,
Chatipat Lorpaiboon,
Jonathan Weare,
Aaron R. Dinner
Abstract:
An issue for molecular dynamics simulations is that events of interest often involve timescales that are much longer than the simulation time step, which is set by the fastest timescales of the model. Because of this timescale separation, direct simulation of many events is prohibitively computationally costly. This issue can be overcome by aggregating information from many relatively short simulations that sample segments of trajectories involving events of interest. This is the strategy of Markov state models (MSMs) and related approaches, but such methods suffer from approximation error because the variables defining the states generally do not capture the dynamics fully. By contrast, once converged, the weighted ensemble (WE) method aggregates information from trajectory segments so as to yield unbiased estimates of both thermodynamic and kinetic statistics. Unfortunately, errors decay no faster than unbiased simulation in WE. Here we introduce a theoretical framework for describing WE that shows that introduction of an element of stratification, as in nonequilibrium umbrella sampling (NEUS), accelerates convergence. Then, building on ideas from MSMs and related methods, we propose an improved stratification that allows approximation error to be reduced systematically. We show that the improved stratification can decrease simulation times required to achieve a desired precision by orders of magnitude.
Submitted 30 April, 2024;
originally announced April 2024.
-
Using Explainable AI and Transfer Learning to understand and predict the maintenance of Atlantic blocking with limited observational data
Authors:
Huan Zhang,
Justin Finkel,
Dorian S. Abbot,
Edwin P. Gerber,
Jonathan Weare
Abstract:
Blocking events are an important cause of extreme weather, especially long-lasting blocking events that trap weather systems in place. The duration of blocking events is, however, underestimated in climate models. Explainable Artificial Intelligence comprises a class of data analysis methods that can help identify physical causes of prolonged blocking events and diagnose model deficiencies. We demonstrate this approach on an idealized quasigeostrophic model developed by Marshall and Molteni (1993). We train a convolutional neural network (CNN), and subsequently, build a sparse predictive model for the persistence of Atlantic blocking, conditioned on an initial high-pressure anomaly. Shapley Additive ExPlanation (SHAP) analysis reveals that high-pressure anomalies in the American Southeast and North Atlantic, separated by a trough over Atlantic Canada, contribute significantly to prediction of sustained blocking events in the Atlantic region. This agrees with previous work that identified precursors in the same regions via wave train analysis. When we apply the same CNN to blockings in the ERA5 atmospheric reanalysis, there is insufficient data to accurately predict persistent blocks. We partially overcome this limitation by pre-training the CNN on the plentiful data of the Marshall-Molteni model, and then using Transfer Learning to achieve better predictions than direct training. SHAP analysis before and after transfer learning allows a comparison between the predictive features in the reanalysis and the quasigeostrophic model, quantifying dynamical biases in the idealized model. This work demonstrates the potential for machine learning methods to extract meaningful precursors of extreme weather events and achieve better prediction using limited observational data.
Submitted 12 April, 2024;
originally announced April 2024.
-
AI can identify Solar System instability billions of years in advance
Authors:
Dorian S. Abbot,
J. D. Laurence-Chasen,
Robert J. Webber,
David M. Hernandez,
Jonathan Weare
Abstract:
Rare event schemes require an approximation of the probability of the rare event as a function of system state. Finding an appropriate reaction coordinate is typically the most challenging aspect of applying a rare event scheme. Here we develop an artificial intelligence (AI) based reaction coordinate that effectively predicts which of a limited number of simulations of the Solar System will go unstable using a convolutional neural network classifier. The performance of the algorithm does not degrade significantly even 3.5 billion years before the instability. We overcome the class imbalance intrinsic to rare event problems using a combination of minority class oversampling, increased minority class weighting, and pulling multiple non-overlapping training sequences from simulations. Our success suggests that AI may provide a promising avenue for developing reaction coordinates without detailed theoretical knowledge of the system.
Submitted 15 January, 2024;
originally announced January 2024.
-
Accurate estimates of dynamical statistics using memory
Authors:
Chatipat Lorpaiboon,
Spencer C. Guo,
John Strahan,
Jonathan Weare,
Aaron R. Dinner
Abstract:
Many chemical reactions and molecular processes occur on timescales that are significantly longer than those accessible by direct simulation. One successful approach to estimating dynamical statistics for such processes is to use many short time series observations of the system to construct a Markov state model (MSM), which approximates the dynamics of the system as memoryless transitions between a set of discrete states. The dynamical Galerkin approximation (DGA) generalizes MSMs for the problem of calculating dynamical statistics, such as committors and mean first passage times, by replacing the set of discrete states with a projection onto a basis. Because the projected dynamics are generally not memoryless, the Markov approximation can result in significant systematic error. Inspired by quasi-Markov state models, which employ the generalized master equation to encode memory resulting from the projection, we reformulate DGA to account for memory and analyze its performance on two systems: a two-dimensional triple well and helix-to-helix transitions of the AIB$_9$ peptide. We demonstrate that our method is robust to the choice of basis and can decrease the time series length required to obtain accurate kinetics by an order of magnitude.
Submitted 13 November, 2023;
originally announced November 2023.
-
Improved Active Learning via Dependent Leverage Score Sampling
Authors:
Atsushi Shimizu,
Xiaoou Cheng,
Christopher Musco,
Jonathan Weare
Abstract:
We show how to obtain improved active learning methods in the agnostic (adversarial noise) setting by combining marginal leverage score sampling with non-independent sampling strategies that promote spatial coverage. In particular, we propose an easily implemented method based on the \emph{pivotal sampling algorithm}, which we test on problems motivated by learning-based methods for parametric PDEs and uncertainty quantification. In comparison to independent sampling, our method reduces the number of samples needed to reach a given target accuracy by up to $50\%$. We support our findings with two theoretical results. First, we show that any non-independent leverage score sampling method that obeys a weak \emph{one-sided $\ell_{\infty}$ independence condition} (which includes pivotal sampling) can actively learn $d$ dimensional linear functions with $O(d\log d)$ samples, matching independent sampling. This result extends recent work on matrix Chernoff bounds under $\ell_{\infty}$ independence, and may be of interest for analyzing other sampling strategies beyond pivotal sampling. Second, we show that, for the important case of polynomial regression, our pivotal method obtains an improved bound of $O(d)$ samples.
Submitted 4 May, 2024; v1 submitted 7 October, 2023;
originally announced October 2023.
-
Randomly sparsified Richardson iteration is really fast
Authors:
Jonathan Weare,
Robert J. Webber
Abstract:
Recently, a class of algorithms combining classical fixed point iterations with repeated random sparsification of approximate solution vectors has been successfully applied to eigenproblems with matrices as large as $10^{108} \times 10^{108}$. So far, a complete mathematical explanation for their success has proven elusive. Additionally, the methods have not been extended to linear system solves.
In this paper we propose a new scheme based on repeated random sparsification that is capable of solving linear systems in extremely high dimensions. We provide a complete mathematical analysis of this new algorithm. Our analysis establishes a faster-than-Monte Carlo convergence rate and justifies use of the scheme even when the solution vector itself is too large to store.
Submitted 17 November, 2023; v1 submitted 29 September, 2023;
originally announced September 2023.
-
Mercury's chaotic secular evolution as a subdiffusive process
Authors:
Dorian S. Abbot,
Robert J. Webber,
David M. Hernandez,
Sam Hadden,
Jonathan Weare
Abstract:
Mercury's orbit can destabilize, generally resulting in a collision with either Venus or the Sun. Chaotic evolution can cause g1 to decrease to the approximately constant value of g5 and create a resonance. Previous work has approximated the variation in g1 as stochastic diffusion, which leads to a phenomenological model that can reproduce the Mercury instability statistics of secular and N-body models on timescales longer than 10 Gyr. Here we show that the diffusive model underpredicts the Mercury instability probability by a factor of 3-10,000 on timescales less than 5 Gyr, the remaining lifespan of the Solar System. This is because g1 exhibits larger variations on short timescales than the diffusive model would suggest. To better model the variations on short timescales, we build a new subdiffusive phenomenological model for g1. Subdiffusion is similar to diffusion but exhibits larger displacements on short timescales and smaller displacements on long timescales. We choose model parameters based on the behavior of the g1 trajectories in the N-body simulations, leading to a tuned model that can reproduce Mercury instability statistics from 1-40 Gyr. This work motivates fundamental questions in Solar System dynamics: Why does subdiffusion better approximate the variation in g1 than standard diffusion? Why is there an upper bound on g1, but not a lower bound that would prevent it from reaching g5?
Submitted 12 April, 2024; v1 submitted 20 June, 2023;
originally announced June 2023.
-
Inexact iterative numerical linear algebra for neural network-based spectral estimation and rare-event prediction
Authors:
John Strahan,
Spencer C. Guo,
Chatipat Lorpaiboon,
Aaron R. Dinner,
Jonathan Weare
Abstract:
Understanding dynamics in complex systems is challenging because there are many degrees of freedom, and those that are most important for describing events of interest are often not obvious. The leading eigenfunctions of the transition operator are useful for visualization, and they can provide an efficient basis for computing statistics such as the likelihood and average time of events (predictions). Here we develop inexact iterative linear algebra methods for computing these eigenfunctions (spectral estimation) and making predictions from a data set of short trajectories sampled at finite intervals. We demonstrate the methods on a low-dimensional model that facilitates visualization and a high-dimensional model of a biomolecular system. Implications for the prediction problem in reinforcement learning are discussed.
Submitted 20 July, 2023; v1 submitted 22 March, 2023;
originally announced March 2023.
-
Simple physics and integrators accurately reproduce Mercury instability statistics
Authors:
Dorian S. Abbot,
David M. Hernandez,
Sam Hadden,
Robert J. Webber,
Georgios P. Afentakis,
Jonathan Weare
Abstract:
The long-term stability of the Solar System is an issue of significant scientific and philosophical interest. The mechanism leading to instability is Mercury's eccentricity being pumped up so high that Mercury either collides with Venus or is scattered into the Sun. Previously, only three five-billion-year $N$-body ensembles of the Solar System with thousands of simulations have been run to assess long-term stability. We generate two additional ensembles, each with 2750 members, and make them publicly available at \texttt{https://archive.org/details/@dorianabbot}. We find that accurate Mercury instability statistics can be obtained by (1) including only the Sun and the 8 planets, (2) using a simple Wisdom-Holman scheme without correctors, (3) using a basic representation of general relativity, and (4) using a time step of 3.16 days. By combining our Solar System ensembles with previous ensembles we form a 9,601-member ensemble of ensembles. In this ensemble of ensembles, the logarithm of the frequency of a Mercury instability event increases linearly with time between 1.3 and 5 Gyr, suggesting that a single mechanism is responsible for Mercury instabilities in this time range and that this mechanism becomes more active as time progresses. Our work provides a robust estimate of Mercury instability statistics over the next five billion years, outlines methodologies that may be useful for exoplanet system investigations, and provides two large ensembles of publicly available Solar System integrations that can serve as testbeds for theoretical ideas as well as training sets for artificial intelligence schemes.
Submitted 21 February, 2023; v1 submitted 30 December, 2022;
originally announced December 2022.
-
Understanding and eliminating spurious modes in variational Monte Carlo using collective variables
Authors:
Huan Zhang,
Robert J. Webber,
Michael Lindsey,
Timothy C. Berkelbach,
Jonathan Weare
Abstract:
The use of neural network parametrizations to represent the ground state in variational Monte Carlo (VMC) calculations has generated intense interest in recent years. However, as we demonstrate in the context of the periodic Heisenberg spin chain, this approach can produce unreliable wave function approximations. One of the most obvious signs of failure is the occurrence of random, persistent spikes in the energy estimate during training. These energy spikes are caused by regions of configuration space that are over-represented by the wave function density, which are called ``spurious modes'' in the machine learning literature. After exploring these spurious modes in detail, we demonstrate that a collective-variable-based penalization yields a substantially more robust training procedure, preventing the formation of spurious modes and improving the accuracy of energy estimates. Because the penalization scheme is cheap to implement and is not specific to the particular model studied here, it can be extended to other applications of VMC where a reasonable choice of collective variable is available.
Submitted 11 November, 2022;
originally announced November 2022.
-
Predicting rare events using neural networks and short-trajectory data
Authors:
John Strahan,
Justin Finkel,
Aaron R. Dinner,
Jonathan Weare
Abstract:
Estimating the likelihood, timing, and nature of events is a major goal of modeling stochastic dynamical systems. When the event is rare in comparison with the timescales of simulation and/or measurement needed to resolve the elemental dynamics, accurate prediction from direct observations becomes challenging. In such cases a more effective approach is to cast statistics of interest as solutions to Feynman-Kac equations (partial differential equations). Here, we develop an approach to solve Feynman-Kac equations by training neural networks on short-trajectory data. Our approach is based on a Markov approximation but otherwise avoids assumptions about the underlying model and dynamics. This makes it applicable to treating complex computational models and observational data. We illustrate the advantages of our method using a low-dimensional model that facilitates visualization, and this analysis motivates an adaptive sampling strategy that allows on-the-fly identification of and addition of data to regions important for predicting the statistics of interest. Finally, we demonstrate that we can compute accurate statistics for a 75-dimensional model of sudden stratospheric warming. This system provides a stringent test bed for our method.
Submitted 2 March, 2023; v1 submitted 2 August, 2022;
originally announced August 2022.
-
Computing transition path theory quantities with trajectory stratification
Authors:
Bodhi P. Vani,
Jonathan Weare,
Aaron R. Dinner
Abstract:
Transition path theory computes statistics from ensembles of reactive trajectories. A common strategy for sampling reactive trajectories is to control the branching and pruning of trajectories so as to enhance the sampling of low probability segments. However, it can be challenging to apply transition path theory to data from such methods because determining whether configurations and trajectory segments are part of reactive trajectories requires looking backward and forward in time. Here, we show how this issue can be overcome efficiently by introducing simple data structures. We illustrate the approach in the context of nonequilibrium umbrella sampling (NEUS), but the strategy is general and can be used to obtain transition path theory statistics from other methods that sample segments of unbiased trajectories.
Submitted 22 June, 2022;
originally announced June 2022.
-
Revealing the statistics of extreme events hidden in short weather forecast data
Authors:
Justin Finkel,
Edwin P. Gerber,
Dorian S. Abbot,
Jonathan Weare
Abstract:
Extreme weather events have significant consequences, dominating the impact of climate on society. While high-resolution weather models can forecast many types of extreme events on synoptic timescales, long-term climatological risk assessment is an altogether different problem. A once-in-a-century event takes, on average, 100 years of simulation time to appear just once, far beyond the typical integration length of a weather forecast model. Therefore, this task is left to cheaper, but less accurate, low-resolution or statistical models. But there is untapped potential in weather model output: despite being short in duration, weather forecast ensembles are produced multiple times a week. Integrations are launched with independent perturbations, causing them to spread apart over time and broadly sample phase space. Collectively, these integrations add up to thousands of years of data. We establish methods to extract climatological information from these short weather simulations. Using ensemble hindcasts by the European Center for Medium-range Weather Forecasting (ECMWF) archived in the subseasonal-to-seasonal (S2S) database, we characterize sudden stratospheric warming (SSW) events with multi-centennial return times. Consistent results are found between alternative methods, including basic counting strategies and Markov state modeling. By carefully combining trajectories together, we obtain estimates of SSW frequencies and their seasonal distributions that are consistent with reanalysis-derived estimates for moderately rare events, but with much tighter uncertainty bounds, and which can be extended to events of unprecedented severity that have not yet been observed historically. These methods hold potential for assessing extreme events throughout the climate system, beyond this example of stratospheric extremes.
Submitted 23 January, 2023; v1 submitted 10 June, 2022;
originally announced June 2022.
-
Augmented Transition Path Theory for Sequences of Events
Authors:
Chatipat Lorpaiboon,
Jonathan Weare,
Aaron R. Dinner
Abstract:
Transition path theory provides a statistical description of the dynamics of a reaction in terms of local spatial quantities. In its original formulation, it is limited to reactions that consist of trajectories flowing from a reactant set A to a product set B. We extend the basic concepts and principles of transition path theory to reactions in which trajectories exhibit a specified sequence of events and illustrate the utility of this generalization on examples.
Submitted 29 July, 2022; v1 submitted 10 May, 2022;
originally announced May 2022.
-
Full Configuration Interaction Excited-State Energies in Large Active Spaces from Subspace Iteration with Repeated Random Sparsification
Authors:
Samuel M. Greene,
Robert J. Webber,
James E. T. Smith,
Jonathan Weare,
Timothy C. Berkelbach
Abstract:
We present a stable and systematically improvable quantum Monte Carlo (QMC) approach to calculating excited-state energies, which we implement using our fast randomized iteration method for the full configuration interaction problem (FCI-FRI). Unlike previous excited-state quantum Monte Carlo methods, our approach, which is an asymmetric variant of subspace iteration, avoids the use of dot products of random vectors and instead relies upon trial vectors to maintain orthogonality and estimate eigenvalues. By leveraging recent advances, we apply our method to calculate ground- and excited-state energies of strongly correlated molecular systems in large active spaces, including the carbon dimer with 8 electrons in 108 orbitals (8e,108o), an oxo-Mn(salen) transition metal complex (28e,28o), ozone (18e,87o), and butadiene (22e,82o). In the majority of these test cases, our approach yields total excited-state energies that agree with those from state-of-the-art methods -- including heat-bath CI, the density matrix renormalization group approach, and FCIQMC -- to within sub-milliHartree accuracy. In all cases, estimated excitation energies agree to within about 0.1 eV.
Submitted 12 October, 2022; v1 submitted 28 January, 2022;
originally announced January 2022.
-
Data-driven transition path analysis yields a statistical understanding of sudden stratospheric warming events in an idealized model
Authors:
Justin Finkel,
Robert J. Webber,
Edwin P. Gerber,
Dorian S. Abbot,
Jonathan Weare
Abstract:
Atmospheric regime transitions are highly impactful as drivers of extreme weather events, but pose two formidable modeling challenges: predicting the next event (weather forecasting), and characterizing the statistics of events of a given severity (the risk climatology). Each event has a different duration and spatial structure, making it hard to define an objective "average event." We argue here that transition path theory (TPT), a stochastic process framework, is an appropriate tool for the task. We demonstrate TPT's capacities on a wave-mean flow model of sudden stratospheric warmings (SSWs) developed by Holton and Mass (1976), which is idealized enough for transparent TPT analysis but complex enough to demonstrate computational scalability. Whereas a recent article (Finkel et al. 2021) studied near-term SSW predictability, the present article uses TPT to link predictability to long-term SSW frequency. This requires not only forecasting forward in time from an initial condition, but also \emph{backward in time} to assess the probability of the initial conditions themselves. TPT enables one to condition the dynamics on the regime transition occurring, and thus visualize its physical drivers with a vector field called the \emph{reactive current}. The reactive current shows that before an SSW, dissipation and stochastic forcing drive a slow decay of vortex strength at lower altitudes. The response of upper-level winds is late and sudden, occurring only after the transition is almost complete from a probabilistic point of view. This case study demonstrates that TPT quantities, visualized in a space of physically meaningful variables, can help one understand the dynamics of regime transitions.
Submitted 19 October, 2022; v1 submitted 28 August, 2021;
originally announced August 2021.
-
Rare Event Sampling Improves Mercury Instability Statistics
Authors:
Dorian S. Abbot,
Robert J. Webber,
Sam Hadden,
Darryl Seligman,
Jonathan Weare
Abstract:
Due to the chaotic nature of planetary dynamics, there is a non-zero probability that Mercury's orbit will become unstable in the future. Previous efforts have estimated the probability of this happening between 3 and 5 billion years in the future using a large number of direct numerical simulations with an N-body code, but were not able to obtain accurate estimates before 3 billion years in the future because Mercury instability events are too rare. In this paper we use a new rare event sampling technique, Quantile Diffusion Monte Carlo (QDMC), to estimate that the probability of a Mercury instability event in the next 2 billion years is approximately $10^{-4}$ in the REBOUND N-body code. We show that QDMC provides unbiased probability estimates at a computational cost of up to 100 times less than direct numerical simulation. QDMC is easy to implement and could be applied to many problems in planetary dynamics in which it is necessary to estimate the probability of a rare event.
Submitted 27 December, 2021; v1 submitted 16 June, 2021;
originally announced June 2021.
-
Ensemble Markov chain Monte Carlo with teleporting walkers
Authors:
Michael Lindsey,
Jonathan Weare,
Anna Zhang
Abstract:
We introduce an ensemble Markov chain Monte Carlo approach to sampling from a probability density with known likelihood. This method upgrades an underlying Markov chain by allowing an ensemble of such chains to interact via a process in which one chain's state is cloned as another's is deleted. This effective teleportation of states can overcome issues of metastability in the underlying chain, as the scheme enjoys rapid mixing once the modes of the target density have been populated. We derive a mean-field limit for the evolution of the ensemble. We analyze the global and local convergence of this mean-field limit, showing asymptotic convergence independent of the spectral gap of the underlying Markov chain, and moreover we interpret the limiting evolution as a gradient flow. We explain how interaction can be applied selectively to a subset of state variables in order to maintain advantage on very high-dimensional problems. Finally we present the application of our methodology to Bayesian hyperparameter estimation for Gaussian process regression.
Submitted 4 June, 2021;
originally announced June 2021.
-
Approximating matrix eigenvalues by subspace iteration with repeated random sparsification
Authors:
Samuel M. Greene,
Robert J. Webber,
Timothy C. Berkelbach,
Jonathan Weare
Abstract:
Traditional numerical methods for calculating matrix eigenvalues are prohibitively expensive for high-dimensional problems. Iterative random sparsification methods allow for the estimation of a single dominant eigenvalue at reduced cost by leveraging repeated random sampling and averaging. We present a general approach to extending such methods for the estimation of multiple eigenvalues and demonstrate its performance for several benchmark problems in quantum chemistry.
Submitted 2 March, 2022; v1 submitted 22 March, 2021;
originally announced March 2021.
-
Learning forecasts of rare stratospheric transitions from short simulations
Authors:
Justin Finkel,
Robert J. Webber,
Dorian S. Abbot,
Edwin P. Gerber,
Jonathan Weare
Abstract:
Rare events arising in nonlinear atmospheric dynamics remain hard to predict and attribute. We address the problem of forecasting rare events in a prototypical example, Sudden Stratospheric Warmings (SSWs). Approximately once every other winter, the boreal stratospheric polar vortex rapidly breaks down, shifting midlatitude surface weather patterns for months. We focus on two key quantities of interest: the probability of an SSW occurring, and the expected lead time if it does occur, as functions of initial condition. These \emph{optimal forecasts} concretely measure the event's progress. Direct numerical simulation can estimate them in principle, but is prohibitively expensive in practice: each rare event requires a long integration to observe, and the cost of each integration grows with model complexity. We describe an alternative approach using integrations that are \emph{short} compared to the timescale of the warming event. We compute the probability and lead time efficiently by solving equations involving the transition operator, which encodes all information about the dynamics. We relate these optimal forecasts to a small number of interpretable physical variables, suggesting optimal measurements for forecasting. We illustrate the methodology on a prototype SSW model developed by Holton and Mass (1976) and modified by stochastic forcing. While highly idealized, this model captures the essential nonlinear dynamics of SSWs and exhibits the key forecasting challenge: the dramatic separation in timescales between a single event and the return time between successive events. Our methodology is designed to fully exploit high-dimensional data from models and observations, and has the potential to identify detailed predictors of many complex rare events in meteorology.
Submitted 28 August, 2021; v1 submitted 15 February, 2021;
originally announced February 2021.
-
Long-timescale predictions from short-trajectory data: A benchmark analysis of the trp-cage miniprotein
Authors:
John Strahan,
Adam Antoszewski,
Chatipat Lorpaiboon,
Bodhi P. Vani,
Jonathan Weare,
Aaron R. Dinner
Abstract:
Elucidating physical mechanisms with statistical confidence from molecular dynamics simulations can be challenging owing to the many degrees of freedom that contribute to collective motions. To address this issue, we recently introduced a dynamical Galerkin approximation (DGA) [Thiede et al. J. Chem. Phys. 150, 244111 (2019)], in which chemical kinetic statistics that satisfy equations of dynamical operators are represented by a basis expansion. Here, we reformulate this approach, clarifying (and reducing) the dependence on the choice of lag time. We present a new projection of the reactive current onto collective variables and provide improved estimators for rates and committors. We also present simple procedures for constructing suitable smoothly varying basis functions from arbitrary molecular features. To evaluate estimators and basis sets numerically, we generate and carefully validate a dataset of short trajectories for the unfolding and folding of the trp-cage miniprotein, a well-studied system. Our analysis demonstrates a comprehensive strategy for characterizing reaction pathways quantitatively.
Submitted 8 September, 2020;
originally announced September 2020.
-
Integrated VAC: A robust strategy for identifying eigenfunctions of dynamical operators
Authors:
Chatipat Lorpaiboon,
Erik Henning Thiede,
Robert J. Webber,
Jonathan Weare,
Aaron R. Dinner
Abstract:
One approach to analyzing the dynamics of a physical system is to search for long-lived patterns in its motions. This approach has been particularly successful for molecular dynamics data, where slowly decorrelating patterns can indicate large-scale conformational changes. Detecting such patterns is the central objective of the variational approach to conformational dynamics (VAC), as well as the related methods of time-lagged independent component analysis and Markov state modeling. In VAC, the search for slowly decorrelating patterns is formalized as a variational problem solved by the eigenfunctions of the system's transition operator. VAC computes solutions to this variational problem by optimizing a linear or nonlinear model of the eigenfunctions using time series data. Here, we build on VAC's success by addressing two practical limitations. First, VAC can give poor eigenfunction estimates when the lag time parameter is chosen poorly. Second, VAC can overfit when using flexible parameterizations such as artificial neural networks with insufficient regularization. To address these issues, we propose an extension that we call integrated VAC (IVAC). IVAC integrates over multiple lag times before solving the variational problem, making its results more robust and reproducible than VAC's.
Submitted 9 September, 2020; v1 submitted 15 July, 2020;
originally announced July 2020.
-
A metric on directed graphs and Markov chains based on hitting probabilities
Authors:
Zachary M. Boyd,
Nicolas Fraiman,
Jeremy L. Marzuola,
Peter J. Mucha,
Braxton Osting,
Jonathan Weare
Abstract:
The shortest-path, commute time, and diffusion distances on undirected graphs have been widely employed in applications such as dimensionality reduction, link prediction, and trip planning. Increasingly, there is interest in using asymmetric structure of data derived from Markov chains and directed graphs, but few metrics are specifically adapted to this task. We introduce a metric on the state space of any ergodic, finite-state, time-homogeneous Markov chain and, in particular, on any Markov chain derived from a directed graph. Our construction is based on hitting probabilities, with nearness in the metric space related to the transfer of random walkers from one node to another at stationarity. Notably, our metric is insensitive to shortest and average walk distances, thus giving new information compared to existing metrics. We use possible degeneracies in the metric to develop an interesting structural theory of directed graphs and explore a related quotienting procedure. Our metric can be computed in $O(n^3)$ time, where $n$ is the number of states, and in examples we scale up to $n=10,000$ nodes and $\approx 38M$ edges on a desktop computer. In several examples, we explore the nature of the metric, compare it to alternative methods, and demonstrate its utility for weak recovery of community structure in dense graphs, visualization, structure recovering, dynamics exploration, and multiscale cluster detection.
Submitted 18 January, 2021; v1 submitted 25 June, 2020;
originally announced June 2020.
-
Path properties of atmospheric transitions: illustration with a low-order sudden stratospheric warming model
Authors:
Justin Finkel,
Dorian Abbot,
Jonathan Weare
Abstract:
Many rare weather events, including hurricanes, droughts, and floods, dramatically impact human life. To accurately forecast these events and characterize their climatology requires specialized mathematical techniques to fully leverage the limited data that are available. Here we describe \emph{transition path theory} (TPT), a framework originally developed for molecular simulation, and argue that it is a useful paradigm for developing mechanistic understanding of rare climate events. TPT provides a method to calculate statistical properties of the paths into the event. As an initial demonstration of the utility of TPT, we analyze a low-order model of sudden stratospheric warming (SSW), a dramatic disturbance to the polar vortex which can induce extreme cold spells at the surface in the midlatitudes. SSW events pose a major challenge for seasonal weather prediction because of their rapid, complex onset and development. Climate models struggle to capture the long-term statistics of SSW, owing to their diversity and intermittent nature. We use a stochastically forced Holton-Mass-type model with two stable states, corresponding to radiative equilibrium and a vacillating SSW-like regime. In this stochastic bistable setting, from certain probabilistic forecasts TPT facilitates estimation of dominant transition pathways and return times of transitions. These "dynamical statistics" are obtained by solving partial differential equations in the model's phase space. With future application to more complex models, TPT and its constituent quantities promise to improve the predictability of extreme weather events, through both generation and principled evaluation of forecasts.
Submitted 26 May, 2020;
originally announced May 2020.
-
Error bounds for dynamical spectral estimation
Authors:
Robert J. Webber,
Erik H. Thiede,
Douglas Dow,
Aaron R. Dinner,
Jonathan Weare
Abstract:
Dynamical spectral estimation is a well-established numerical approach for estimating eigenvalues and eigenfunctions of the Markov transition operator from trajectory data. Although the approach has been widely applied in biomolecular simulations, its error properties remain poorly understood. Here we analyze the error of a dynamical spectral estimation method called "the variational approach to conformational dynamics" (VAC). We bound the approximation error and estimation error for VAC estimates. Our analysis establishes VAC's convergence properties and suggests new strategies for tuning VAC to improve accuracy.
Submitted 24 September, 2020; v1 submitted 5 May, 2020;
originally announced May 2020.
-
Improved Fast Randomized Iteration Approach to Full Configuration Interaction
Authors:
Samuel M. Greene,
Robert J. Webber,
Jonathan Weare,
Timothy C. Berkelbach
Abstract:
We present three modifications to our recently introduced fast randomized iteration method for full configuration interaction (FCI-FRI) and investigate their effects on the method's performance for Ne, H$_2$O, and N$_2$. The initiator approximation, originally developed for full configuration interaction quantum Monte Carlo, significantly reduces statistical error in FCI-FRI when few samples are used in compression operations, enabling its application to larger chemical systems. The semi-stochastic extension, which involves exactly preserving a fixed subset of elements in each compression, improves statistical efficiency in some cases but reduces it in others. We also developed a new approach to sampling excitations that yields consistent improvements in statistical efficiency and reductions in computational cost. We discuss possible strategies based on our findings for improving the performance of stochastic quantum chemistry methods more generally.
Submitted 20 July, 2020; v1 submitted 1 May, 2020;
originally announced May 2020.
-
NWChem: Past, Present, and Future
Authors:
E. Aprà,
E. J. Bylaska,
W. A. de Jong,
N. Govind,
K. Kowalski,
T. P. Straatsma,
M. Valiev,
H. J. J. van Dam,
Y. Alexeev,
J. Anchell,
V. Anisimov,
F. W. Aquino,
R. Atta-Fynn,
J. Autschbach,
N. P. Bauman,
J. C. Becca,
D. E. Bernholdt,
K. Bhaskaran-Nair,
S. Bogatko,
P. Borowski,
J. Boschen,
J. Brabec,
A. Bruner,
E. Cauët,
Y. Chen
, et al. (89 additional authors not shown)
Abstract:
Specialized computational chemistry packages have permanently reshaped the landscape of chemical and materials science by providing tools to support and guide experimental efforts and for the prediction of atomistic and electronic properties. In this regard, electronic structure packages have played a special role by using first-principle-driven methodologies to model complex chemical and materials processes. Over the last few decades, the rapid development of computing technologies and the tremendous increase in computational power have offered a unique chance to study complex transformations using sophisticated and predictive many-body techniques that describe correlated behavior of electrons in molecular and condensed phase systems at different levels of theory. In enabling these simulations, novel parallel algorithms have been able to take advantage of computational resources to address the polynomial scaling of electronic structure methods. In this paper, we briefly review the NWChem computational chemistry suite, including its history, design principles, parallel tools, current capabilities, outreach and outlook.
Submitted 26 May, 2020; v1 submitted 24 April, 2020;
originally announced April 2020.
-
A Kinetic Monte Carlo Approach for Simulating Cascading Transmission Line Failure
Authors:
Jacob Roth,
David A. Barajas-Solano,
Panos Stinis,
Jonathan Weare,
Mihai Anitescu
Abstract:
In this work, cascading transmission line failures are studied through a dynamical model of the power system operating under fixed conditions. The power grid is modeled as a stochastic dynamical system where first-principles electromechanical dynamics are excited by small Gaussian disturbances in demand and generation around a specified operating point. In this context, a single line failure is interpreted in a large deviation context as a first escape event across a surface in phase space defined by line security constraints. The resulting system of stochastic differential equations admits a transverse decomposition of the drift, which leads to considerable simplification in evaluating the quasipotential (rate function) and, consequently, computation of exit rates. Tractable expressions for the rate of transmission line failure in a restricted network are derived from large deviation theory arguments and validated against numerical simulations. Extensions to realistic settings are considered, and individual line failure models are aggregated into a Markov model of cascading failure inspired by chemical kinetics. Cascades are generated by traversing a graph composed of weighted edges representing transitions to degraded network topologies. Numerical results indicate that the Markov model can produce cascades with qualitative power-law properties similar to those observed in empirical cascades.
Submitted 15 December, 2019;
originally announced December 2019.
-
Beyond Walkers in Stochastic Quantum Chemistry: Reducing Error using Fast Randomized Iteration
Authors:
Samuel M. Greene,
Robert J. Webber,
Jonathan Weare,
Timothy C. Berkelbach
Abstract:
We introduce a family of methods for the full configuration interaction problem in quantum chemistry, based on the fast randomized iteration (FRI) framework [L.-H. Lim and J. Weare, SIAM Rev. 59, 547 (2017)]. These methods, which we term "FCI-FRI," stochastically impose sparsity during iterations of the power method and can be viewed as a generalization of full configuration interaction quantum Monte Carlo (FCIQMC) without walkers. In addition to the multinomial scheme commonly used to sample excitations in FCIQMC, we present a systematic scheme where excitations are not sampled independently. Performing ground-state calculations on five small molecules at fixed cost, we find that the systematic FCI-FRI scheme is 11 to 45 times more statistically efficient than the multinomial FCI-FRI scheme, which is in turn 1.4 to 178 times more statistically efficient than the original FCIQMC algorithm.
Submitted 9 July, 2019; v1 submitted 2 May, 2019;
originally announced May 2019.
-
Maximizing simulated tropical cyclone intensity with action minimization
Authors:
David A. Plotkin,
Robert J. Webber,
Morgan E O'Neill,
Jonathan Weare,
Dorian S. Abbot
Abstract:
Direct computer simulation of intense tropical cyclones (TCs) in weather models is limited by computational expense. Intense TCs are rare and have small-scale structures, making it difficult to produce large ensembles of storms at high resolution. Further, models often fail to capture the process of rapid intensification, which is a distinguishing feature of many intense TCs. Understanding rapid intensification is especially important in the context of global warming, which may increase the frequency of intense TCs. To better leverage computational resources for the study of rapid intensification, we introduce an action minimization algorithm applied to the WRF and WRFPLUS models. Action minimization nudges the model into forming more intense TCs than it otherwise would; it does so via the maximum likelihood path in a stochastic formulation of the model, thereby allowing targeted study of intensification mechanisms.
We apply action minimization to simulations of Hurricanes Danny (2015) and Fred (2009) at 6 km resolution to demonstrate that the algorithm consistently intensifies TCs via physically plausible pathways. We show an approximately ten-fold computational savings using action minimization to study the tail of the TC intensification distribution. Further, for Hurricanes Danny and Fred, action minimization produces perturbations that preferentially reduce low-level shear as compared to upper-level shear, at least above a threshold of approximately $4 \mathrm{\ m \ s^{-1}}$. We also demonstrate that asymmetric, time-dependent patterns of heating can cause significant TC intensification beyond symmetric, azimuthally-averaged heating and find a regime of non-linear response to asymmetric heating that has not been extensively studied in previous work.
Submitted 1 May, 2019;
originally announced May 2019.
-
Practical rare event sampling for extreme mesoscale weather
Authors:
Robert J. Webber,
David A. Plotkin,
Morgan E O'Neill,
Dorian S. Abbot,
Jonathan Weare
Abstract:
Extreme mesoscale weather, including tropical cyclones, squall lines, and floods, can be enormously damaging and yet challenging to simulate; hence, there is a pressing need for more efficient simulation strategies. Here we present a new rare event sampling algorithm called Quantile Diffusion Monte Carlo (Quantile DMC). Quantile DMC is a simple-to-use algorithm that can sample extreme tail behavior for a wide class of processes. We demonstrate the advantages of Quantile DMC compared to other sampling methods and discuss practical aspects of implementing Quantile DMC. To test the feasibility of Quantile DMC for extreme mesoscale weather, we sample extremely intense realizations of two historical tropical cyclones, 2010 Hurricane Earl and 2015 Hurricane Joaquin. Our results demonstrate Quantile DMC's potential to provide low-variance extreme weather statistics while highlighting the work that is necessary for Quantile DMC to attain greater efficiency in future applications.
Submitted 6 April, 2019;
originally announced April 2019.
-
Symmetry Breaking in Density Functional Theory due to Dirac Exchange for a Hydrogen Molecule
Authors:
Michael Holst,
Houdong Hu,
Jianfeng Lu,
Jeremy L. Marzuola,
Duo Song,
John Weare
Abstract:
We study symmetry breaking in the mean field solutions to the 2 electron hydrogen molecule within Kohn Sham (KS) local spin density functional theory with Dirac exchange (the XLDA model). This simplified model shows behavior related to that of the (KS) spin density functional theory (SDFT) predictions in condensed and molecular systems. The Kohn Sham solutions to the constrained SDFT variation problem undergo spontaneous symmetry breaking as the relative strength of the non-convex exchange term increases. This results in the change of the molecular ground state from a paramagnetic state to an antiferromagnetic ground state and a stationary symmetric delocalized 1st excited state. We further characterize the limiting behavior of the minimizer when the strength of the exchange term goes to infinity. This leads to further bifurcations and highly localized states with varying character. The stability of the various solution classes is demonstrated by Hessian analysis. Finite element numerical results provide support for the formal conjectures.
Submitted 22 February, 2021; v1 submitted 9 February, 2019;
originally announced February 2019.
-
Galerkin Approximation of Dynamical Quantities using Trajectory Data
Authors:
Erik H. Thiede,
Dimitrios Giannakis,
Aaron R. Dinner,
Jonathan Weare
Abstract:
Understanding chemical mechanisms requires estimating dynamical statistics such as expected hitting times, reaction rates, and committors. Here, we present a general framework for calculating these dynamical quantities by approximating boundary value problems using dynamical operators with a Galerkin expansion. A specific choice of basis set in the expansion corresponds to estimation of dynamical quantities using a Markov state model. More generally, the boundary conditions impose restrictions on the choice of basis sets. We demonstrate how an alternative basis can be constructed using ideas from diffusion maps. In our numerical experiments, this basis gives results of comparable or better accuracy to Markov state models. Additionally, we show that delay embedding can reduce the information lost when projecting the system's dynamics for model construction; this improves estimates of dynamical statistics considerably over the standard practice of increasing the lag time.
Submitted 26 February, 2019; v1 submitted 3 October, 2018;
originally announced October 2018.
-
Simulating the stochastic dynamics and cascade failure of power networks
Authors:
Charles Matthews,
Bradly Stadie,
Jonathan Weare,
Mihai Anitescu,
Christopher Demarco
Abstract:
For large-scale power networks, the failure of particular transmission lines can offload power to other lines and cause self-protection trips to activate, instigating a cascade of line failures. In extreme cases, this can bring down the entire network. Learning where the vulnerabilities are and the expected timescales for which failures are likely is an active area of research. In this article we present a novel stochastic dynamics model for a large-scale power network along with a framework for efficient computer simulation of the model including long timescale events such as cascade failure. We build on an existing Hamiltonian formulation and introduce stochastic forcing and damping components to simulate small perturbations to the network. Our model and simulation framework allow assessment of the particular weaknesses in a power network that make it susceptible to cascade failure, along with the timescales and mechanism for expected failures.
Submitted 6 June, 2018;
originally announced June 2018.
-
Langevin Markov Chain Monte Carlo with stochastic gradients
Authors:
Charles Matthews,
Jonathan Weare
Abstract:
Monte Carlo sampling techniques have broad applications in machine learning, Bayesian posterior inference, and parameter estimation. Often the target distribution takes the form of a product distribution over a dataset with a large number of entries. For sampling schemes utilizing gradient information it is cheaper for the derivative to be approximated using a random small subset of the data, introducing extra noise into the system. We present a new discretization scheme for underdamped Langevin dynamics when utilizing a stochastic (noisy) gradient. This scheme is shown to bias computed averages to second order in the stepsize while giving exact results in the special case of sampling a Gaussian distribution with a normally distributed stochastic gradient.
Submitted 17 September, 2019; v1 submitted 22 May, 2018;
originally announced May 2018.
-
Umbrella sampling: a powerful method to sample tails of distributions
Authors:
Charles Matthews,
Jonathan Weare,
Andrey Kravtsov,
Elise Jennings
Abstract:
We present the umbrella sampling (US) technique and show that it can be used to sample extremely low probability areas of the posterior distribution that may be required in statistical analyses of data. In this approach sampling of the target likelihood is split into sampling of multiple biased likelihoods confined within individual umbrella windows. We show that the US algorithm is efficient and highly parallel and that it can be easily used with other existing MCMC samplers. The method allows the user to capitalize on their intuition and define umbrella windows and increase sampling accuracy along specific directions in the parameter space. Alternatively, one can define umbrella windows using an approach similar to parallel tempering. We provide a public code that implements umbrella sampling as a standalone python package. We present a number of tests illustrating the power of the US method in sampling low probability areas of the posterior and show that this ability allows a considerably more robust sampling of multi-modal distributions compared to the standard sampling methods. We also present an application of the method in a real world example of deriving cosmological constraints using the supernova type Ia data. We show that umbrella sampling can sample the posterior accurately down to the $\approx 15σ$ credible region in the $Ω_{\rm m}-Ω_Λ$ plane, while for the same computational work the affine-invariant MCMC sampling implemented in the {\tt emcee} code samples the posterior reliably only to $\approx 3σ$.
Submitted 13 December, 2017;
originally announced December 2017.
-
Stratification as a general variance reduction method for Markov chain Monte Carlo
Authors:
Aaron R. Dinner,
Erik Thiede,
Brian Van Koten,
Jonathan Weare
Abstract:
The Eigenvector Method for Umbrella Sampling (EMUS) belongs to a popular class of methods in statistical mechanics which adapt the principle of stratified survey sampling to the computation of free energies. We develop a detailed theoretical analysis of EMUS. Based on this analysis, we show that EMUS is an efficient general method for computing averages over arbitrary target distributions. In particular, we show that EMUS can be dramatically more efficient than direct MCMC when the target distribution is multimodal or when the goal is to compute tail probabilities. To illustrate these theoretical results, we present a tutorial application of the method to a problem from Bayesian statistics.
Submitted 19 June, 2020; v1 submitted 23 May, 2017;
originally announced May 2017.
-
Trajectory stratification of stochastic dynamics
Authors:
Aaron R. Dinner,
Jonathan C. Mattingly,
Jeremy O. B. Tempkin,
Brian Van Koten,
Jonathan Weare
Abstract:
We present a general mathematical framework for trajectory stratification for simulating rare events. Trajectory stratification involves decomposing trajectories of the underlying process into fragments limited to restricted regions of state space (strata), computing averages over the distributions of the trajectory fragments within the strata with minimal communication between them, and combining those averages with appropriate weights to yield averages with respect to the original underlying process. Our framework reveals the full generality and flexibility of trajectory stratification, and it illuminates a common mathematical structure shared by existing algorithms for sampling rare events. We demonstrate the power of the framework by defining strata in terms of both points in time and path-dependent variables for efficiently estimating averages that were not previously tractable.
Submitted 10 November, 2017; v1 submitted 28 October, 2016;
originally announced October 2016.
-
Ensemble preconditioning for Markov chain Monte Carlo simulation
Authors:
Charles Matthews,
Jonathan Weare,
Benedict Leimkuhler
Abstract:
We describe parallel Markov chain Monte Carlo methods that propagate a collective ensemble of paths, with local covariance information calculated from neighboring replicas. The use of collective dynamics eliminates multiplicative noise and stabilizes the dynamics, thus providing a practical approach to difficult anisotropic sampling problems in high dimensions. Numerical experiments with model problems demonstrate that dramatic potential speedups, compared to various alternative schemes, are attainable.
Submitted 13 July, 2016;
originally announced July 2016.
-
Eigenvector method for umbrella sampling enables error analysis
Authors:
Erik Thiede,
Brian Van Koten,
Jonathan Weare,
Aaron R. Dinner
Abstract:
Umbrella sampling efficiently yields equilibrium averages that depend on exploring rare states of a model by biasing simulations to windows of coordinate values and then combining the resulting data with physical weighting. Here, we introduce a mathematical framework that casts the step of combining the data as an eigenproblem. The advantage to this approach is that it facilitates error analysis. We discuss how the error scales with the number of windows. Then, we derive a central limit theorem for averages that are obtained from umbrella sampling. The central limit theorem suggests an estimator of the error contributions from individual windows, and we develop a simple and computationally inexpensive procedure for implementing it. We demonstrate this estimator for simulations of the alanine dipeptide and show that it emphasizes low free energy pathways between stable states in comparison to existing approaches for assessing error contributions. We discuss the possibility of using the estimator and, more generally, the eigenvector method for umbrella sampling to guide adaptation of the simulation parameters to accelerate convergence.
Submitted 14 March, 2016;
originally announced March 2016.
-
Fast randomized iteration: diffusion Monte Carlo through the lens of numerical linear algebra
Authors:
Lek-Heng Lim,
Jonathan Weare
Abstract:
We review the basic outline of the highly successful diffusion Monte Carlo technique commonly used in contexts ranging from electronic structure calculations to rare event simulation and data assimilation, and propose a new class of randomized iterative algorithms based on similar principles to address a variety of common tasks in numerical linear algebra. From the point of view of numerical linear algebra, the main novelty of the Fast Randomized Iteration schemes described in this article is that they work in either linear or constant cost per iteration (and in total, under appropriate conditions) and are rather versatile: we will show how they apply to solution of linear systems, eigenvalue problems, and matrix exponentiation, in dimensions far beyond the present limits of numerical linear algebra. While traditional iterative methods in numerical linear algebra were created in part to deal with instances where a matrix (of size $\mathcal{O}(n^2)$) is too big to store, the algorithms that we propose are effective even in instances where the solution vector itself (of size $\mathcal{O}(n)$) may be too big to store or manipulate. In fact, our work is motivated by recent DMC based quantum Monte Carlo schemes that have been applied to matrices as large as $10^{108} \times 10^{108}$. We provide basic convergence results, discuss the dependence of these results on the dimension of the system, and demonstrate dramatic cost savings on a range of test problems.
Submitted 9 October, 2017; v1 submitted 25 August, 2015;
originally announced August 2015.
-
Sharp entrywise perturbation bounds for Markov chains
Authors:
Erik Thiede,
Brian Van Koten,
Jonathan Weare
Abstract:
For many Markov chains of practical interest, the invariant distribution is extremely sensitive to perturbations of some entries of the transition matrix, but insensitive to others; we give an example of such a chain, motivated by a problem in computational statistical physics. We have derived perturbation bounds on the relative error of the invariant distribution that reveal these variations in sensitivity.
Our bounds are sharp, we do not impose any structural assumptions on the transition matrix or on the perturbation, and computing the bounds has the same complexity as computing the invariant distribution or computing other bounds in the literature. Moreover, our bounds have a simple interpretation in terms of hitting times, which can be used to draw intuitive but rigorous conclusions about the sensitivity of a chain to various types of perturbations.
Submitted 9 October, 2015; v1 submitted 6 October, 2014;
originally announced October 2014.
-
The Brownian fan
Authors:
Martin Hairer,
Jonathan Weare
Abstract:
We provide a mathematical study of the modified Diffusion Monte Carlo (DMC) algorithm introduced in the companion article \cite{DMC}. DMC is a simulation technique that uses branching particle systems to represent expectations associated with Feynman-Kac formulae. We provide a detailed heuristic explanation of why, in cases in which a stochastic integral appears in the Feynman-Kac formula (e.g. in rare event simulation, continuous time filtering, and other settings), the new algorithm is expected to converge in a suitable sense to a limiting process as the time interval between branching steps goes to 0. The situation studied here stands in stark contrast to the "naïve" generalisation of the DMC algorithm which would lead to an exponential explosion of the number of particles, thus precluding the existence of any finite limiting object. Convergence is shown rigorously in the simplest possible situation of a random walk, biased by a linear potential. The resulting limiting object, which we call the "Brownian fan", is a very natural new mathematical object of independent interest.
Submitted 9 April, 2014;
originally announced April 2014.
-
The relaxation of a family of broken bond crystal surface models
Authors:
Jeremy L. Marzuola,
Jonathan Weare
Abstract:
We study the continuum limit of a family of kinetic Monte Carlo models of crystal surface relaxation that includes both the solid-on-solid and discrete Gaussian models. With computational experiments and theoretical arguments we are able to derive several partial differential equation limits identified (or nearly identified) in previous studies and to clarify the correct choice of surface tension appearing in the PDE and the correct scaling regime giving rise to each PDE. We also provide preliminary computational investigations of a number of interesting qualitative features of the large scale behavior of the models.
Submitted 11 April, 2013;
originally announced April 2013.
-
Improved diffusion Monte Carlo
Authors:
Martin Hairer,
Jonathan Weare
Abstract:
We propose a modification, based on the RESTART (repetitive simulation trials after reaching thresholds) and DPR (dynamics probability redistribution) rare event simulation algorithms, of the standard diffusion Monte Carlo (DMC) algorithm. The new algorithm has a lower variance per workload, regardless of the regime considered. In particular, it makes it feasible to use DMC in situations where the "naïve" generalisation of the standard algorithm would be impractical, due to an exponential explosion of its variance. We numerically demonstrate the effectiveness of the new algorithm on a standard rare event simulation problem (probability of an unlikely transition in a Lennard-Jones cluster), as well as a high-frequency data assimilation problem.
Submitted 9 April, 2014; v1 submitted 12 July, 2012;
originally announced July 2012.
-
Data assimilation in the low noise regime with application to the Kuroshio
Authors:
Eric Vanden-Eijnden,
Jonathan Weare
Abstract:
On-line data assimilation techniques such as ensemble Kalman filters and particle filters lose accuracy dramatically when presented with an unlikely observation. Such an observation may be caused by an unusually large measurement error or reflect a rare fluctuation in the dynamics of the system. Over a long enough span of time it becomes likely that one or several of these events will occur. Often they are signatures of the most interesting features of the underlying system and their prediction becomes the primary focus of the data assimilation procedure. The Kuroshio or Black Current that runs along the eastern coast of Japan is an example of such a system. It undergoes infrequent but dramatic changes of state between a small meander during which the current remains close to the coast of Japan, and a large meander during which it bulges away from the coast. Because of the important role that the Kuroshio plays in distributing heat and salinity in the surrounding region, prediction of these transitions is of acute interest. Here we focus on a regime in which both the stochastic forcing on the system and the observational noise are small. In this setting large deviation theory can be used to understand why standard filtering methods fail and guide the design of the more effective data assimilation techniques. Motivated by our analysis we propose several data assimilation strategies capable of efficiently handling rare events such as the transitions of the Kuroshio. These techniques are tested on a model of the Kuroshio and shown to perform much better than standard filtering methods.
Submitted 14 March, 2014; v1 submitted 22 February, 2012;
originally announced February 2012.
-
Steered Transition Path Sampling
Authors:
Nicholas Guttenberg,
Aaron R. Dinner,
Jonathan Weare
Abstract:
We introduce a path sampling method for obtaining statistical properties of an arbitrary stochastic dynamics. The method works by decomposing a trajectory in time, estimating the probability of satisfying a progress constraint, modifying the dynamics based on that probability, and then reweighting to calculate averages. Because the progress constraint can be formulated in terms of occurrences of events within time intervals, the method is particularly well suited for controlling the sampling of currents of dynamic events. We demonstrate the method for calculating transition probabilities in barrier crossing problems and survival probabilities in strongly diffusive systems with absorbing states, which are difficult to treat by shooting. We discuss the relation of the algorithm to other methods.
Submitted 1 February, 2012;
originally announced February 2012.
-
An Affine-Invariant Sampler for Exoplanet Fitting and Discovery in Radial Velocity Data
Authors:
Fengji Hou,
Jonathan Goodman,
David W. Hogg,
Jonathan Weare,
Christian Schwab
Abstract:
Markov Chain Monte Carlo (MCMC) proves to be powerful for Bayesian inference and in particular for exoplanet radial velocity fitting because MCMC provides more statistical information and makes better use of data than common approaches like chi-square fitting. However, the non-linear density functions encountered in these problems can make MCMC time-consuming. In this paper, we apply an ensemble sampler respecting affine invariance to orbital parameter extraction from radial velocity data. This new sampler has only one free parameter, and it does not require much tuning for good performance, which is important for automatization. The autocorrelation time of this sampler is approximately the same for all parameters and far smaller than Metropolis-Hastings, which means it requires many fewer function calls to produce the same number of independent samples. The affine-invariant sampler speeds up MCMC by hundreds of times compared with Metropolis-Hastings in the same computing situation. This novel sampler would be ideal for projects involving large datasets such as statistical investigations of planet distribution. The biggest obstacle to ensemble samplers is the existence of multiple local optima; we present a clustering technique to deal with local optima by clustering based on the likelihood of the walkers in the ensemble. We demonstrate the effectiveness of the sampler on real radial velocity data.
Submitted 30 November, 2011; v1 submitted 13 April, 2011;
originally announced April 2011.
-
Parallel marginalization Monte Carlo with applications to conditional path sampling
Authors:
Jonathan Weare
Abstract:
Monte Carlo sampling methods often suffer from long correlation times. Consequently, these methods must be run for many steps to generate an independent sample. In this paper a method is proposed to overcome this difficulty. The method utilizes information from rapidly equilibrating coarse Markov chains that sample marginal distributions of the full system. This is accomplished through exchanges between the full chain and the auxiliary coarse chains. Results of numerical tests on the bridge sampling and filtering/smoothing problems for a stochastic differential equation are presented.
Submitted 11 September, 2007;
originally announced September 2007.