
The surprising efficiency of temporal difference learning for rare event prediction

Abstract: We quantify the efficiency of temporal difference (TD) learning over the direct, or Monte Carlo (MC), estimator for policy evaluation in reinforcement learning, with an emphasis on estimation of quantities related to rare events. Policy evaluation is complicated in the rare event setting by the long timescale of the event and by the need for \emph{relative accuracy} in estimates of very small values. Specifically, we focus on least-squares TD (LSTD) prediction for finite state Markov chains, and show that LSTD can achieve relative accuracy far more efficiently than MC. We prove a central limit theorem for the LSTD estimator and upper bound the \emph{relative asymptotic variance} by simple quantities characterizing the connectivity of states relative to the transition probabilities between them. Using this bound, we show that, even when both the timescale of the rare event and the relative accuracy of the MC estimator are exponentially large in the number of states, LSTD maintains a fixed level of relative accuracy with a total number of observed transitions of the Markov chain that is only \emph{polynomially} large in the number of states.

Submitted 27 May, 2024; originally announced May 2024.

arXiv:2404.19653

BAD-NEUS: Rapidly converging trajectory stratification

Authors: John Strahan, Chatipat Lorpaiboon, Jonathan Weare, Aaron R. Dinner

Abstract: An issue for molecular dynamics simulations is that events of interest often involve timescales that are much longer than the simulation time step, which is set by the fastest timescales of the model. Because of this timescale separation, direct simulation of many events is prohibitively computationally costly. This issue can be overcome by aggregating information from many relatively short simulations that sample segments of trajectories involving events of interest. This is the strategy of Markov state models (MSMs) and related approaches, but such methods suffer from approximation error because the variables defining the states generally do not capture the dynamics fully. By contrast, once converged, the weighted ensemble (WE) method aggregates information from trajectory segments so as to yield unbiased estimates of both thermodynamic and kinetic statistics. Unfortunately, errors decay no faster than unbiased simulation in WE. Here we introduce a theoretical framework for describing WE that shows that introduction of an element of stratification, as in nonequilibrium umbrella sampling (NEUS), accelerates convergence. Then, building on ideas from MSMs and related methods, we propose an improved stratification that allows approximation error to be reduced systematically. We show that the improved stratification can decrease simulation times required to achieve a desired precision by orders of magnitude.

Submitted 30 April, 2024; originally announced April 2024.

Comments: 18 pages, 10 figures

arXiv:2404.08613

Using Explainable AI and Transfer Learning to understand and predict the maintenance of Atlantic blocking with limited observational data

Authors: Huan Zhang, Justin Finkel, Dorian S. Abbot, Edwin P. Gerber, Jonathan Weare

Abstract: Blocking events are an important cause of extreme weather, especially long-lasting blocking events that trap weather systems in place. The duration of blocking events is, however, underestimated in climate models. Explainable Artificial Intelligence is a class of data analysis methods that can help identify physical causes of prolonged blocking events and diagnose model deficiencies. We demonstrate this approach on an idealized quasigeostrophic model developed by Marshall and Molteni (1993). We train a convolutional neural network (CNN), and subsequently, build a sparse predictive model for the persistence of Atlantic blocking, conditioned on an initial high-pressure anomaly. Shapley Additive ExPlanation (SHAP) analysis reveals that high-pressure anomalies in the American Southeast and North Atlantic, separated by a trough over Atlantic Canada, contribute significantly to prediction of sustained blocking events in the Atlantic region. This agrees with previous work that identified precursors in the same regions via wave train analysis. When we apply the same CNN to blockings in the ERA5 atmospheric reanalysis, there is insufficient data to accurately predict persistent blocks. We partially overcome this limitation by pre-training the CNN on the plentiful data of the Marshall-Molteni model, and then using Transfer Learning to achieve better predictions than direct training.
SHAP analysis before and after transfer learning allows a comparison between the predictive features in the reanalysis and the quasigeostrophic model, quantifying dynamical biases in the idealized model. This work demonstrates the potential for machine learning methods to extract meaningful precursors of extreme weather events and achieve better prediction using limited observational data.

Submitted 12 April, 2024; originally announced April 2024.

Comments: 29 pages, 10 figures

arXiv:2401.12180

doi 10.3847/2515-5172/ad18a6

AI can identify Solar System instability billions of years in advance

Authors: Dorian S. Abbot, J. D. Laurence-Chasen, Robert J. Webber, David M. Hernandez, Jonathan Weare

Abstract: Rare event schemes require an approximation of the probability of the rare event as a function of system state. Finding an appropriate reaction coordinate is typically the most challenging aspect of applying a rare event scheme. Here we develop an artificial intelligence (AI) based reaction coordinate that effectively predicts which of a limited number of simulations of the Solar System will go unstable using a convolutional neural network classifier. The performance of the algorithm does not degrade significantly even 3.5 billion years before the instability. We overcome the class imbalance intrinsic to rare event problems using a combination of minority class oversampling, increased minority class weighting, and pulling multiple non-overlapping training sequences from simulations. Our success suggests that AI may provide a promising avenue for developing reaction coordinates without detailed theoretical knowledge of the system.

Submitted 15 January, 2024; originally announced January 2024.

Comments: 3 pages, 1 figure

Journal ref: Res. Notes AAS 8 3 (2024)

arXiv:2311.07881

Accurate estimates of dynamical statistics using memory

Authors: Chatipat Lorpaiboon, Spencer C. Guo, John Strahan, Jonathan Weare, Aaron R. Dinner

Abstract: Many chemical reactions and molecular processes occur on timescales that are significantly longer than those accessible by direct simulation. One successful approach to estimating dynamical statistics for such processes is to use many short time series observations of the system to construct a Markov state model (MSM), which approximates the dynamics of the system as memoryless transitions between a set of discrete states. The dynamical Galerkin approximation (DGA) generalizes MSMs for the problem of calculating dynamical statistics, such as committors and mean first passage times, by replacing the set of discrete states with a projection onto a basis. Because the projected dynamics are generally not memoryless, the Markov approximation can result in significant systematic error. Inspired by quasi-Markov state models, which employ the generalized master equation to encode memory resulting from the projection, we reformulate DGA to account for memory and analyze its performance on two systems: a two-dimensional triple well and helix-to-helix transitions of the AIB$_9$ peptide. We demonstrate that our method is robust to the choice of basis and can decrease the time series length required to obtain accurate kinetics by an order of magnitude.

Submitted 13 November, 2023; originally announced November 2023.

Comments: 17 pages, 14 figures

arXiv:2310.04966

Improved Active Learning via Dependent Leverage Score Sampling

Authors: Atsushi Shimizu, Xiaoou Cheng, Christopher Musco, Jonathan Weare

Abstract: We show how to obtain improved active learning methods in the agnostic (adversarial noise) setting by combining marginal leverage score sampling with non-independent sampling strategies that promote spatial coverage. In particular, we propose an easily implemented method based on the \emph{pivotal sampling algorithm}, which we test on problems motivated by learning-based methods for parametric PDEs and uncertainty quantification. In comparison to independent sampling, our method reduces the number of samples needed to reach a given target accuracy by up to $50\%$. We support our findings with two theoretical results. First, we show that any non-independent leverage score sampling method that obeys a weak \emph{one-sided $\ell_{\infty}$ independence condition} (which includes pivotal sampling) can actively learn $d$ dimensional linear functions with $O(d\log d)$ samples, matching independent sampling. This result extends recent work on matrix Chernoff bounds under $\ell_{\infty}$ independence, and may be of interest for analyzing other sampling strategies beyond pivotal sampling. Second, we show that, for the important case of polynomial regression, our pivotal method obtains an improved bound of $O(d)$ samples.

Submitted 4 May, 2024; v1 submitted 7 October, 2023; originally announced October 2023.

Comments: To appear at ICLR 2024

arXiv:2309.17270

Randomly sparsified Richardson iteration is really fast

Authors: Jonathan Weare, Robert J. Webber

Abstract: Recently, a class of algorithms combining classical fixed point iterations with repeated random sparsification of approximate solution vectors has been successfully applied to eigenproblems with matrices as large as $10^{108} \times 10^{108}$. So far, a complete mathematical explanation for their success has proven elusive. Additionally, the methods have not been extended to linear system solves. In this paper we propose a new scheme based on repeated random sparsification that is capable of solving linear systems in extremely high dimensions. We provide a complete mathematical analysis of this new algorithm. Our analysis establishes a faster-than-Monte Carlo convergence rate and justifies use of the scheme even when the solution vector itself is too large to store.

Submitted 17 November, 2023; v1 submitted 29 September, 2023; originally announced September 2023.

Comments: 27 pages, 2 figures

arXiv:2306.11870

Mercury's chaotic secular evolution as a subdiffusive process

Authors: Dorian S. Abbot, Robert J. Webber, David M. Hernandez, Sam Hadden, Jonathan Weare

Abstract: Mercury's orbit can destabilize, generally resulting in a collision with either Venus or the Sun. Chaotic evolution can cause g1 to decrease to the approximately constant value of g5 and create a resonance. Previous work has approximated the variation in g1 as stochastic diffusion, which leads to a phenomenological model that can reproduce the Mercury instability statistics of secular and N-body models on timescales longer than 10 Gyr. Here we show that the diffusive model underpredicts the Mercury instability probability by a factor of 3-10,000 on timescales less than 5 Gyr, the remaining lifespan of the Solar System. This is because g1 exhibits larger variations on short timescales than the diffusive model would suggest. To better model the variations on short timescales, we build a new subdiffusive phenomenological model for g1. Subdiffusion is similar to diffusion but exhibits larger displacements on short timescales and smaller displacements on long timescales. We choose model parameters based on the behavior of the g1 trajectories in the N-body simulations, leading to a tuned model that can reproduce Mercury instability statistics from 1-40 Gyr. This work motivates fundamental questions in Solar System dynamics: Why does subdiffusion better approximate the variation in g1 than standard diffusion? Why is there an upper bound on g1, but not a lower bound that would prevent it from reaching g5?

Submitted 12 April, 2024; v1 submitted 20 June, 2023; originally announced June 2023.

Comments: accepted at ApJ

arXiv:2303.12534

doi 10.1063/5.0151309

Inexact iterative numerical linear algebra for neural network-based spectral estimation and rare-event prediction

Authors: John Strahan, Spencer C. Guo, Chatipat Lorpaiboon, Aaron R. Dinner, Jonathan Weare

Abstract: Understanding dynamics in complex systems is challenging because there are many degrees of freedom, and those that are most important for describing events of interest are often not obvious. The leading eigenfunctions of the transition operator are useful for visualization, and they can provide an efficient basis for computing statistics such as the likelihood and average time of events (predictions). Here we develop inexact iterative linear algebra methods for computing these eigenfunctions (spectral estimation) and making predictions from a data set of short trajectories sampled at finite intervals. We demonstrate the methods on a low-dimensional model that facilitates visualization and a high-dimensional model of a biomolecular system. Implications for the prediction problem in reinforcement learning are discussed.

Submitted 20 July, 2023; v1 submitted 22 March, 2023; originally announced March 2023.

Comments: 24 pages, 16 figures

MSC Class: 62M20; 65C40; 62M45; 37M10

Journal ref: J. Chem. Phys. 159, 014110 (2023)

arXiv:2212.14844

doi 10.3847/1538-4357/acb6ff

Simple physics and integrators accurately reproduce Mercury instability statistics

Authors: Dorian S. Abbot, David M. Hernandez, Sam Hadden, Robert J. Webber, Georgios P. Afentakis, Jonathan Weare

Abstract: The long-term stability of the Solar System is an issue of significant scientific and philosophical interest. The mechanism leading to instability is Mercury's eccentricity being pumped up so high that Mercury either collides with Venus or is scattered into the Sun. Previously, only three five-billion-year $N$-body ensembles of the Solar System with thousands of simulations have been run to assess long-term stability. We generate two additional ensembles, each with 2750 members, and make them publicly available at \texttt{https://archive.org/details/@dorianabbot}. We find that accurate Mercury instability statistics can be obtained by (1) including only the Sun and the 8 planets, (2) using a simple Wisdom-Holman scheme without correctors, (3) using a basic representation of general relativity, and (4) using a time step of 3.16 days. By combining our Solar System ensembles with previous ensembles we form a 9,601-member ensemble of ensembles. In this ensemble of ensembles, the logarithm of the frequency of a Mercury instability event increases linearly with time between 1.3 and 5 Gyr, suggesting that a single mechanism is responsible for Mercury instabilities in this time range and that this mechanism becomes more active as time progresses.
Our work provides a robust estimate of Mercury instability statistics over the next five billion years, outlines methodologies that may be useful for exoplanet system investigations, and provides two large ensembles of publicly available Solar System integrations that can serve as testbeds for theoretical ideas as well as training sets for artificial intelligence schemes.

Submitted 21 February, 2023; v1 submitted 30 December, 2022; originally announced December 2022.

Comments: accepted at ApJ

Journal ref: Ap.J. 944 (2023) 190

arXiv:2211.09767

doi 10.1103/PhysRevResearch.5.023101

Understanding and eliminating spurious modes in variational Monte Carlo using collective variables

Authors: Huan Zhang, Robert J. Webber, Michael Lindsey, Timothy C. Berkelbach, Jonathan Weare

Abstract: The use of neural network parametrizations to represent the ground state in variational Monte Carlo (VMC) calculations has generated intense interest in recent years. However, as we demonstrate in the context of the periodic Heisenberg spin chain, this approach can produce unreliable wave function approximations. One of the most obvious signs of failure is the occurrence of random, persistent spikes in the energy estimate during training. These energy spikes are caused by regions of configuration space that are over-represented by the wave function density, which are called ``spurious modes'' in the machine learning literature. After exploring these spurious modes in detail, we demonstrate that a collective-variable-based penalization yields a substantially more robust training procedure, preventing the formation of spurious modes and improving the accuracy of energy estimates. Because the penalization scheme is cheap to implement and is not specific to the particular model studied here, it can be extended to other applications of VMC where a reasonable choice of collective variable is available.

Submitted 11 November, 2022; originally announced November 2022.

Comments: 12 pages, 13 figures

Journal ref: Phys.Rev.Research 5 (2023) 023101

arXiv:2208.01717

doi 10.1016/j.jcp.2023.112152

Predicting rare events using neural networks and short-trajectory data

Authors: John Strahan, Justin Finkel, Aaron R. Dinner, Jonathan Weare

Abstract: Estimating the likelihood, timing, and nature of events is a major goal of modeling stochastic dynamical systems. When the event is rare in comparison with the timescales of simulation and/or measurement needed to resolve the elemental dynamics, accurate prediction from direct observations becomes challenging. In such cases a more effective approach is to cast statistics of interest as solutions to Feynman-Kac equations (partial differential equations). Here, we develop an approach to solve Feynman-Kac equations by training neural networks on short-trajectory data. Our approach is based on a Markov approximation but otherwise avoids assumptions about the underlying model and dynamics. This makes it applicable to treating complex computational models and observational data. We illustrate the advantages of our method using a low-dimensional model that facilitates visualization, and this analysis motivates an adaptive sampling strategy that allows on-the-fly identification of and addition of data to regions important for predicting the statistics of interest. Finally, we demonstrate that we can compute accurate statistics for a 75-dimensional model of sudden stratospheric warming. This system provides a stringent test bed for our method.

Submitted 2 March, 2023; v1 submitted 2 August, 2022; originally announced August 2022.

Comments: 21 pages, 12 figures

MSC Class: 65C40; 62M20; 62M45

arXiv:2206.11360

doi 10.1063/5.0087058

Computing transition path theory quantities with trajectory stratification

Authors: Bodhi P. Vani, Jonathan Weare, Aaron R. Dinner

Abstract: Transition path theory computes statistics from ensembles of reactive trajectories. A common strategy for sampling reactive trajectories is to control the branching and pruning of trajectories so as to enhance the sampling of low probability segments. However, it can be challenging to apply transition path theory to data from such methods because determining whether configurations and trajectory segments are part of reactive trajectories requires looking backward and forward in time. Here, we show how this issue can be overcome efficiently by introducing simple data structures. We illustrate the approach in the context of nonequilibrium umbrella sampling (NEUS), but the strategy is general and can be used to obtain transition path theory statistics from other methods that sample segments of unbiased trajectories.

Submitted 22 June, 2022; originally announced June 2022.

Comments: 12 pages, 11 figures

arXiv:2206.05363

Revealing the statistics of extreme events hidden in short weather forecast data

Authors: Justin Finkel, Edwin P. Gerber, Dorian S. Abbot, Jonathan Weare

Abstract: Extreme weather events have significant consequences, dominating the impact of climate on society. While high-resolution weather models can forecast many types of extreme events on synoptic timescales, long-term climatological risk assessment is an altogether different problem. A once-in-a-century event takes, on average, 100 years of simulation time to appear just once, far beyond the typical integration length of a weather forecast model. Therefore, this task is left to cheaper, but less accurate, low-resolution or statistical models. But there is untapped potential in weather model output: despite being short in duration, weather forecast ensembles are produced multiple times a week. Integrations are launched with independent perturbations, causing them to spread apart over time and broadly sample phase space. Collectively, these integrations add up to thousands of years of data. We establish methods to extract climatological information from these short weather simulations. Using ensemble hindcasts by the European Center for Medium-range Weather Forecasting (ECMWF) archived in the subseasonal-to-seasonal (S2S) database, we characterize sudden stratospheric warming (SSW) events with multi-centennial return times. Consistent results are found between alternative methods, including basic counting strategies and Markov state modeling.
By carefully combining trajectories together, we obtain estimates of SSW frequencies and their seasonal distributions that are consistent with reanalysis-derived estimates for moderately rare events, but with much tighter uncertainty bounds, and which can be extended to events of unprecedented severity that have not yet been observed historically. These methods hold potential for assessing extreme events throughout the climate system, beyond this example of stratospheric extremes.

Submitted 23 January, 2023; v1 submitted 10 June, 2022; originally announced June 2022.

Comments: 28 pages, 6 figures. Resubmitted for publication

arXiv:2205.05067

doi 10.1063/5.0098587

Augmented Transition Path Theory for Sequences of Events

Authors: Chatipat Lorpaiboon, Jonathan Weare, Aaron R. Dinner

Abstract: Transition path theory provides a statistical description of the dynamics of a reaction in terms of local spatial quantities. In its original formulation, it is limited to reactions that consist of trajectories flowing from a reactant set A to a product set B. We extend the basic concepts and principles of transition path theory to reactions in which trajectories exhibit a specified sequence of events and illustrate the utility of this generalization on examples.

Submitted 29 July, 2022; v1 submitted 10 May, 2022; originally announced May 2022.

Comments: 16 pages, 8 figures

arXiv:2201.12164

doi 10.1021/acs.jctc.2c00435

Full Configuration Interaction Excited-State Energies in Large Active Spaces from Subspace Iteration with Repeated Random Sparsification

Authors: Samuel M. Greene, Robert J. Webber, James E. T. Smith, Jonathan Weare, Timothy C. Berkelbach

Abstract: We present a stable and systematically improvable quantum Monte Carlo (QMC) approach to calculating excited-state energies, which we implement using our fast randomized iteration method for the full configuration interaction problem (FCI-FRI). Unlike previous excited-state quantum Monte Carlo methods, our approach, which is an asymmetric variant of subspace iteration, avoids the use of dot products of random vectors and instead relies upon trial vectors to maintain orthogonality and estimate eigenvalues. By leveraging recent advances, we apply our method to calculate ground- and excited-state energies of strongly correlated molecular systems in large active spaces, including the carbon dimer with 8 electrons in 108 orbitals (8e,108o), an oxo-Mn(salen) transition metal complex (28e,28o), ozone (18e,87o), and butadiene (22e,82o). In the majority of these test cases, our approach yields total excited-state energies that agree with those from state-of-the-art methods -- including heat-bath CI, the density matrix renormalization group approach, and FCIQMC -- to within sub-milliHartree accuracy. In all cases, estimated excitation energies agree to within about 0.1 eV.

Submitted 12 October, 2022; v1 submitted 28 January, 2022; originally announced January 2022.

Comments: 16 pages, 5 figures, 3 tables

arXiv:2108.12727

doi 10.1175/JAS-D-21-0213.1

Data-driven transition path analysis yields a statistical understanding of sudden stratospheric warming events in an idealized model

Authors: Justin Finkel, Robert J. Webber, Edwin P. Gerber, Dorian S. Abbot, Jonathan Weare

Abstract: Atmospheric regime transitions are highly impactful as drivers of extreme weather events, but pose two formidable modeling challenges: predicting the next event (weather forecasting), and characterizing the statistics of events of a given severity (the risk climatology). Each event has a different duration and spatial structure, making it hard to define an objective "average event." We argue here that transition path theory (TPT), a stochastic process framework, is an appropriate tool for the task. We demonstrate TPT's capacities on a wave-mean flow model of sudden stratospheric warmings (SSWs) developed by Holton and Mass (1976), which is idealized enough for transparent TPT analysis but complex enough to demonstrate computational scalability. Whereas a recent article (Finkel et al. 2021) studied near-term SSW predictability, the present article uses TPT to link predictability to long-term SSW frequency. This requires not only forecasting forward in time from an initial condition, but also \emph{backward in time} to assess the probability of the initial conditions themselves. TPT enables one to condition the dynamics on the regime transition occurring, and thus visualize its physical drivers with a vector field called the \emph{reactive current}. The reactive current shows that before an SSW, dissipation and stochastic forcing drive a slow decay of vortex strength at lower altitudes. The response of upper-level winds is late and sudden, occurring only after the transition is almost complete from a probabilistic point of view.
This case study demonstrates that TPT quantities, visualized in a space of physically meaningful variables, can help one understand the dynamics of regime transitions.

Submitted 19 October, 2022; v1 submitted 28 August, 2021; originally announced August 2021.

Comments: 18 pages, 7 figures (main text), 19 pages, 1 figure (supplement). Accepted for publication in the Journal of the Atmospheric Sciences

arXiv:2106.09091 [pdf, other]

doi 10.3847/1538-4357/ac2fa8

Rare Event Sampling Improves Mercury Instability Statistics

Authors: Dorian S. Abbot, Robert J. Webber, Sam Hadden, Darryl Seligman, Jonathan Weare

Abstract: Due to the chaotic nature of planetary dynamics, there is a non-zero probability that Mercury's orbit will become unstable in the future. Previous efforts have estimated the probability of this happening between 3 and 5 billion years in the future using a large number of direct numerical simulations with an N-body code, but were not able to obtain accurate estimates before 3 billion years in the future because Mercury instability events are too rare. In this paper we use a new rare event sampling technique, Quantile Diffusion Monte Carlo (QDMC), to estimate that the probability of a Mercury instability event in the next 2 billion years is approximately $10^{-4}$ in the REBOUND N-body code. We show that QDMC provides unbiased probability estimates at a computational cost of up to 100 times less than direct numerical simulation. QDMC is easy to implement and could be applied to many problems in planetary dynamics in which it is necessary to estimate the probability of a rare event.

Submitted 27 December, 2021; v1 submitted 16 June, 2021; originally announced June 2021.

Comments: Abbot et al 2021 ApJ 923 236

arXiv:2106.02686 [pdf, other]

Ensemble Markov chain Monte Carlo with teleporting walkers

Authors: Michael Lindsey, Jonathan Weare, Anna Zhang

Abstract: We introduce an ensemble Markov chain Monte Carlo approach to sampling from a probability density with known likelihood. This method upgrades an underlying Markov chain by allowing an ensemble of such chains to interact via a process in which one chain's state is cloned as another's is deleted. This effective teleportation of states can overcome issues of metastability in the underlying chain, as the scheme enjoys rapid mixing once the modes of the target density have been populated. We derive a mean-field limit for the evolution of the ensemble. We analyze the global and local convergence of this mean-field limit, showing asymptotic convergence independent of the spectral gap of the underlying Markov chain, and moreover we interpret the limiting evolution as a gradient flow. We explain how interaction can be applied selectively to a subset of state variables in order to maintain advantage on very high-dimensional problems. Finally we present the application of our methodology to Bayesian hyperparameter estimation for Gaussian process regression.

Submitted 4 June, 2021; originally announced June 2021.

arXiv:2103.12109 [pdf, other]

doi 10.1137/21M1422513

Approximating matrix eigenvalues by subspace iteration with repeated random sparsification

Authors: Samuel M. Greene, Robert J. Webber, Timothy C. Berkelbach, Jonathan Weare

Abstract: Traditional numerical methods for calculating matrix eigenvalues are prohibitively expensive for high-dimensional problems. Iterative random sparsification methods allow for the estimation of a single dominant eigenvalue at reduced cost by leveraging repeated random sampling and averaging. We present a general approach to extending such methods for the estimation of multiple eigenvalues and demonstrate its performance for several benchmark problems in quantum chemistry.

Submitted 2 March, 2022; v1 submitted 22 March, 2021; originally announced March 2021.

Comments: 31 pages, 8 figures

arXiv:2102.07760 [pdf, other]

doi 10.1175/MWR-D-21-0024.1

Learning forecasts of rare stratospheric transitions from short simulations

Authors: Justin Finkel, Robert J. Webber, Dorian S. Abbot, Edwin P. Gerber, Jonathan Weare

Abstract: Rare events arising in nonlinear atmospheric dynamics remain hard to predict and attribute. We address the problem of forecasting rare events in a prototypical example, Sudden Stratospheric Warmings (SSWs). Approximately once every other winter, the boreal stratospheric polar vortex rapidly breaks down, shifting midlatitude surface weather patterns for months. We focus on two key quantities of interest: the probability of an SSW occurring, and the expected lead time if it does occur, as functions of initial condition. These \emph{optimal forecasts} concretely measure the event's progress. Direct numerical simulation can estimate them in principle, but is prohibitively expensive in practice: each rare event requires a long integration to observe, and the cost of each integration grows with model complexity. We describe an alternative approach using integrations that are \emph{short} compared to the timescale of the warming event. We compute the probability and lead time efficiently by solving equations involving the transition operator, which encodes all information about the dynamics. We relate these optimal forecasts to a small number of interpretable physical variables, suggesting optimal measurements for forecasting. We illustrate the methodology on a prototype SSW model developed by Holton and Mass (1976) and modified by stochastic forcing. While highly idealized, this model captures the essential nonlinear dynamics of SSWs and exhibits the key forecasting challenge: the dramatic separation in timescales between a single event and the return time between successive events. Our methodology is designed to fully exploit high-dimensional data from models and observations, and has the potential to identify detailed predictors of many complex rare events in meteorology.

Submitted 28 August, 2021; v1 submitted 15 February, 2021; originally announced February 2021.

Comments: 26 pages, 7 figures, major revision after original. Accepted to Monthly Weather Review, American Meteorological Society

arXiv:2009.04034 [pdf, other]

Long-timescale predictions from short-trajectory data: A benchmark analysis of the trp-cage miniprotein

Authors: John Strahan, Adam Antoszewski, Chatipat Lorpaiboon, Bodhi P. Vani, Jonathan Weare, Aaron R. Dinner

Abstract: Elucidating physical mechanisms with statistical confidence from molecular dynamics simulations can be challenging owing to the many degrees of freedom that contribute to collective motions. To address this issue, we recently introduced a dynamical Galerkin approximation (DGA) [Thiede et al. J. Phys. Chem. 150, 244111 (2019)], in which chemical kinetic statistics that satisfy equations of dynamical operators are represented by a basis expansion. Here, we reformulate this approach, clarifying (and reducing) the dependence on the choice of lag time. We present a new projection of the reactive current onto collective variables and provide improved estimators for rates and committors. We also present simple procedures for constructing suitable smoothly varying basis functions from arbitrary molecular features. To evaluate estimators and basis sets numerically, we generate and carefully validate a dataset of short trajectories for the unfolding and folding of the trp-cage miniprotein, a well-studied system. Our analysis demonstrates a comprehensive strategy for characterizing reaction pathways quantitatively.

Submitted 8 September, 2020; originally announced September 2020.

Comments: 61 pages, 17 figures

arXiv:2007.08027 [pdf, other]

doi 10.1021/acs.jpcb.0c06477

Integrated VAC: A robust strategy for identifying eigenfunctions of dynamical operators

Authors: Chatipat Lorpaiboon, Erik Henning Thiede, Robert J. Webber, Jonathan Weare, Aaron R. Dinner

Abstract: One approach to analyzing the dynamics of a physical system is to search for long-lived patterns in its motions. This approach has been particularly successful for molecular dynamics data, where slowly decorrelating patterns can indicate large-scale conformational changes. Detecting such patterns is the central objective of the variational approach to conformational dynamics (VAC), as well as the related methods of time-lagged independent component analysis and Markov state modeling. In VAC, the search for slowly decorrelating patterns is formalized as a variational problem solved by the eigenfunctions of the system's transition operator. VAC computes solutions to this variational problem by optimizing a linear or nonlinear model of the eigenfunctions using time series data. Here, we build on VAC's success by addressing two practical limitations. First, VAC can give poor eigenfunction estimates when the lag time parameter is chosen poorly. Second, VAC can overfit when using flexible parameterizations such as artificial neural networks with insufficient regularization. To address these issues, we propose an extension that we call integrated VAC (IVAC). IVAC integrates over multiple lag times before solving the variational problem, making its results more robust and reproducible than VAC's.

Submitted 9 September, 2020; v1 submitted 15 July, 2020; originally announced July 2020.

Comments: 16 pages, 7 figures

arXiv:2006.14482 [pdf, other]

doi 10.1137/20M1348315

A metric on directed graphs and Markov chains based on hitting probabilities

Authors: Zachary M. Boyd, Nicolas Fraiman, Jeremy L. Marzuola, Peter J. Mucha, Braxton Osting, Jonathan Weare

Abstract: The shortest-path, commute time, and diffusion distances on undirected graphs have been widely employed in applications such as dimensionality reduction, link prediction, and trip planning. Increasingly, there is interest in using asymmetric structure of data derived from Markov chains and directed graphs, but few metrics are specifically adapted to this task. We introduce a metric on the state space of any ergodic, finite-state, time-homogeneous Markov chain and, in particular, on any Markov chain derived from a directed graph. Our construction is based on hitting probabilities, with nearness in the metric space related to the transfer of random walkers from one node to another at stationarity. Notably, our metric is insensitive to shortest and average walk distances, thus giving new information compared to existing metrics. We use possible degeneracies in the metric to develop an interesting structural theory of directed graphs and explore a related quotienting procedure. Our metric can be computed in $O(n^3)$ time, where $n$ is the number of states, and in examples we scale up to $n=10,000$ nodes and $\approx 38M$ edges on a desktop computer. In several examples, we explore the nature of the metric, compare it to alternative methods, and demonstrate its utility for weak recovery of community structure in dense graphs, visualization, structure recovering, dynamics exploration, and multiscale cluster detection.

Submitted 18 January, 2021; v1 submitted 25 June, 2020; originally announced June 2020.

Comments: 26 pages, 9 figures, for associated code, visit https://github.com/zboyd2/hitting_probabilities_metric, accepted at SIAM J. Math. Data Sci

Journal ref: SIAM Journal on Mathematics of Data Science, Vol. 3, pp. 467-493 (2021)

arXiv:2005.13113 [pdf, other]

doi 10.1175/JAS-D-19-0278.1

Path properties of atmospheric transitions: illustration with a low-order sudden stratospheric warming model

Authors: Justin Finkel, Dorian Abbot, Jonathan Weare

Abstract: Many rare weather events, including hurricanes, droughts, and floods, dramatically impact human life. To accurately forecast these events and characterize their climatology requires specialized mathematical techniques to fully leverage the limited data that are available. Here we describe \emph{transition path theory} (TPT), a framework originally developed for molecular simulation, and argue that it is a useful paradigm for developing mechanistic understanding of rare climate events. TPT provides a method to calculate statistical properties of the paths into the event. As an initial demonstration of the utility of TPT, we analyze a low-order model of sudden stratospheric warming (SSW), a dramatic disturbance to the polar vortex which can induce extreme cold spells at the surface in the midlatitudes. SSW events pose a major challenge for seasonal weather prediction because of their rapid, complex onset and development. Climate models struggle to capture the long-term statistics of SSW, owing to their diversity and intermittent nature. We use a stochastically forced Holton-Mass-type model with two stable states, corresponding to radiative equilibrium and a vacillating SSW-like regime. In this stochastic bistable setting, from certain probabilistic forecasts TPT facilitates estimation of dominant transition pathways and return times of transitions. These "dynamical statistics" are obtained by solving partial differential equations in the model's phase space. With future application to more complex models, TPT and its constituent quantities promise to improve the predictability of extreme weather events, through both generation and principled evaluation of forecasts.

Submitted 26 May, 2020; originally announced May 2020.

Comments: 17 pages, 12 figures. To be published in the Journal of the Atmospheric Sciences

arXiv:2005.02248 [pdf, other]

doi 10.1137/20M1335984

Error bounds for dynamical spectral estimation

Authors: Robert J. Webber, Erik H. Thiede, Douglas Dow, Aaron R. Dinner, Jonathan Weare

Abstract: Dynamical spectral estimation is a well-established numerical approach for estimating eigenvalues and eigenfunctions of the Markov transition operator from trajectory data. Although the approach has been widely applied in biomolecular simulations, its error properties remain poorly understood. Here we analyze the error of a dynamical spectral estimation method called "the variational approach to conformational dynamics" (VAC). We bound the approximation error and estimation error for VAC estimates. Our analysis establishes VAC's convergence properties and suggests new strategies for tuning VAC to improve accuracy.

Submitted 24 September, 2020; v1 submitted 5 May, 2020; originally announced May 2020.

Comments: 34 pages, 7 figures

MSC Class: 65C05; 60J35; 65N30

arXiv:2005.00654 [pdf, other]

doi 10.1021/acs.jctc.0c00437

Improved Fast Randomized Iteration Approach to Full Configuration Interaction

Authors: Samuel M. Greene, Robert J. Webber, Jonathan Weare, Timothy C. Berkelbach

Abstract: We present three modifications to our recently introduced fast randomized iteration method for full configuration interaction (FCI-FRI) and investigate their effects on the method's performance for Ne, H$_2$O, and N$_2$. The initiator approximation, originally developed for full configuration interaction quantum Monte Carlo, significantly reduces statistical error in FCI-FRI when few samples are used in compression operations, enabling its application to larger chemical systems. The semi-stochastic extension, which involves exactly preserving a fixed subset of elements in each compression, improves statistical efficiency in some cases but reduces it in others. We also developed a new approach to sampling excitations that yields consistent improvements in statistical efficiency and reductions in computational cost. We discuss possible strategies based on our findings for improving the performance of stochastic quantum chemistry methods more generally.

Submitted 20 July, 2020; v1 submitted 1 May, 2020; originally announced May 2020.

Comments: 13 pages, 5 figures

arXiv:2004.12023 [pdf, other]

doi 10.1063/5.0004997

NWChem: Past, Present, and Future

Authors: E. Aprà, E. J. Bylaska, W. A. de Jong, N. Govind, K. Kowalski, T. P. Straatsma, M. Valiev, H. J. J. van Dam, Y. Alexeev, J. Anchell, V. Anisimov, F. W. Aquino, R. Atta-Fynn, J. Autschbach, N. P. Bauman, J. C. Becca, D. E. Bernholdt, K. Bhaskaran-Nair, S. Bogatko, P. Borowski, J. Boschen, J. Brabec, A. Bruner, E. Cauët, Y. Chen, et al. (89 additional authors not shown)

Abstract: Specialized computational chemistry packages have permanently reshaped the landscape of chemical and materials science by providing tools to support and guide experimental efforts and for the prediction of atomistic and electronic properties. In this regard, electronic structure packages have played a special role by using first-principle-driven methodologies to model complex chemical and materials processes. Over the last few decades, the rapid development of computing technologies and the tremendous increase in computational power have offered a unique chance to study complex transformations using sophisticated and predictive many-body techniques that describe correlated behavior of electrons in molecular and condensed phase systems at different levels of theory. In enabling these simulations, novel parallel algorithms have been able to take advantage of computational resources to address the polynomial scaling of electronic structure methods. In this paper, we briefly review the NWChem computational chemistry suite, including its history, design principles, parallel tools, current capabilities, outreach and outlook.

Submitted 26 May, 2020; v1 submitted 24 April, 2020; originally announced April 2020.

Comments: This article appeared in volume 152, issue 18, page 184102 of the Journal of Chemical Physics. It can be found at https://doi.org/10.1063/5.0004997

Journal ref: J. Chem. Phys., 152, 184102 (2020)

arXiv:1912.08081 [pdf, other]

A Kinetic Monte Carlo Approach for Simulating Cascading Transmission Line Failure

Authors: Jacob Roth, David A. Barajas-Solano, Panos Stinis, Jonathan Weare, Mihai Anitescu

Abstract: In this work, cascading transmission line failures are studied through a dynamical model of the power system operating under fixed conditions. The power grid is modeled as a stochastic dynamical system where first-principles electromechanical dynamics are excited by small Gaussian disturbances in demand and generation around a specified operating point. In this context, a single line failure is interpreted in a large deviation context as a first escape event across a surface in phase space defined by line security constraints. The resulting system of stochastic differential equations admits a transverse decomposition of the drift, which leads to considerable simplification in evaluating the quasipotential (rate function) and, consequently, computation of exit rates. Tractable expressions for the rate of transmission line failure in a restricted network are derived from large deviation theory arguments and validated against numerical simulations. Extensions to realistic settings are considered, and individual line failure models are aggregated into a Markov model of cascading failure inspired by chemical kinetics. Cascades are generated by traversing a graph composed of weighted edges representing transitions to degraded network topologies. Numerical results indicate that the Markov model can produce cascades with qualitative power-law properties similar to those observed in empirical cascades.

Submitted 15 December, 2019; originally announced December 2019.

MSC Class: 60H30; 68U20; 37H10

arXiv:1905.00995 [pdf, other]

doi 10.1021/acs.jctc.9b00422

Beyond Walkers in Stochastic Quantum Chemistry: Reducing Error using Fast Randomized Iteration

Authors: Samuel M. Greene, Robert J. Webber, Jonathan Weare, Timothy C. Berkelbach

Abstract: We introduce a family of methods for the full configuration interaction problem in quantum chemistry, based on the fast randomized iteration (FRI) framework [L.-H. Lim and J. Weare, SIAM Rev. 59, 547 (2017)]. These methods, which we term "FCI-FRI," stochastically impose sparsity during iterations of the power method and can be viewed as a generalization of full configuration interaction quantum Monte Carlo (FCIQMC) without walkers. In addition to the multinomial scheme commonly used to sample excitations in FCIQMC, we present a systematic scheme where excitations are not sampled independently. Performing ground-state calculations on five small molecules at fixed cost, we find that the systematic FCI-FRI scheme is 11 to 45 times more statistically efficient than the multinomial FCI-FRI scheme, which is in turn 1.4 to 178 times more statistically efficient than the original FCIQMC algorithm.

Submitted 9 July, 2019; v1 submitted 2 May, 2019; originally announced May 2019.

Comments: 19 pages, 7 figures

arXiv:1905.00515 [pdf, other]

doi 10.1029/2018MS001419

Maximizing simulated tropical cyclone intensity with action minimization

Authors: David A. Plotkin, Robert J. Webber, Morgan E O'Neill, Jonathan Weare, Dorian S. Abbot

Abstract: Direct computer simulation of intense tropical cyclones (TCs) in weather models is limited by computational expense. Intense TCs are rare and have small-scale structures, making it difficult to produce large ensembles of storms at high resolution. Further, models often fail to capture the process of rapid intensification, which is a distinguishing feature of many intense TCs. Understanding rapid intensification is especially important in the context of global warming, which may increase the frequency of intense TCs. To better leverage computational resources for the study of rapid intensification, we introduce an action minimization algorithm applied to the WRF and WRFPLUS models. Action minimization nudges the model into forming more intense TCs than it otherwise would; it does so via the maximum likelihood path in a stochastic formulation of the model, thereby allowing targeted study of intensification mechanisms. We apply action minimization to simulations of Hurricanes Danny (2015) and Fred (2009) at 6 km resolution to demonstrate that the algorithm consistently intensifies TCs via physically plausible pathways. We show an approximately ten-fold computational savings using action minimization to study the tail of the TC intensification distribution. Further, for Hurricanes Danny and Fred, action minimization produces perturbations that preferentially reduce low-level shear as compared to upper-level shear, at least above a threshold of approximately $4 \mathrm{\ m \ s^{-1}}$. We also demonstrate that asymmetric, time-dependent patterns of heating can cause significant TC intensification beyond symmetric, azimuthally-averaged heating and find a regime of non-linear response to asymmetric heating that has not been extensively studied in previous work.

Submitted 1 May, 2019; originally announced May 2019.

arXiv:1904.03464 [pdf, other]

doi 10.1063/1.5081461

Practical rare event sampling for extreme mesoscale weather

Authors: Robert J. Webber, David A. Plotkin, Morgan E O'Neill, Dorian S. Abbot, Jonathan Weare

Abstract: Extreme mesoscale weather, including tropical cyclones, squall lines, and floods, can be enormously damaging and yet challenging to simulate; hence, there is a pressing need for more efficient simulation strategies. Here we present a new rare event sampling algorithm called Quantile Diffusion Monte Carlo (Quantile DMC). Quantile DMC is a simple-to-use algorithm that can sample extreme tail behavior for a wide class of processes. We demonstrate the advantages of Quantile DMC compared to other sampling methods and discuss practical aspects of implementing Quantile DMC. To test the feasibility of Quantile DMC for extreme mesoscale weather, we sample extremely intense realizations of two historical tropical cyclones, 2010 Hurricane Earl and 2015 Hurricane Joaquin. Our results demonstrate Quantile DMC's potential to provide low-variance extreme weather statistics while highlighting the work that is necessary for Quantile DMC to attain greater efficiency in future applications.

Submitted 6 April, 2019; originally announced April 2019.

Comments: 18 pages, 9 figures

arXiv:1902.03497 [pdf, other]

Symmetry Breaking in Density Functional Theory due to Dirac Exchange for a Hydrogen Molecule

Authors: Michael Holst, Houdong Hu, Jianfeng Lu, Jeremy L. Marzuola, Duo Song, John Weare

Abstract: We study symmetry breaking in the mean field solutions to the 2 electron hydrogen molecule within Kohn Sham (KS) local spin density function theory with Dirac exchange (the XLDA model). This simplified model shows behavior related to that of the (KS) spin density functional theory (SDFT) predictions in condensed and molecular systems. The Kohn Sham solutions to the constrained SDFT variation problem undergo spontaneous symmetry breaking as the relative strength of the non-convex exchange term increases. This results in the change of the molecular ground state from a paramagnetic state to an antiferromagnetic ground state and a stationary symmetric delocalized 1st excited state. We further characterize the limiting behavior of the minimizer when the strength of the exchange term goes to infinity. This leads to further bifurcations and highly localized states with varying character. The stability of the various solution classes is demonstrated by Hessian analysis. Finite element numerical results provide support for the formal conjectures.

Submitted 22 February, 2021; v1 submitted 9 February, 2019; originally announced February 2019.

Comments: 33 pages, 6 figures; many improvements and clarifications made due to an anonymous referee

MSC Class: 35Q40

arXiv:1810.01841 [pdf, other]

doi 10.1063/1.5063730

Galerkin Approximation of Dynamical Quantities using Trajectory Data

Authors: Erik H. Thiede, Dimitrios Giannakis, Aaron R. Dinner, Jonathan Weare

Abstract: Understanding chemical mechanisms requires estimating dynamical statistics such as expected hitting times, reaction rates, and committors. Here, we present a general framework for calculating these dynamical quantities by approximating boundary value problems using dynamical operators with a Galerkin expansion. A specific choice of basis set in the expansion corresponds to estimation of dynamical quantities using a Markov state model. More generally, the boundary conditions impose restrictions on the choice of basis sets. We demonstrate how an alternative basis can be constructed using ideas from diffusion maps. In our numerical experiments, this basis gives results of comparable or better accuracy to Markov state models. Additionally, we show that delay embedding can reduce the information lost when projecting the system's dynamics for model construction; this improves estimates of dynamical statistics considerably over the standard practice of increasing the lag time.

Submitted 26 February, 2019; v1 submitted 3 October, 2018; originally announced October 2018.

arXiv:1806.02420 [pdf, other]

Simulating the stochastic dynamics and cascade failure of power networks

Authors: Charles Matthews, Bradly Stadie, Jonathan Weare, Mihai Anitescu, Christopher Demarco

Abstract: For large-scale power networks, the failure of particular transmission lines can offload power to other lines and cause self-protection trips to activate, instigating a cascade of line failures. In extreme cases, this can bring down the entire network. Learning where the vulnerabilities are and the expected timescales for which failures are likely is an active area of research. In this article we present a novel stochastic dynamics model for a large-scale power network along with a framework for efficient computer simulation of the model including long timescale events such as cascade failure. We build on an existing Hamiltonian formulation and introduce stochastic forcing and damping components to simulate small perturbations to the network. Our model and simulation framework allow assessment of the particular weaknesses in a power network that make it susceptible to cascade failure, along with the timescales and mechanism for expected failures.

Submitted 6 June, 2018; originally announced June 2018.

arXiv:1805.08863 [pdf, other]

Langevin Markov Chain Monte Carlo with stochastic gradients

Authors: Charles Matthews, Jonathan Weare

Abstract: Monte Carlo sampling techniques have broad applications in machine learning, Bayesian posterior inference, and parameter estimation. Often the target distribution takes the form of a product distribution over a dataset with a large number of entries. For sampling schemes utilizing gradient information it is cheaper for the derivative to be approximated using a random small subset of the data, introducing extra noise into the system. We present a new discretization scheme for underdamped Langevin dynamics when utilizing a stochastic (noisy) gradient. This scheme is shown to bias computed averages to second order in the stepsize while giving exact results in the special case of sampling a Gaussian distribution with a normally distributed stochastic gradient.

Submitted 17 September, 2019; v1 submitted 22 May, 2018; originally announced May 2018.

arXiv:1712.05024 [pdf, other]

doi 10.1093/mnras/sty2140

Umbrella sampling: a powerful method to sample tails of distributions

Authors: Charles Matthews, Jonathan Weare, Andrey Kravtsov, Elise Jennings

Abstract: We present the umbrella sampling (US) technique and show that it can be used to sample extremely low probability areas of the posterior distribution that may be required in statistical analyses of data. In this approach sampling of the target likelihood is split into sampling of multiple biased likelihoods confined within individual umbrella windows. We show that the US algorithm is efficient and highly parallel and that it can be easily used with other existing MCMC samplers. The method allows the user to capitalize on their intuition and define umbrella windows and increase sampling accuracy along specific directions in the parameter space. Alternatively, one can define umbrella windows using an approach similar to parallel tempering. We provide a public code that implements umbrella sampling as a standalone python package. We present a number of tests illustrating the power of the US method in sampling low probability areas of the posterior and show that this ability allows a considerably more robust sampling of multi-modal distributions compared to the standard sampling methods. We also present an application of the method in a real world example of deriving cosmological constraints using the supernova type Ia data. We show that umbrella sampling can sample the posterior accurately down to the $\approx 15σ$ credible region in the $Ω_{\rm m}-Ω_Λ$ plane, while for the same computational work the affine-invariant MCMC sampling implemented in the {\tt emcee} code samples the posterior reliably only to $\approx 3σ$.

Submitted 13 December, 2017; originally announced December 2017.

Comments: submitted to MNRAS, 10 pages, 6 figures. Code implementing the umbrella sampling method with examples of use is available at https://github.com/c-matthews/usample

arXiv:1705.08445 [pdf, other]

Stratification as a general variance reduction method for Markov chain Monte Carlo

Authors: Aaron R. Dinner, Erik Thiede, Brian Van Koten, Jonathan Weare

Abstract: The Eigenvector Method for Umbrella Sampling (EMUS) belongs to a popular class of methods in statistical mechanics which adapt the principle of stratified survey sampling to the computation of free energies. We develop a detailed theoretical analysis of EMUS. Based on this analysis, we show that EMUS is an efficient general method for computing averages over arbitrary target distributions. In particular, we show that EMUS can be dramatically more efficient than direct MCMC when the target distribution is multimodal or when the goal is to compute tail probabilities. To illustrate these theoretical results, we present a tutorial application of the method to a problem from Bayesian statistics.

Submitted 19 June, 2020; v1 submitted 23 May, 2017; originally announced May 2017.

Comments: 52 pages, 11 figures

MSC Class: 65C05; 65C60

arXiv:1610.09426 [pdf, other]

Trajectory stratification of stochastic dynamics

Authors: Aaron R. Dinner, Jonathan C. Mattingly, Jeremy O. B. Tempkin, Brian Van Koten, Jonathan Weare

Abstract: We present a general mathematical framework for trajectory stratification for simulating rare events. Trajectory stratification involves decomposing trajectories of the underlying process into fragments limited to restricted regions of state space (strata), computing averages over the distributions of the trajectory fragments within the strata with minimal communication between them, and combining those averages with appropriate weights to yield averages with respect to the original underlying process. Our framework reveals the full generality and flexibility of trajectory stratification, and it illuminates a common mathematical structure shared by existing algorithms for sampling rare events. We demonstrate the power of the framework by defining strata in terms of both points in time and path-dependent variables for efficiently estimating averages that were not previously tractable.

Submitted 10 November, 2017; v1 submitted 28 October, 2016; originally announced October 2016.

Comments: 19 pages, 8 figures, 4 pages supplementary material

arXiv:1607.03954 [pdf, other]

Ensemble preconditioning for Markov chain Monte Carlo simulation

Authors: Charles Matthews, Jonathan Weare, Benedict Leimkuhler

Abstract: We describe parallel Markov chain Monte Carlo methods that propagate a collective ensemble of paths, with local covariance information calculated from neighboring replicas. The use of collective dynamics eliminates multiplicative noise and stabilizes the dynamics thus providing a practical approach to difficult anisotropic sampling problems in high dimensions. Numerical experiments with model problems demonstrate that dramatic potential speedups, compared to various alternative schemes, are attainable.

Submitted 13 July, 2016; originally announced July 2016.

arXiv:1603.04505 [pdf, ps, other]

doi 10.1063/1.4960649

Eigenvector method for umbrella sampling enables error analysis

Authors: Erik Thiede, Brian Van Koten, Jonathan Weare, Aaron R. Dinner

Abstract: Umbrella sampling efficiently yields equilibrium averages that depend on exploring rare states of a model by biasing simulations to windows of coordinate values and then combining the resulting data with physical weighting. Here, we introduce a mathematical framework that casts the step of combining the data as an eigenproblem. The advantage to this approach is that it facilitates error analysis. We discuss how the error scales with the number of windows. Then, we derive a central limit theorem for averages that are obtained from umbrella sampling. The central limit theorem suggests an estimator of the error contributions from individual windows, and we develop a simple and computationally inexpensive procedure for implementing it. We demonstrate this estimator for simulations of the alanine dipeptide and show that it emphasizes low free energy pathways between stable states in comparison to existing approaches for assessing error contributions. We discuss the possibility of using the estimator and, more generally, the eigenvector method for umbrella sampling to guide adaptation of the simulation parameters to accelerate convergence.

Submitted 14 March, 2016; originally announced March 2016.

arXiv:1508.06104 [pdf, other]

doi 10.1137/15M1040827

Fast randomized iteration: diffusion Monte Carlo through the lens of numerical linear algebra

Authors: Lek-Heng Lim, Jonathan Weare

Abstract: We review the basic outline of the highly successful diffusion Monte Carlo technique commonly used in contexts ranging from electronic structure calculations to rare event simulation and data assimilation, and propose a new class of randomized iterative algorithms based on similar principles to address a variety of common tasks in numerical linear algebra. From the point of view of numerical linear algebra, the main novelty of the Fast Randomized Iteration schemes described in this article is that they work in either linear or constant cost per iteration (and in total, under appropriate conditions) and are rather versatile: we will show how they apply to solution of linear systems, eigenvalue problems, and matrix exponentiation, in dimensions far beyond the present limits of numerical linear algebra. While traditional iterative methods in numerical linear algebra were created in part to deal with instances where a matrix (of size $\mathcal{O}(n^2)$) is too big to store, the algorithms that we propose are effective even in instances where the solution vector itself (of size $\mathcal{O}(n)$) may be too big to store or manipulate. In fact, our work is motivated by recent DMC based quantum Monte Carlo schemes that have been applied to matrices as large as $10^{108} \times 10^{108}$. We provide basic convergence results, discuss the dependence of these results on the dimension of the system, and demonstrate dramatic cost savings on a range of test problems.

Submitted 9 October, 2017; v1 submitted 25 August, 2015; originally announced August 2015.

Comments: 44 pages, 7 figures

MSC Class: 65C05; 65F10; 65F15; 65F60; 68W20

Journal ref: SIAM Review, 59 (2017), no. 3, pp. 547--587

arXiv:1410.1431 [pdf, other]

Sharp entrywise perturbation bounds for Markov chains

Authors: Erik Thiede, Brian Van Koten, Jonathan Weare

Abstract: For many Markov chains of practical interest, the invariant distribution is extremely sensitive to perturbations of some entries of the transition matrix, but insensitive to others; we give an example of such a chain, motivated by a problem in computational statistical physics. We have derived perturbation bounds on the relative error of the invariant distribution that reveal these variations in sensitivity. Our bounds are sharp, we do not impose any structural assumptions on the transition matrix or on the perturbation, and computing the bounds has the same complexity as computing the invariant distribution or computing other bounds in the literature. Moreover, our bounds have a simple interpretation in terms of hitting times, which can be used to draw intuitive but rigorous conclusions about the sensitivity of a chain to various types of perturbations.

Submitted 9 October, 2015; v1 submitted 6 October, 2014; originally announced October 2014.

MSC Class: 60J10; 15B51; 65C40; 15A18; 65F15

arXiv:1404.2928 [pdf, other]

The Brownian fan

Authors: Martin Hairer, Jonathan Weare

Abstract: We provide a mathematical study of the modified Diffusion Monte Carlo (DMC) algorithm introduced in the companion article \cite{DMC}. DMC is a simulation technique that uses branching particle systems to represent expectations associated with Feynman-Kac formulae. We provide a detailed heuristic explanation of why, in cases in which a stochastic integral appears in the Feynman-Kac formula (e.g. in rare event simulation, continuous time filtering, and other settings), the new algorithm is expected to converge in a suitable sense to a limiting process as the time interval between branching steps goes to 0. The situation studied here stands in stark contrast to the "naïve" generalisation of the DMC algorithm which would lead to an exponential explosion of the number of particles, thus precluding the existence of any finite limiting object. Convergence is shown rigorously in the simplest possible situation of a random walk, biased by a linear potential. The resulting limiting object, which we call the "Brownian fan", is a very natural new mathematical object of independent interest.

Submitted 9 April, 2014; originally announced April 2014.

Comments: 53 pages, 2 figures. Formerly 2nd part of arXiv:1207.2866

MSC Class: 82B80; 60G35

arXiv:1304.3525 [pdf, other]

doi 10.1103/PhysRevE.88.032403

The relaxation of a family of broken bond crystal surface models

Authors: Jeremy L. Marzuola, Jonathan Weare

Abstract: We study the continuum limit of a family of kinetic Monte Carlo models of crystal surface relaxation that includes both the solid-on-solid and discrete Gaussian models. With computational experiments and theoretical arguments we are able to derive several partial differential equation limits identified (or nearly identified) in previous studies and to clarify the correct choice of surface tension appearing in the PDE and the correct scaling regime giving rise to each PDE. We also provide preliminary computational investigations of a number of interesting qualitative features of the large scale behavior of the models.

Submitted 11 April, 2013; originally announced April 2013.

Comments: 35 pages, 19 figures

arXiv:1207.2866 [pdf, other]

Improved diffusion Monte Carlo

Authors: Martin Hairer, Jonathan Weare

Abstract: We propose a modification, based on the RESTART (repetitive simulation trials after reaching thresholds) and DPR (dynamics probability redistribution) rare event simulation algorithms, of the standard diffusion Monte Carlo (DMC) algorithm. The new algorithm has a lower variance per workload, regardless of the regime considered. In particular, it makes it feasible to use DMC in situations where the "naïve" generalisation of the standard algorithm would be impractical, due to an exponential explosion of its variance. We numerically demonstrate the effectiveness of the new algorithm on a standard rare event simulation problem (probability of an unlikely transition in a Lennard-Jones cluster), as well as a high-frequency data assimilation problem.

Submitted 9 April, 2014; v1 submitted 12 July, 2012; originally announced July 2012.

Comments: 24 pages; 5 figures

MSC Class: 82B80; 60G35

arXiv:1202.4952 [pdf, other]

doi 10.1175/MWR-D-12-00060.1

Data assimilation in the low noise regime with application to the Kuroshio

Authors: Eric Vanden-Eijnden, Jonathan Weare

Abstract: On-line data assimilation techniques such as ensemble Kalman filters and particle filters lose accuracy dramatically when presented with an unlikely observation. Such an observation may be caused by an unusually large measurement error or reflect a rare fluctuation in the dynamics of the system. Over a long enough span of time it becomes likely that one or several of these events will occur. Often they are signatures of the most interesting features of the underlying system and their prediction becomes the primary focus of the data assimilation procedure. The Kuroshio or Black Current that runs along the eastern coast of Japan is an example of such a system. It undergoes infrequent but dramatic changes of state between a small meander during which the current remains close to the coast of Japan, and a large meander during which it bulges away from the coast. Because of the important role that the Kuroshio plays in distributing heat and salinity in the surrounding region, prediction of these transitions is of acute interest. Here we focus on a regime in which both the stochastic forcing on the system and the observational noise are small. In this setting large deviation theory can be used to understand why standard filtering methods fail and guide the design of the more effective data assimilation techniques. Motivated by our analysis we propose several data assimilation strategies capable of efficiently handling rare events such as the transitions of the Kuroshio. These techniques are tested on a model of the Kuroshio and shown to perform much better than standard filtering methods.

Submitted 14 March, 2014; v1 submitted 22 February, 2012; originally announced February 2012.

Comments: 43 pages, 12 figures

Journal ref: Monthly Weather Review (2013), 141, 1822-1841

arXiv:1202.0316 [pdf, other]

doi 10.1063/1.4724301

Steered Transition Path Sampling

Authors: Nicholas Guttenberg, Aaron R. Dinner, Jonathan Weare

Abstract: We introduce a path sampling method for obtaining statistical properties of an arbitrary stochastic dynamics. The method works by decomposing a trajectory in time, estimating the probability of satisfying a progress constraint, modifying the dynamics based on that probability, and then reweighting to calculate averages. Because the progress constraint can be formulated in terms of occurrences of events within time intervals, the method is particularly well suited for controlling the sampling of currents of dynamic events. We demonstrate the method for calculating transition probabilities in barrier crossing problems and survival probabilities in strongly diffusive systems with absorbing states, which are difficult to treat by shooting. We discuss the relation of the algorithm to other methods.

Submitted 1 February, 2012; originally announced February 2012.

Comments: 11 pages, 8 figures

arXiv:1104.2612 [pdf, ps, other]

doi 10.1088/0004-637X/745/2/198

An Affine-Invariant Sampler for Exoplanet Fitting and Discovery in Radial Velocity Data

Authors: Fengji Hou, Jonathan Goodman, David W. Hogg, Jonathan Weare, Christian Schwab

Abstract: Markov Chain Monte Carlo (MCMC) proves to be powerful for Bayesian inference and in particular for exoplanet radial velocity fitting because MCMC provides more statistical information and makes better use of data than common approaches like chi-square fitting. However, the non-linear density functions encountered in these problems can make MCMC time-consuming. In this paper, we apply an ensemble sampler respecting affine invariance to orbital parameter extraction from radial velocity data. This new sampler has only one free parameter, and it does not require much tuning for good performance, which is important for automatization. The autocorrelation time of this sampler is approximately the same for all parameters and far smaller than Metropolis-Hastings, which means it requires many fewer function calls to produce the same number of independent samples. The affine-invariant sampler speeds up MCMC by hundreds of times compared with Metropolis-Hastings in the same computing situation. This novel sampler would be ideal for projects involving large datasets such as statistical investigations of planet distribution. The biggest obstacle to ensemble samplers is the existence of multiple local optima; we present a clustering technique to deal with local optima by clustering based on the likelihood of the walkers in the ensemble. We demonstrate the effectiveness of the sampler on real radial velocity data.

Submitted 30 November, 2011; v1 submitted 13 April, 2011; originally announced April 2011.

Comments: 24 pages, 7 figures, accepted to ApJ

Journal ref: 2012, ApJ, 745, 198

arXiv:0709.1721 [pdf, other]

Parallel marginalization Monte Carlo with applications to conditional path sampling

Authors: Jonathan Weare

Abstract: Monte Carlo sampling methods often suffer from long correlation times. Consequently, these methods must be run for many steps to generate an independent sample. In this paper a method is proposed to overcome this difficulty. The method utilizes information from rapidly equilibrating coarse Markov chains that sample marginal distributions of the full system. This is accomplished through exchanges between the full chain and the auxiliary coarse chains. Results of numerical tests on the bridge sampling and filtering/smoothing problems for a stochastic differential equation are presented.

Submitted 11 September, 2007; originally announced September 2007.

Showing 1–50 of 57 results for author: Weare, J