-
Machine-learned molecular mechanics force field for the simulation of protein-ligand systems and beyond
Authors:
Kenichiro Takaba,
Iván Pulido,
Pavan Kumar Behara,
Chapin E. Cavender,
Anika J. Friedman,
Michael M. Henry,
Hugo MacDermott Opeskin,
Christopher R. Iacovella,
Arnav M. Nagle,
Alexander Matthew Payne,
Michael R. Shirts,
David L. Mobley,
John D. Chodera,
Yuanqing Wang
Abstract:
The development of reliable and extensible molecular mechanics (MM) force fields -- fast, empirical models characterizing the potential energy surface of molecular systems -- is indispensable for biomolecular simulation and computer-aided drug design. Here, we introduce a generalized and extensible machine-learned MM force field, \texttt{espaloma-0.3}, and an end-to-end differentiable framework using graph neural networks to overcome the limitations of traditional rule-based methods. Trained in a single GPU-day to fit a large and diverse quantum chemical dataset of over 1.1M energy and force calculations, \texttt{espaloma-0.3} reproduces quantum chemical energetic properties of chemical domains highly relevant to drug discovery, including small molecules, peptides, and nucleic acids. Moreover, this force field maintains the quantum chemical energy-minimized geometries of small molecules and preserves the condensed phase properties of peptides, self-consistently parametrizing proteins and ligands to produce stable simulations leading to highly accurate predictions of binding free energies. This methodology demonstrates significant promise as a path forward for systematically building more accurate force fields that are easily extensible to new chemical domains of interest.
Submitted 8 December, 2023; v1 submitted 13 July, 2023;
originally announced July 2023.
-
Structure-Based Experimental Datasets for Benchmarking of Protein Simulation Force Fields
Authors:
Chapin E. Cavender,
David A. Case,
Julian C. -H. Chen,
Lillian T. Chong,
Daniel A. Keedy,
Kresten Lindorff-Larsen,
David L. Mobley,
O. H. Samuli Ollila,
Chris Oostenbrink,
Paul Robustelli,
Vincent A. Voelz,
Michael E. Wall,
David C. Wych,
Michael K. Gilson
Abstract:
This review article provides an overview of structurally oriented, experimental datasets that can be used to benchmark protein force fields, focusing on data generated by nuclear magnetic resonance (NMR) spectroscopy and room temperature (RT) protein crystallography. We discuss why these observables are useful for assessing force field accuracy, how they can be calculated from simulation trajectories, and statistical issues that arise when comparing simulations with experiment. The target audience for this article is computational researchers and trainees who develop, benchmark, or use protein force fields for molecular simulations.
Submitted 2 March, 2023;
originally announced March 2023.
-
Best practices for constructing, preparing, and evaluating protein-ligand binding affinity benchmarks
Authors:
David F. Hahn,
Christopher I. Bayly,
Hannah E. Bruce Macdonald,
John D. Chodera,
Vytautas Gapsys,
Antonia S. J. S. Mey,
David L. Mobley,
Laura Perez Benito,
Christina E. M. Schindler,
Gary Tresadern,
Gregory L. Warren
Abstract:
Free energy calculations are rapidly becoming indispensable in structure-enabled drug discovery programs. As new methods, force fields, and implementations are developed, assessing their expected accuracy on real-world systems (benchmarking) becomes critical to provide users with an assessment of the accuracy expected when these methods are applied within their domain of applicability, and developers with a way to assess the expected impact of new methodologies. These assessments require construction of a benchmark - a set of well-prepared, high quality systems with corresponding experimental measurements designed to ensure the resulting calculations provide a realistic assessment of expected performance when these methods are deployed within their domains of applicability. To date, the community has not yet adopted a common standardized benchmark, and existing benchmark reports suffer from a myriad of issues, including poor data quality, limited statistical power, and statistically deficient analyses, all of which can conspire to produce benchmarks that are poorly predictive of real-world performance. Here, we address these issues by presenting guidelines for (1) curating experimental data to develop meaningful benchmark sets, (2) preparing benchmark inputs according to best practices to facilitate widespread adoption, and (3) analysis of the resulting predictions to enable statistically meaningful comparisons among methods and force fields.
Submitted 12 November, 2021; v1 submitted 13 May, 2021;
originally announced May 2021.
-
Best Practices for Alchemical Free Energy Calculations
Authors:
Antonia S. J. S. Mey,
Bryce Allen,
Hannah E. Bruce Macdonald,
John D. Chodera,
Maximilian Kuhn,
Julien Michel,
David L. Mobley,
Levi N. Naden,
Samarjeet Prasad,
Andrea Rizzi,
Jenke Scheen,
Michael R. Shirts,
Gary Tresadern,
Huafeng Xu
Abstract:
Alchemical free energy calculations are a useful tool for predicting free energy differences associated with the transfer of molecules from one environment to another. The hallmark of these methods is the use of "bridging" potential energy functions representing \emph{alchemical} intermediate states that cannot exist as real chemical species. The data collected from these bridging alchemical thermodynamic states allows the efficient computation of transfer free energies (or differences in transfer free energies) with orders of magnitude less simulation time than simulating the transfer process directly. While these methods are highly flexible, care must be taken in avoiding common pitfalls to ensure that computed free energy differences can be robust and reproducible for the chosen force field, and that appropriate corrections are included to permit direct comparison with experimental data. In this paper, we review current best practices for several popular application domains of alchemical free energy calculations, including relative and absolute small molecule binding free energy calculations to biomolecular targets.
Submitted 21 August, 2020; v1 submitted 7 August, 2020;
originally announced August 2020.
-
Kinetics and Free Energy of Ligand Dissociation Using Weighted Ensemble Milestoning
Authors:
Dhiman Ray,
Trevor Gokey,
David L. Mobley,
Ioan Andricioaei
Abstract:
We consider the recently developed weighted ensemble milestoning (WEM) scheme [J. Chem. Phys. 152, 234114 (2020)], and test its capability of simulating ligand-receptor dissociation dynamics. We performed WEM simulations on the following host-guest systems: Na$^+$/Cl$^-$ ion pair and 4-hydroxy-2-butanone (BUT) ligand with FK506 binding protein (FKBP). As proof of principle, we show that the WEM formalism reproduces the Na$^+$/Cl$^-$ ion pair dissociation timescale and the free energy profile obtained from long conventional MD simulation. To increase accuracy of WEM calculations applied to kinetics and thermodynamics in protein-ligand binding, we introduced a modified WEM scheme called weighted ensemble milestoning with restraint release (WEM-RR), which can increase the number of starting points per milestone without additional computational cost. WEM-RR calculations obtained a ligand residence time and binding free energy in agreement with experimental and previous computational results. Moreover, using the milestoning framework, the binding time and rate constants, dissociation constant and the committor probabilities could also be calculated at a low computational cost. We also present an analytical approach for estimating the association rate constant ($k_{\text{on}}$) when binding is primarily diffusion driven. We show that the WEM method can efficiently calculate multiple experimental observables describing ligand-receptor binding/unbinding and is a promising candidate for computer-aided inhibitor design.
Submitted 30 September, 2020; v1 submitted 18 July, 2020;
originally announced July 2020.
-
Bayesian Model Averaging for Ensemble-Based Estimates of Solvation Free Energies
Authors:
Luke J. Gosink,
Christopher C. Overall,
Sarah M. Reehl,
Paul D. Whitney,
David L. Mobley,
Nathan A. Baker
Abstract:
This paper applies the Bayesian Model Averaging (BMA) statistical ensemble technique to estimate small molecule solvation free energies. There is a wide range of methods available for predicting solvation free energies, ranging from empirical statistical models to ab initio quantum mechanical approaches. Each of these methods is based on a set of conceptual assumptions that can affect predictive accuracy and transferability. Using an iterative statistical process, we have selected and combined solvation energy estimates using an ensemble of 17 diverse methods from the fourth Statistical Assessment of Modeling of Proteins and Ligands (SAMPL) blind prediction study to form a single, aggregated solvation energy estimate. The ensemble design process evaluates the statistical information in each individual method as well as the performance of the aggregate estimate obtained from the ensemble as a whole. Methods that possess minimal or redundant information are pruned from the ensemble and the evaluation process repeats until aggregate predictive performance can no longer be improved. We show that this process results in a final aggregate estimate that outperforms all individual methods by reducing estimate errors by as much as 91% to 1.2 kcal/mol accuracy. We also compare our iterative refinement approach to other statistical ensemble approaches and demonstrate that this iterative process reduces estimate errors by as much as 61%. This work provides a new approach for accurate solvation free energy prediction and lays the foundation for future work on aggregate models that can balance computational cost with prediction accuracy.
Submitted 14 December, 2016; v1 submitted 11 September, 2016;
originally announced September 2016.
-
A proposal for regularly updated review/survey articles: "Perpetual Reviews"
Authors:
David L. Mobley,
Daniel M. Zuckerman
Abstract:
We advocate the publication of review/survey articles that will be updated regularly, both in traditional journals and novel venues. We call these "perpetual reviews." This idea naturally builds on the dissemination and archival capabilities present in the modern internet, and indeed perpetual reviews exist already in some forms. Perpetual review articles allow authors to maintain over time the relevance of non-research scholarship that requires a significant investment of effort. Further, such reviews published in a purely electronic format without space constraints can also permit more pedagogical scholarship and clearer treatment of technical issues that remain obscure in a brief treatment.
Submitted 8 February, 2015; v1 submitted 3 February, 2015;
originally announced February 2015.
-
Modeling Amyloid Beta Peptide Insertion into Lipid Bilayers
Authors:
David L. Mobley,
Daniel L. Cox,
Rajiv R. P. Singh,
Michael W. Maddox,
Marjorie L. Longo
Abstract:
Inspired by recent suggestions that the Alzheimer's amyloid beta peptide (A-beta) can insert into cell membranes and form harmful ion channels, we model insertion of the 40 and 42 residue forms of the peptide into cell membranes using a Monte Carlo code which is specific at the amino acid level. We examine insertion of the regular A-beta peptide as well as mutants causing familial Alzheimer's disease, and find that all but one of the mutants change the insertion behavior by causing the peptide to spend more simulation steps in only one leaflet of the bilayer. We also find that A-beta 42, because of the extra hydrophobic residues relative to A-beta 40, is more likely to adopt this conformation than A-beta 40 in both wild-type and mutant forms. We argue qualitatively why these effects happen. Here, we present our results and develop the hypothesis that this partial insertion increases the probability of harmful channel formation. This hypothesis can partly explain why these mutations are neurotoxic simply due to peptide insertion behavior. We further apply this model to various artificial A-beta mutants which have been examined experimentally, and offer testable experimental predictions contrasting the roles of aggregation and insertion with regard to toxicity of A-beta mutants. These can be used through further experiments to test our hypothesis.
Submitted 23 January, 2004; v1 submitted 29 July, 2003;
originally announced July 2003.
-
Simulations of Oligomeric Intermediates in Prion Diseases
Authors:
David L. Mobley,
Daniel L. Cox,
Rajiv R. P. Singh,
Rahul V. Kulkarni,
Alexander Slepoy
Abstract:
We extend our previous stochastic cellular automata based model for areal aggregation of prion proteins on neuronal surfaces. The new anisotropic model allows us to simulate both strong beta-sheet and weaker attachment bonds between proteins. Constraining binding directions allows us to generate aggregate structures with the hexagonal lattice symmetry found in recently observed in vitro experiments. We argue that these constraints on rules may correspond to underlying steric constraints on the aggregation process. We find that monomer dominated growth of the areal aggregate is too slow to account for some observed doubling time-to-incubation time ratios inferred from data, and so consider aggregation dominated by relatively stable but non-infectious oligomeric intermediates. We compare a kinetic theory analysis of oligomeric aggregation to spatially explicit simulations of the process. We find that with suitable rules for misfolding of oligomers, possibly due to water exclusion by the surrounding aggregate, the resulting oligomeric aggregation model maps onto our previous monomer aggregation model. Therefore it can produce some of the same attractive features for the description of prion incubation time data. We propose experiments to test the oligomeric aggregation model.
Submitted 9 July, 2003;
originally announced July 2003.
-
Hysteresis loops of Co-Pt perpendicular magnetic multilayers
Authors:
David L. Mobley,
Christopher R. Pike,
Joseph E. Davies,
Daniel L. Cox,
Rajiv R. P. Singh
Abstract:
We develop a phenomenological model to study magnetic hysteresis in two samples designed as possible perpendicular recording media. A stochastic cellular automata model captures cooperative behavior in the nucleation of magnetic domains. We show how this simple model turns broad hysteresis loops into loops with sharp drops like those observed in these samples, and explains their unusual features. We also present, and experimentally verify, predictions of this model, and suggest how insights from this model may apply more generally.
Submitted 21 July, 2003;
originally announced July 2003.