-
OrbNet Denali: A machine learning potential for biological and organic chemistry with semi-empirical cost and DFT accuracy
Authors:
Anders S. Christensen,
Sai Krishna Sirumalla,
Zhuoran Qiao,
Michael B. O'Connor,
Daniel G. A. Smith,
Feizhi Ding,
Peter J. Bygrave,
Animashree Anandkumar,
Matthew Welborn,
Frederick R. Manby,
Thomas F. Miller III
Abstract:
We present OrbNet Denali, a machine learning model for electronic structure that is designed as a drop-in replacement for ground-state density functional theory (DFT) energy calculations. The model is a message-passing neural network that uses symmetry-adapted atomic orbital features from a low-cost quantum calculation to predict the energy of a molecule. OrbNet Denali is trained on a vast dataset of 2.3 million DFT calculations on molecules and geometries. This dataset covers the most common elements in bio- and organic chemistry (H, Li, B, C, N, O, F, Na, Mg, Si, P, S, Cl, K, Ca, Br, I) as well as charged molecules. OrbNet Denali is demonstrated on several well-established benchmark datasets, and we find that it provides accuracy that is on par with modern DFT methods while offering a speedup of up to three orders of magnitude. For the GMTKN55 benchmark set, OrbNet Denali achieves WTMAD-1 and WTMAD-2 scores of 7.19 and 9.84, on par with modern DFT functionals. For several GMTKN55 subsets, which contain chemical problems that are not present in the training set, OrbNet Denali produces a mean absolute error comparable to those of DFT methods. For the Hutchison conformers benchmark set, OrbNet Denali has a median correlation coefficient of R^2=0.90 compared to the reference DLPNO-CCSD(T) calculation, and R^2=0.97 compared to the method used to generate the training data (wB97X-D3/def2-TZVP), exceeding the performance of any other method with a similar cost. Similarly, the model reaches chemical accuracy for non-covalent interactions in the S66x10 dataset. For torsional profiles, OrbNet Denali reproduces the torsion profiles of wB97X-D3/def2-TZVP with an average MAE of 0.12 kcal/mol for the potential energy surfaces of the diverse fragments in the TorsionNet500 dataset.
Submitted 2 July, 2021; v1 submitted 1 July, 2021;
originally announced July 2021.
-
Multi-task learning for electronic structure to predict and explore molecular potential energy surfaces
Authors:
Zhuoran Qiao,
Feizhi Ding,
Matthew Welborn,
Peter J. Bygrave,
Daniel G. A. Smith,
Animashree Anandkumar,
Frederick R. Manby,
Thomas F. Miller III
Abstract:
We refine the OrbNet model to accurately predict energy, forces, and other response properties for molecules using a graph neural-network architecture based on features from low-cost approximated quantum operators in the symmetry-adapted atomic orbital basis. The model is end-to-end differentiable due to the derivation of analytic gradients for all electronic structure terms, and is shown to be transferable across chemical space due to the use of domain-specific features. The learning efficiency is improved by incorporating physically motivated constraints on the electronic structure through multi-task learning. The model outperforms existing methods on energy prediction tasks for the QM9 dataset and for molecular geometry optimizations on conformer datasets, at a computational cost that is thousand-fold or more reduced compared to conventional quantum-chemistry calculations (such as density functional theory) that offer similar accuracy.
Submitted 1 December, 2020; v1 submitted 5 November, 2020;
originally announced November 2020.
-
The CECAM Electronic Structure Library and the modular software development paradigm
Authors:
Micael J. T. Oliveira,
Nick Papior,
Yann Pouillon,
Volker Blum,
Emilio Artacho,
Damien Caliste,
Fabiano Corsetti,
Stefano de Gironcoli,
Alin M. Elena,
Alberto Garcia,
Victor M. Garcia-Suarez,
Luigi Genovese,
William P. Huhn,
Georg Huhs,
Sebastian Kokott,
Emine Kucukbenli,
Ask H. Larsen,
Alfio Lazzaro,
Irina V. Lebedeva,
Yingzhou Li,
David Lopez-Duran,
Pablo Lopez-Tarifa,
Martin Luders,
Miguel A. L. Marques,
Jan Minar,
et al. (12 additional authors not shown)
Abstract:
First-principles electronic structure calculations are very widely used thanks to the many successful software packages available. Their traditional coding paradigm is monolithic, i.e., regardless of how modular its internal structure may be, the code is built independently from others, from the compiler up, with the exception of linear-algebra and message-passing libraries. This model has been quite successful for decades. The rapid progress in methodology, however, has resulted in an ever-increasing complexity of those programs, which implies a growing amount of replication in coding and in the recurrent re-engineering needed to adapt to evolving hardware architecture. The Electronic Structure Library (ESL) was initiated by CECAM (European Centre for Atomic and Molecular Calculations) to catalyze a paradigm shift away from the monolithic model and promote modularization, with the ambition to extract common tasks from electronic structure programs and redesign them as free, open-source libraries. They include "heavy-duty" ones with a high degree of parallelisation, and potential for adaptation to novel hardware within them, thereby separating the sophisticated computer science aspects of performance optimization and re-engineering from the computational science done by scientists when implementing new ideas. It is a community effort, undertaken by developers of various successful codes, now facing the challenges arising in the new model. This modular paradigm will improve overall coding efficiency and enable specialists (computer scientists or computational scientists) to use their skills more effectively. It will lead to a more sustainable and dynamic evolution of software as well as lower barriers to entry for new developers.
Submitted 24 June, 2020; v1 submitted 11 May, 2020;
originally announced May 2020.
-
First-order symmetry-adapted perturbation theory for multiplet splittings
Authors:
Konrad Patkowski,
Piotr S. Żuchowski,
Daniel G. A. Smith
Abstract:
We present a symmetry-adapted perturbation theory (SAPT) for the interaction of two high-spin open-shell molecules (described by their restricted open-shell Hartree-Fock determinants) resulting in low-spin states of the complex. The previously available SAPT formalisms, except for some system-specific studies for few-electron complexes, were restricted to the high-spin state of the interacting system. Thus, the new approach provides, for the first time, a SAPT-based estimate of the splittings between different spin states of the complex. We have derived and implemented the lowest-order SAPT term responsible for these splittings, that is, the first-order exchange energy. We show that within the so-called S^2 approximation commonly used in SAPT (neglecting effects that vanish as fourth or higher powers of intermolecular overlap integrals), the first-order exchange energies for all multiplets are linear combinations of two matrix elements: a diagonal exchange term that determines the spin-averaged effect and a spin-flip term responsible for the splittings between the states. The numerical factors in this linear combination are determined solely by the Clebsch-Gordan coefficients: accordingly, the S^2 approximation implies a Heisenberg Hamiltonian picture with a single coupling strength parameter determining all the splittings. The new approach is cast into both molecular-orbital and atomic-orbital expressions: the latter enable an efficient density-fitted implementation. We test the newly developed formalism on several open-shell complexes ranging from diatomic systems (Li+H, Mn+Mn, ...) to the phenalenyl dimer.
Submitted 12 January, 2018;
originally announced January 2018.