Semiempirical Hamiltonians learned from data can have accuracy comparable to Density Functional Theory

Authors: Frank Hu, Francis He, David J. Yaron

Abstract: Quantum chemistry provides chemists with invaluable information, but the high computational cost limits the size and type of systems that can be studied. Machine learning (ML) has emerged as a means to dramatically lower cost while maintaining high accuracy. However, ML models often sacrifice interpretability by using components, such as the artificial neural networks of deep learning, that function as black boxes. These components impart the flexibility needed to learn from large volumes of data but make it difficult to gain insight into the physical or chemical basis for the predictions. Here, we demonstrate that semiempirical quantum chemical (SEQC) models can learn from large volumes of data without sacrificing interpretability. The SEQC model is that of Density Functional based Tight Binding (DFTB) with fixed atomic orbital energies and interactions that are one-dimensional functions of interatomic distance. This model is trained to ab initio data in a manner that is analogous to that used to train deep learning models. Using benchmarks that reflect the accuracy of the training data, we show that the resulting model maintains a physically reasonable functional form while achieving an accuracy, relative to coupled cluster energies with a complete basis set extrapolation (CCSD(T)*/CBS), that is comparable to that of density functional theory (DFT). This suggests that trained SEQC models can achieve low computational cost and high accuracy without sacrificing interpretability. Use of a physically-motivated model form also substantially reduces the amount of ab initio data needed to train the model compared to that required for deep learning models.

Submitted 10 January, 2023; v1 submitted 20 October, 2022; originally announced October 2022.

arXiv:1808.04526 [pdf, other]

A Density Functional Tight Binding Layer for Deep Learning of Chemical Hamiltonians

Authors: Haichen Li, Christopher Collins, Matteus Tanha, Geoffrey J. Gordon, David J. Yaron

Abstract: Current neural networks for predictions of molecular properties use quantum chemistry only as a source of training data. This paper explores models that use quantum chemistry as an integral part of the prediction process. This is done by implementing self-consistent-charge Density-Functional-Tight-Binding (DFTB) theory as a layer for use in deep learning models. The DFTB layer takes, as input, Hamiltonian matrix elements generated from earlier layers and produces, as output, electronic properties from self-consistent field solutions of the corresponding DFTB Hamiltonian. Backpropagation enables efficient training of the model to target electronic properties. Two types of input to the DFTB layer are explored, splines and feed-forward neural networks. Because overfitting can cause models trained on smaller molecules to perform poorly on larger molecules, regularizations are applied that penalize non-monotonic behavior and deviation of the Hamiltonian matrix elements from those of the published DFTB model used to initialize the model. The approach is evaluated on 15,700 hydrocarbons by comparing the root mean square error in energy and dipole moment, on test molecules with 8 heavy atoms, to the error from the initial DFTB model. When trained on molecules with up to 7 heavy atoms, the spline model reduces the test error in energy by 60% and in dipole moments by 42%. The neural network model performs somewhat better, with error reductions of 67% and 59% respectively. Training on molecules with up to 4 heavy atoms reduces performance, with both the spline and neural net models reducing the test error in energy by about 53% and in dipole by about 25%.

Submitted 20 August, 2018; v1 submitted 14 August, 2018; originally announced August 2018.

Comments: 22 pages, 10 figures, 3 tables
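
Illustrative sketch (not from the paper): the abstract above mentions regularizations that penalize non-monotonic behavior and deviation of the Hamiltonian matrix elements from the published DFTB model. The Python below shows one way such penalties could look for a matrix element sampled on a distance grid. The function names, the squared-hinge form, the weights, and the choice of "decreasing with distance" as the expected behavior are all assumptions made for illustration; the abstract does not specify them.

```python
def monotonic_penalty(values):
    """Sum of squared increases between neighbouring grid points.

    Returns 0.0 when `values` never increases. Treating "decreasing" as
    the expected direction is an assumption made for this sketch.
    """
    return sum(max(0.0, b - a) ** 2 for a, b in zip(values, values[1:]))


def deviation_penalty(values, reference_values):
    """Sum of squared differences from a reference (initial) curve."""
    return sum((v - r) ** 2 for v, r in zip(values, reference_values))


def regularized_loss(data_loss, values, reference_values,
                     w_mono=1.0, w_dev=0.1):
    """Data loss plus the two weighted penalties (weights are hypothetical)."""
    return (data_loss
            + w_mono * monotonic_penalty(values)
            + w_dev * deviation_penalty(values, reference_values))


# Made-up curves sampled at four increasing distances.
reference_curve = [1.0, 0.5, 0.25, 0.125]  # stands in for the initial model
smooth_curve = [1.0, 0.6, 0.3, 0.1]        # monotonic: no penalty
wiggly_curve = [1.0, 0.4, 0.6, 0.1]        # rises once: penalized

if __name__ == "__main__":
    print(monotonic_penalty(smooth_curve))
    print(monotonic_penalty(wiggly_curve))
    print(regularized_loss(0.0, wiggly_curve, reference_curve))
```

In this sketch only the upward step in `wiggly_curve` contributes to the monotonicity term, while the deviation term grows with distance from `reference_curve` regardless of shape.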

arXiv:1712.04516 [pdf, other]

doi 10.1039/C7ME00131B

Tuning the Molecular Weight Distribution from Atom Transfer Radical Polymerization Using Deep Reinforcement Learning

Authors: Haichen Li, Christopher R. Collins, Thomas G. Ribelli, Krzysztof Matyjaszewski, Geoffrey J. Gordon, Tomasz Kowalewski, David J. Yaron

Abstract: We devise a novel technique to control the shape of polymer molecular weight distributions (MWDs) in atom transfer radical polymerization (ATRP). This technique makes use of recent advances in both simulation-based, model-free reinforcement learning (RL) and the numerical simulation of ATRP. A simulation of ATRP is built that allows an RL controller to add chemical reagents throughout the course of the reaction. The RL controller incorporates fully-connected and convolutional neural network architectures and bases its decision upon the current status of the ATRP reaction. The initial, untrained, controller leads to ending MWDs with large variability, allowing the RL algorithm to explore a large search space. When trained using an actor-critic algorithm, the RL controller is able to discover and optimize control policies that lead to a variety of target MWDs. The target MWDs include Gaussians of various width, and more diverse shapes such as bimodal distributions. The learned control policies are robust and transfer to similar but not identical ATRP reaction settings, even under the presence of simulated noise. We believe this work is a proof-of-concept for employing modern artificial intelligence techniques in the synthesis of new functional polymer materials.

Submitted 21 March, 2018; v1 submitted 10 December, 2017; originally announced December 2017.

Comments: 18 pages, 14 figures, 2 tables

Journal ref: Mol. Syst. Des. Eng., 2018, Advance Article

arXiv:1701.06649 [pdf, other]

Constant Size Molecular Descriptors For Use With Machine Learning

Authors: Christopher R. Collins, Geoffrey J. Gordon, O. Anatole von Lilienfeld, David J. Yaron

Abstract: A set of molecular descriptors whose length is independent of molecular size is developed for machine learning models that target thermodynamic and electronic properties of molecules. These features are evaluated by monitoring performance of kernel ridge regression models on well-studied data sets of small organic molecules. The features include connectivity counts, which require only the bonding pattern of the molecule, and encoded distances, which summarize distances between both bonded and non-bonded atoms and so require the full molecular geometry. In addition to having constant size, these features summarize information regarding the local environment of atoms and bonds, such that models can take advantage of similarities resulting from the presence of similar chemical fragments across molecules. Combining these two types of features leads to models whose performance is comparable to or better than the current state of the art. The features introduced here have the advantage of leading to models that may be trained on smaller molecules and then used successfully on larger molecules.

Submitted 23 January, 2017; originally announced January 2017.

Comments: 18 pages, 5 figures
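
Illustrative sketch (not from the paper): the abstract above describes connectivity counts that need only the bonding pattern and yield a feature vector whose length does not depend on molecular size. A minimal Python version of that idea counts bonds by element pair over a fixed vocabulary. The vocabulary `BOND_TYPES`, the function name, and the bond-list input format are assumptions for illustration; the paper's actual descriptors may be richer.

```python
from collections import Counter

# Fixed vocabulary of element pairs (alphabetically ordered within each pair).
# Its length sets the feature length, independent of molecule size.
BOND_TYPES = [("C", "C"), ("C", "H"), ("C", "O"), ("H", "O")]


def connectivity_counts(elements, bonds):
    """Count bonds of each type in BOND_TYPES.

    elements: list of element symbols, one per atom
    bonds:    list of (i, j) atom-index pairs
    Bonds whose type is not in BOND_TYPES are ignored in this sketch.
    """
    counts = Counter(
        tuple(sorted((elements[i], elements[j]))) for i, j in bonds
    )
    return [counts[bond_type] for bond_type in BOND_TYPES]


# Methane: one carbon bonded to four hydrogens.
methane = connectivity_counts(
    ["C", "H", "H", "H", "H"],
    [(0, 1), (0, 2), (0, 3), (0, 4)],
)

# Ethanol: CH3-CH2-OH.
ethanol = connectivity_counts(
    ["C", "C", "O", "H", "H", "H", "H", "H", "H"],
    [(0, 1), (1, 2), (0, 3), (0, 4), (0, 5), (1, 6), (1, 7), (2, 8)],
)

if __name__ == "__main__":
    print(methane)  # [0, 4, 0, 0]
    print(ethanol)  # [1, 5, 1, 1]
```

Both molecules map to vectors of length four, so a single regression model can accept either one without padding or truncation.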

arXiv:1503.07852 [pdf, other]

Embedding parameters in ab initio theory to develop approximations based on molecular similarity

Authors: Matteus Tanha, Haichen Li, Shiva Kaul, Alexander Cappiello, Geoffrey J. Gordon, David J. Yaron

Abstract: A means to take advantage of molecular similarity to lower the computational cost of electronic structure theory is explored, in which parameters are embedded into a low-cost, low-level (LL) ab initio model and adjusted to obtain agreement with results from a higher-level (HL) ab initio model. A parametrized LL (pLL) model is created by multiplying selected matrix elements of the Hamiltonian operators by scaling factors that depend on element types. Various schemes for applying the scaling factors are compared, along with the impact of making the scaling factors linear functions of variables related to bond lengths, atomic charges, and bond orders. The models are trained on ethane and ethylene, substituted with -NH2, -OH and -F, and tested on substituted propane, propylene and t-butane. Training and test datasets are created by distorting the molecular geometries and applying uniform electric fields. The fitted properties include changes in total energy arising from geometric distortions or applied fields, and frontier orbital energies. The impacts of including additional training data, such as decomposition of the energy by operator or interaction of the electron density with external charges, are also explored. The best-performing model forms reduce the root mean square (RMS) difference between the HL and LL energy predictions by over 85% on the training data and over 75% on the test data. The argument is made that this approach has the potential to provide a flexible and systematically-improvable means to take advantage of molecular similarity in quantum chemistry.

Submitted 26 March, 2015; originally announced March 2015.

Comments: Main text: 16 pages, 6 figures, 6 tables; Supporting information: 5 pages, 9 tables

arXiv:1311.3440 [pdf]

Embedding parameters in ab initio theory to develop well-controlled approximations based on molecular similarity

Authors: Matteus Tanha, Shiva Kaul, Alex Cappiello, Geoffrey J. Gordon, David J. Yaron

Abstract: A means to take advantage of molecular similarity to lower the computational cost of electronic structure theory is proposed, in which parameters are embedded into a low-cost, low-level (LL) ab initio theory and adjusted to obtain agreement with a higher level (HL) ab initio theory. This approach is explored by training such a model on data for ethane and testing the resulting model on methane, propane and butane. The electronic distribution of the molecules is varied by placing them in strong electrostatic environments consisting of random charges placed on the corners of a cube. The results find that parameters embedded in HF/STO-3G theory can be adjusted to obtain agreement, to within about 2 kcal/mol, with results of HF/6-31G theory. Obtaining this level of agreement requires the use of parameters that are functions of the bond lengths, atomic charges, and bond orders within the molecules. The argument is made that this approach provides a well-controlled means to take advantage of molecular similarity in quantum chemistry.

Submitted 14 November, 2013; originally announced November 2013.

Showing 1–6 of 6 results for author: Yaron, D J