-
EquiReact: An equivariant neural network for chemical reactions
Authors:
Puck van Gerwen,
Ksenia R. Briling,
Charlotte Bunne,
Vignesh Ram Somnath,
Ruben Laplaza,
Andreas Krause,
Clemence Corminboeuf
Abstract:
Equivariant neural networks have considerably improved the accuracy and data-efficiency of predictions of molecular properties. Building on this success, we introduce EquiReact, an equivariant neural network to infer properties of chemical reactions, built from three-dimensional structures of reactants and products. We illustrate its competitive performance on the prediction of activation barriers on the GDB7-22-TS, Cyclo-23-TS and Proparg-21-TS datasets with different regimes according to the inclusion of atom-mapping information. We show that, compared to state-of-the-art models for reaction property prediction, EquiReact offers: (i) a flexible model with reduced sensitivity between atom-mapping regimes, (ii) better extrapolation capabilities to unseen chemistries, (iii) impressive prediction errors for datasets exhibiting subtle variations in three-dimensional geometries of reactants/products, (iv) reduced sensitivity to geometry quality and (v) excellent data efficiency.
Submitted 13 December, 2023;
originally announced December 2023.
-
SPA$^\mathrm{H}$M(a,b): encoding the density information from guess Hamiltonian in quantum machine learning representations
Authors:
Ksenia R. Briling,
Yannick Calvino Alonso,
Alberto Fabrizio,
Clemence Corminboeuf
Abstract:
Recently, we introduced a class of molecular representations for kernel-based regression methods -- the spectrum of approximated Hamiltonian matrices (SPA$^\mathrm{H}$M) -- that takes advantage of lightweight one-electron Hamiltonians traditionally used as an SCF initial guess. The original SPA$^\mathrm{H}$M variant is built from occupied-orbital energies (i.e., eigenvalues) and naturally contains all the information about nuclear charges, atomic positions, and symmetry requirements. Its advantages were demonstrated on datasets featuring a wide variation of charge and spin, for which traditional structure-based representations commonly fail. SPA$^\mathrm{H}$M(a,b), as introduced here, expand the eigenvalue SPA$^\mathrm{H}$M into local and transferable representations. They rely upon one-electron density matrices to build fingerprints from atomic and bond density overlap contributions inspired by preceding state-of-the-art representations. The performance and efficiency of SPA$^\mathrm{H}$M(a,b) are assessed on the predictions for datasets of prototypical organic molecules (QM7) of different charges and azoheteroarene dyes in an excited state. Overall, both SPA$^\mathrm{H}$M(a) and SPA$^\mathrm{H}$M(b) outperform state-of-the-art representations on difficult prediction tasks such as the atomic properties of charged open-shell species and of $π$-conjugated systems.
Submitted 20 February, 2024; v1 submitted 6 September, 2023;
originally announced September 2023.
-
SPA$^\mathrm{H}$M: the Spectrum of Approximated Hamiltonian Matrices representations
Authors:
Alberto Fabrizio,
Ksenia R. Briling,
Clemence Corminboeuf
Abstract:
Physics-inspired molecular representations are the cornerstone of similarity-based learning applied to solve chemical problems. Despite their conceptual and mathematical diversity, this class of descriptors shares a common underlying philosophy: they all rely on the molecular information that determines the form of the electronic Schrödinger equation. Existing representations take the most varied forms, from non-linear functions of atom types and positions to atom densities and potential, up to complex quantum chemical objects directly injected into the ML architecture. In this work, we present the Spectrum of Approximated Hamiltonian Matrices (SPA$^\mathrm{H}$M) as an alternative pathway to construct quantum machine learning representations by leveraging the foundation of the electronic Schrödinger equation itself: the electronic Hamiltonian. As the Hamiltonian encodes all quantum chemical information at once, SPA$^\mathrm{H}$M representations not only distinguish different molecules and conformations, but also different spin, charge, and electronic states. As a proof of concept, we focus here on efficient SPA$^\mathrm{H}$M representations built from the eigenvalues of a hierarchy of well-established and readily evaluated "guess" Hamiltonians. These SPA$^\mathrm{H}$M representations are particularly compact and efficient for kernel evaluation and their complexity is independent of the number of different atom types in the database.
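The idea of using eigenvalues of a guess Hamiltonian as a fixed-length vector can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the tridiagonal `toy_chain_hamiltonian` is a hypothetical Hückel-like stand-in and not one of the guess Hamiltonians used in the paper, and the function names and the zero-padding convention are not taken from the authors' code.

```python
import numpy as np


def eigenvalue_representation(hamiltonian, n_occupied, length):
    """Return the lowest `n_occupied` eigenvalues, zero-padded to `length`.

    Illustrative only: the padding convention is an assumption.
    """
    h = np.asarray(hamiltonian, dtype=float)
    if not np.allclose(h, h.T):
        raise ValueError("expected a symmetric matrix")
    if n_occupied > length:
        raise ValueError("representation length is too short")
    eigenvalues = np.linalg.eigvalsh(h)  # returned in ascending order
    representation = np.zeros(length)
    representation[:n_occupied] = eigenvalues[:n_occupied]
    return representation


def toy_chain_hamiltonian(n_sites, alpha=-1.0, beta=-0.5):
    """Hypothetical Hückel-like matrix: alpha on the diagonal, beta between neighbours."""
    off_diagonal = np.full(n_sites - 1, beta)
    return (
        np.diag(np.full(n_sites, alpha))
        + np.diag(off_diagonal, 1)
        + np.diag(off_diagonal, -1)
    )


# Two "molecules" of different size map to vectors of the same length.
rep_small = eigenvalue_representation(toy_chain_hamiltonian(4), n_occupied=2, length=4)
rep_large = eigenvalue_representation(toy_chain_hamiltonian(6), n_occupied=3, length=4)

# Relabelling the sites (a permutation of rows and columns) leaves the spectrum unchanged.
permutation = np.array([2, 0, 3, 1])
h_permuted = toy_chain_hamiltonian(4)[np.ix_(permutation, permutation)]
rep_permuted = eigenvalue_representation(h_permuted, n_occupied=2, length=4)
```

In this sketch the vector length depends only on the number of occupied orbitals kept, not on how many element types appear, which mirrors the compactness argument made in the abstract.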
Submitted 5 April, 2022; v1 submitted 25 October, 2021;
originally announced October 2021.
-
Impact of quantum-chemical metrics on the machine learning prediction of electron density
Authors:
Ksenia R. Briling,
Alberto Fabrizio,
Clemence Corminboeuf
Abstract:
Machine learning (ML) algorithms have undergone an explosive development impacting every aspect of computational chemistry. To obtain reliable predictions, one needs to maintain the proper balance between the black-box nature of ML frameworks and the physics of the target properties. One of the most appealing quantum-chemical properties for regression models is the electron density, and some of us recently proposed a transferable and scalable model based on the decomposition of the density onto an atom-centered basis set. The decomposition, as well as the training of the model, is at its core a minimization of some loss function, which can be arbitrarily chosen and may lead to results of different quality. Although well studied in the context of density fitting (DF), the impact of the metric on the performance of ML models has not been analyzed yet. In this work, we compare predictions obtained using the overlap and the Coulomb-repulsion metrics for both decomposition and training. As expected, the Coulomb metric used as both the DF and ML loss functions leads to the best results for the electrostatic potential and dipole moments. The origin of this difference lies in the fact that the model is not constrained to predict densities that integrate to the exact number of electrons $N$. Since an \textit{a posteriori} correction for the number of electrons decreases the errors, we propose a modification of the model where $N$ is included directly into the kernel function, which lowers the errors on the test and out-of-sample sets.
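The role of the metric in the decomposition can be sketched as follows. For an expansion of the density in basis functions with coefficients `c`, minimizing a quadratic loss under a metric matrix `M` (overlap integrals between basis functions for the overlap metric, Coulomb-repulsion integrals for the Coulomb metric) leads to the linear system `M c = b`, where `b` holds the projections of the reference density on the basis in the same metric. The abstract does not say how the a posteriori electron-number correction is done; the version below is the textbook Lagrange-multiplier form and is an assumption, as are the function names and the random toy matrices.

```python
import numpy as np


def fit_coefficients(metric, projections):
    """Solve metric @ c = projections, the minimizer of the quadratic fitting loss."""
    return np.linalg.solve(metric, projections)


def correct_electron_number(coeffs, metric, basis_integrals, n_electrons):
    """Shift the coefficients so that basis_integrals @ c equals n_electrons.

    The shift is the smallest one in the norm defined by `metric`
    (Lagrange-multiplier solution; an assumed form of the correction).
    """
    direction = np.linalg.solve(metric, basis_integrals)
    deficit = n_electrons - basis_integrals @ coeffs
    return coeffs + direction * deficit / (basis_integrals @ direction)


# Toy data: a random symmetric positive-definite matrix stands in for the metric.
rng = np.random.default_rng(0)
a = rng.normal(size=(5, 5))
metric = a @ a.T + 5.0 * np.eye(5)
projections = rng.normal(size=5)
basis_integrals = rng.uniform(0.5, 1.5, size=5)  # integral of each basis function
n_electrons = 10.0

coeffs = fit_coefficients(metric, projections)
corrected = correct_electron_number(coeffs, metric, basis_integrals, n_electrons)
```

Swapping the overlap matrix for the Coulomb matrix changes only `metric` and `projections`; the solver and the correction step stay the same, which is why the two metrics can be compared on equal footing.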
Submitted 15 November, 2021; v1 submitted 26 April, 2021;
originally announced April 2021.
-
Learning on-top: regressing the on-top pair density for real-space visualization of electron correlation
Authors:
Alberto Fabrizio,
Ksenia R. Briling,
David D. Girardier,
Clemence Corminboeuf
Abstract:
The on-top pair density [$Π(\mathrm{\mathbf{r}})$] is a local quantum-chemical property that reflects the probability of two electrons of any spin occupying the same position in space. Being the simplest quantity related to the two-particle density matrix, the on-top pair density is a powerful indicator of electron correlation effects, and as such, it has been extensively used to combine density functional theory and multireference wavefunction theory. The widespread application of $Π(\mathrm{\mathbf{r}})$ is currently hindered by the need for post-Hartree--Fock or multireference computations for its accurate evaluation. In this work, we propose the construction of a machine learning model capable of predicting the CASSCF-quality on-top pair density of a molecule only from its structure and composition. Our model, trained on the GDB11-AD-3165 database, is able to predict with minimal error the on-top pair density of organic molecules, bypassing completely the need for $\textit{ab initio}$ computations. The accuracy of the regression is demonstrated using the on-top ratio as a visual metric of electron correlation effects and bond-breaking in real-space. In addition, we report the construction of a specialized basis set, built to fit the on-top pair density in a single atom-centered expansion. This basis, the cornerstone of the regression, could potentially also be used in the same spirit as the resolution-of-the-identity approximation for the electron density.
Submitted 30 November, 2020; v1 submitted 14 October, 2020;
originally announced October 2020.
-
Atomic effective potentials for starting molecular electronic structure calculations
Authors:
Dimitri N. Laikov,
Ksenia R. Briling
Abstract:
Atomic effective one-electron potentials in a compact analytic form in terms of a few Gaussian charge distributions are developed, for Hydrogen through Nobelium, for starting molecular electronic structure calculations by a simple diagonalization. For each element, all terms but one are optimized in an isolated-atom Hartree--Fock calculation, and the last one is parametrized on a set of molecules. This one-parameter-per-atom model gives a good starting guess for typical molecules and may be of interest even on its own.
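How a one-electron effective Hamiltonian yields a starting guess "by a simple diagonalization" can be sketched generically. The code below solves the generalized eigenproblem in a non-orthogonal basis via symmetric (Löwdin) orthogonalization and fills the lowest orbitals. It assumes a closed-shell system with doubly occupied orbitals, and the 2x2 matrices are made-up numbers rather than the paper's parametrized potentials; the function name is hypothetical.

```python
import numpy as np


def guess_density_matrix(h_eff, overlap, n_occupied):
    """Diagonalize h_eff in a non-orthogonal basis and occupy the lowest orbitals.

    Returns the orbital energies and the closed-shell density matrix
    D = 2 * C_occ @ C_occ.T (double occupation is an assumption of this sketch).
    """
    # Symmetric orthogonalization: S^(-1/2) from the eigendecomposition of S.
    s_values, s_vectors = np.linalg.eigh(overlap)
    s_inv_sqrt = (s_vectors / np.sqrt(s_values)) @ s_vectors.T

    # Ordinary symmetric eigenproblem in the orthogonalized basis.
    h_orthogonal = s_inv_sqrt @ h_eff @ s_inv_sqrt
    energies, c_orthogonal = np.linalg.eigh(h_orthogonal)

    # Back-transform the orbitals and fill the lowest ones.
    coefficients = s_inv_sqrt @ c_orthogonal
    c_occupied = coefficients[:, :n_occupied]
    density = 2.0 * c_occupied @ c_occupied.T
    return energies, density


# Made-up two-function example: one doubly occupied orbital.
overlap = np.array([[1.0, 0.4], [0.4, 1.0]])
h_eff = np.array([[-1.0, -0.6], [-0.6, -0.8]])
energies, density = guess_density_matrix(h_eff, overlap, n_occupied=1)
```

The resulting density matrix satisfies `trace(density @ overlap) == 2 * n_occupied`, so it carries the correct electron count and can seed a self-consistent-field iteration.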
Submitted 6 January, 2020; v1 submitted 8 February, 2019;
originally announced February 2019.