
Higher-Rank Irreducible Cartesian Tensors for Equivariant Message Passing

Authors: Viktor Zaverkin, Francesco Alesiani, Takashi Maruyama, Federico Errica, Henrik Christiansen, Makoto Takamoto, Nicolas Weber, Mathias Niepert

Abstract: The ability to perform fast and accurate atomistic simulations is crucial for advancing the chemical sciences. By learning from high-quality data, machine-learned interatomic potentials achieve accuracy on par with ab initio and first-principles methods at a fraction of their computational cost. The success of machine-learned interatomic potentials arises from integrating inductive biases such as equivariance to group actions on an atomic system, e.g., equivariance to rotations and reflections. In particular, the field has notably advanced with the emergence of equivariant message-passing architectures. Most of these models represent an atomic system using spherical tensors, tensor products of which require complicated numerical coefficients and can be computationally demanding. This work introduces higher-rank irreducible Cartesian tensors as an alternative to spherical tensors, addressing the above limitations. We integrate irreducible Cartesian tensor products into message-passing neural networks and prove the equivariance of the resulting layers. Through empirical evaluations on various benchmark data sets, we consistently observe on-par or better performance than that of state-of-the-art spherical models.

Submitted 23 May, 2024; originally announced May 2024.

arXiv:2402.01975

Structure-Aware E(3)-Invariant Molecular Conformer Aggregation Networks

Authors: Duy M. H. Nguyen, Nina Lukashina, Tai Nguyen, An T. Le, TrungTin Nguyen, Nhat Ho, Jan Peters, Daniel Sonntag, Viktor Zaverkin, Mathias Niepert

Abstract: A molecule's 2D representation consists of its atoms, their attributes, and the molecule's covalent bonds. A 3D (geometric) representation of a molecule is called a conformer and consists of its atom types and Cartesian coordinates. Every conformer has a potential energy, and the lower this energy, the more likely it occurs in nature. Most existing machine learning methods for molecular property prediction consider either 2D molecular graphs or 3D conformer structure representations in isolation. Inspired by recent work on using ensembles of conformers in conjunction with 2D graph representations, we propose E(3)-invariant molecular conformer aggregation networks. The method integrates a molecule's 2D representation with that of multiple of its conformers. Contrary to prior work, we propose a novel 2D-3D aggregation mechanism based on a differentiable solver for the Fused Gromov-Wasserstein Barycenter problem and the use of an efficient conformer generation method based on distance geometry. We show that the proposed aggregation mechanism is E(3) invariant and propose an efficient GPU implementation. Moreover, we demonstrate that the aggregation mechanism helps to significantly outperform state-of-the-art molecule property prediction methods on established datasets.

Submitted 10 June, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

Comments: Accepted at ICML 2024

arXiv:2312.16560

Adaptive Message Passing: A General Framework to Mitigate Oversmoothing, Oversquashing, and Underreaching

Authors: Federico Errica, Henrik Christiansen, Viktor Zaverkin, Takashi Maruyama, Mathias Niepert, Francesco Alesiani

Abstract: Long-range interactions are essential for the correct description of complex systems in many scientific fields. The price to pay for including them in the calculations, however, is a dramatic increase in the overall computational costs. Recently, deep graph networks have been employed as efficient, data-driven surrogate models for predicting properties of complex systems represented as graphs. These models rely on a local and iterative message passing strategy that should, in principle, capture long-range information without explicitly modeling the corresponding interactions. In practice, most deep graph networks cannot really model long-range dependencies due to the intrinsic limitations of (synchronous) message passing, namely oversmoothing, oversquashing, and underreaching. This work proposes a general framework that learns to mitigate these limitations: within a variational inference framework, we endow message passing architectures with the ability to freely adapt their depth and filter messages along the way. With theoretical and empirical arguments, we show that this simple strategy better captures long-range interactions, by surpassing the state of the art on five node and graph prediction datasets suited for this problem. Our approach consistently improves the performances of the baselines tested on these tasks. We complement the exposition with qualitative analyses and ablations to get a deeper understanding of the framework's inner workings.

Submitted 20 March, 2024; v1 submitted 27 December, 2023; originally announced December 2023.

arXiv:2312.01416

Uncertainty-biased molecular dynamics for learning uniformly accurate interatomic potentials

Authors: Viktor Zaverkin, David Holzmüller, Henrik Christiansen, Federico Errica, Francesco Alesiani, Makoto Takamoto, Mathias Niepert, Johannes Kästner

Abstract: Efficiently creating a concise but comprehensive data set for training machine-learned interatomic potentials (MLIPs) is an under-explored problem. Active learning (AL), which uses either biased or unbiased molecular dynamics (MD) simulations to generate candidate pools, aims to address this objective. Existing biased and unbiased MD simulations, however, are prone to miss either rare events or extrapolative regions -- areas of the configurational space where unreliable predictions are made. Simultaneously exploring both regions is necessary for developing uniformly accurate MLIPs. In this work, we demonstrate that MD simulations, when biased by the MLIP's energy uncertainty, effectively capture extrapolative regions and rare events without the need to know a priori the system's transition temperatures and pressures. Exploiting automatic differentiation, we enhance bias-forces-driven MD simulations by introducing the concept of bias stress. We also employ calibrated ensemble-free uncertainties derived from sketched gradient features to yield MLIPs with similar or better accuracy than ensemble-based uncertainty methods at a lower computational cost. We use the proposed uncertainty-driven AL approach to develop MLIPs for two benchmark systems: alanine dipeptide and MIL-53(Al). Compared to MLIPs trained with conventional MD simulations, MLIPs trained with the proposed data-generation method more accurately represent the relevant configurational space for both atomic systems.

Submitted 3 December, 2023; originally announced December 2023.

arXiv:2312.01415

doi 10.1021/acs.jctc.1c00853

Thermally Averaged Magnetic Anisotropy Tensors via Machine Learning Based on Gaussian Moments

Authors: Viktor Zaverkin, Julia Netz, Fabian Zills, Andreas Köhn, Johannes Kästner

Abstract: We propose a machine learning method to model molecular tensorial quantities, namely the magnetic anisotropy tensor, based on the Gaussian-moment neural-network approach. We demonstrate that the proposed methodology can achieve an accuracy of 0.3--0.4 cm$^{-1}$ and has excellent generalization capability for out-of-sample configurations. Moreover, in combination with machine-learned interatomic potential energies based on Gaussian moments, our approach can be applied to study the dynamic behavior of magnetic anisotropy tensors and provide a unique insight into spin-phonon relaxation.

Submitted 3 December, 2023; originally announced December 2023.

Journal ref: J. Chem. Theory Comput. 2022, 18, 1, 1--12

arXiv:2312.01414

doi 10.1063/5.0078983

Predicting Properties of Periodic Systems from Cluster Data: A Case Study of Liquid Water

Authors: Viktor Zaverkin, David Holzmüller, Robin Schuldt, Johannes Kästner

Abstract: The accuracy of the training data limits the accuracy of bulk properties from machine-learned potentials. For example, hybrid functionals or wave-function-based quantum chemical methods are readily available for cluster data but effectively out-of-scope for periodic structures. We show that local, atom-centred descriptors for machine-learned potentials enable the prediction of bulk properties from cluster model training data, agreeing reasonably well with predictions from bulk training data. We demonstrate such transferability by studying structural and dynamical properties of bulk liquid water with density functional theory and have found an excellent agreement with experimental as well as theoretical counterparts.

Submitted 3 December, 2023; originally announced December 2023.

Journal ref: J. Chem. Phys. 156, 114103 (2022)

arXiv:2303.03059

doi 10.1051/0004-6361/202346073

Reaction dynamics on amorphous solid water surfaces using interatomic machine learned potentials. Microscopic energy partition revealed from the P + H -> PH reaction

Authors: Germán Molpeceres, Viktor Zaverkin, Kenji Furuya, Yuri Aikawa, Johannes Kästner

Abstract: Energy redistribution after a chemical reaction is one of the few mechanisms to explain the diffusion and desorption of molecules which require more energy than the thermal energy available in quiescent molecular clouds (10 K). This energy distribution can be important in phosphorus hydrides, elusive yet fundamental molecules for interstellar prebiotic chemistry. We studied the reaction dynamics of the P + H -> PH reaction on amorphous solid water, a reaction of astrophysical interest, using ab-initio molecular dynamics with atomic forces evaluated by a neural network interatomic potential. We found that the exact nature of the initial phosphorus binding sites is less relevant for the energy dissipation process because the nascent PH molecule rapidly migrates to sites with higher binding energy after the reaction. Non-thermal diffusion and desorption-after-reaction were observed and occurred early in the dynamics, essentially decoupled from the dissipation of the chemical reaction energy. From an extensive sampling of reactions on sites, we constrained the average dissipated reaction energy within the simulation time (50 ps) to be between 50 and 70%. Most importantly, the fraction of translational energy acquired by the formed molecule was found to be mostly between 1 and 5%. Including these values, specifically for the test cases of 2% and 5% of translational energy conversion, in astrochemical models, reveals very low gas-phase abundances of PH$_{x}$ molecules and reflects that considering binding energy distributions is paramount for correctly merging microscopic and macroscopic modelling of non-thermal surface astrochemical processes. Finally, we found that PD molecules dissipate more of the reaction energy. This effect can be relevant for the deuterium fractionation and preferential distillation of molecules in the interstellar medium.

Submitted 6 March, 2023; originally announced March 2023.

Comments: Accepted for publication in Astronomy and Astrophysics

Journal ref: A&A 673, A51 (2023)

arXiv:2212.03916

doi 10.1039/D2CP05793J

Transfer learning for chemically accurate interatomic neural network potentials

Authors: Viktor Zaverkin, David Holzmüller, Luca Bonfirraro, Johannes Kästner

Abstract: Developing machine learning-based interatomic potentials from ab-initio electronic structure methods remains a challenging task for computational chemistry and materials science. This work studies the capability of transfer learning, in particular discriminative fine-tuning, for efficiently generating chemically accurate interatomic neural network potentials on organic molecules from the MD17 and ANI data sets. We show that pre-training the network parameters on data obtained from density functional calculations considerably improves the sample efficiency of models trained on more accurate ab-initio data. Additionally, we show that fine-tuning with energy labels alone can suffice to obtain accurate atomic forces and run large-scale atomistic simulations, provided a well-designed fine-tuning data set. We also investigate possible limitations of transfer learning, especially regarding the design and size of the pre-training and fine-tuning data sets. Finally, we provide GM-NN potentials pre-trained and fine-tuned on the ANI-1x and ANI-1ccx data sets, which can easily be fine-tuned on and applied to organic molecules.

Submitted 28 January, 2023; v1 submitted 7 December, 2022; originally announced December 2022.

arXiv:2203.09410

A Framework and Benchmark for Deep Batch Active Learning for Regression

Authors: David Holzmüller, Viktor Zaverkin, Johannes Kästner, Ingo Steinwart

Abstract: The acquisition of labels for supervised learning can be expensive. To improve the sample efficiency of neural network regression, we study active learning methods that adaptively select batches of unlabeled data for labeling. We present a framework for constructing such methods out of (network-dependent) base kernels, kernel transformations, and selection methods. Our framework encompasses many existing Bayesian methods based on Gaussian process approximations of neural networks as well as non-Bayesian methods. Additionally, we propose to replace the commonly used last-layer features with sketched finite-width neural tangent kernels and to combine them with a novel clustering method. To evaluate different methods, we introduce an open-source benchmark consisting of 15 large tabular regression data sets. Our proposed method outperforms the state-of-the-art on our benchmark, scales to large data sets, and works out-of-the-box without adjusting the network architecture or training code. We provide open-source code that includes efficient implementations of all kernels, kernel transformations, and selection methods, and can be used for reproducing our results.

Submitted 1 August, 2023; v1 submitted 17 March, 2022; originally announced March 2022.

Comments: Published at the Journal of Machine Learning Research (JMLR). Changes in v4: Improvements in writing and other minor changes. Accompanying code can be found at https://github.com/dholzmueller/bmdal_reg

Journal ref: Journal of Machine Learning Research, 24(164):1-81, 2023

arXiv:2112.05412

doi 10.1093/mnras/stab3631

Neural-Network Assisted Study of Nitrogen Atom Dynamics on Amorphous Solid Water -- II. Diffusion

Authors: Viktor Zaverkin, Germán Molpeceres, Johannes Kästner

Abstract: The diffusion of atoms and radicals on interstellar dust grains is a fundamental ingredient for predicting accurate molecular abundances in astronomical environments. Quantitative values of diffusivity and diffusion barriers usually rely heavily on empirical rules. In this paper, we compute the diffusion coefficients of adsorbed nitrogen atoms by combining machine-learned interatomic potentials, metadynamics, and kinetic Monte Carlo simulations. With this approach, we obtain a diffusion coefficient of nitrogen atoms on the surface of amorphous solid water of merely $(3.5 \pm 1.1) \times 10^{-34}$ cm$^2$ s$^{-1}$ at 10 K for a bare ice surface. Thus, we find that nitrogen, as a paradigmatic case for light and weakly bound adsorbates, is unable to diffuse on bare amorphous solid water at 10 K. Surface coverage has a strong effect on the diffusion coefficient by modulating its value over 9--12 orders of magnitude at 10 K and enables diffusion for specific conditions. In addition, we have found that atom tunneling has a negligible effect. Average diffusion barriers of the potential energy surface (2.56 kJ mol$^{-1}$) differ strongly from the effective diffusion barrier obtained from the diffusion coefficient for a bare surface (6.06 kJ mol$^{-1}$) and are, thus, inappropriate for diffusion modeling. Our findings suggest that the thermal diffusion of N on water ice is a process that is highly dependent on the physical conditions of the ice.

Submitted 10 December, 2021; originally announced December 2021.

Comments: Accepted in MNRAS

arXiv:2109.09569

doi 10.1021/acs.jctc.1c00527

Fast and Sample-Efficient Interatomic Neural Network Potentials for Molecules and Materials Based on Gaussian Moments

Authors: Viktor Zaverkin, David Holzmüller, Ingo Steinwart, Johannes Kästner

Abstract: Artificial neural networks (NNs) are one of the most frequently used machine learning approaches to construct interatomic potentials and enable efficient large-scale atomistic simulations with almost ab initio accuracy. However, the simultaneous training of NNs on energies and forces, which are a prerequisite for, e.g., molecular dynamics simulations, can be demanding. In this work, we present an improved NN architecture based on the previous GM-NN model [V. Zaverkin and J. Kästner, J. Chem. Theory Comput. 16, 5410-5421 (2020)], which shows an improved prediction accuracy and considerably reduced training times. Moreover, we extend the applicability of Gaussian moment-based interatomic potentials to periodic systems and demonstrate the overall excellent transferability and robustness of the respective models. The fast training by the improved methodology is a pre-requisite for training-heavy workflows such as active learning or learning-on-the-fly.

Submitted 20 September, 2021; originally announced September 2021.

Comments: Manuscript accepted for publication in J. Chem. Theory Comput.; Code published at https://gitlab.com/zaverkin_v/gmnn

arXiv:2109.07421

doi 10.1021/acs.jctc.0c00347

Gaussian Moments as Physically Inspired Molecular Descriptors for Accurate and Scalable Machine Learning Potentials

Authors: Viktor Zaverkin, Johannes Kästner

Abstract: Machine learning techniques allow a direct mapping of atomic positions and nuclear charges to the potential energy surface with almost ab-initio accuracy and the computational efficiency of empirical potentials. In this work we propose a machine learning method for constructing high-dimensional potential energy surfaces based on feed-forward neural networks. As input to the neural network we propose an extendable invariant local molecular descriptor constructed from geometric moments. Their formulation via pairwise distance vectors and tensor contractions allows a very efficient implementation on graphical processing units (GPUs). The atomic species is encoded in the molecular descriptor, which allows the restriction to one neural network for the training of all atomic species in the data set. We demonstrate that the accuracy of the developed approach in representing both chemical and configurational spaces is comparable to the one of several established machine learning models. Due to its high accuracy and efficiency, the proposed machine-learned potentials can be used for any further tasks, for example the optimization of molecular geometries, the calculation of rate constants or molecular dynamics.

Submitted 15 September, 2021; originally announced September 2021.

Journal ref: J. Chem. Theory Comput. 2020, 16, 8, 5410-5421

arXiv:2009.09994

doi 10.1093/mnras/staa2891

Neural-Network Assisted Study of Nitrogen Atom Dynamics on Amorphous Solid Water. I. Adsorption & Desorption

Authors: Germán Molpeceres, Viktor Zaverkin, Johannes Kästner

Abstract: Dynamics of adsorption and desorption of (4S)-N on amorphous solid water are analyzed using molecular dynamics simulations. The underlying potential energy surface was provided by machine-learned interatomic potentials. Binding energies confirm the latest available theoretical and experimental results. The nitrogen sticking coefficient is close to unity at dust temperatures of 10 K but decreases at higher temperatures. We estimate a desorption time scale of 1 μs at 28 K. The estimated time scale allows chemical processes mediated by diffusion to happen before desorption, even at higher temperatures. We found that the energy dissipation process after a sticking event happens on the picosecond timescale at dust temperatures of 10 K, even for high energies of the incoming adsorbate. Our approach allows the simulation of large systems for reasonable time scales at an affordable computational cost and ab-initio accuracy. Moreover, it is generally applicable for the study of adsorption dynamics of interstellar radicals on dust surfaces.

Submitted 21 September, 2020; originally announced September 2020.

arXiv:1806.05831

doi 10.1051/0004-6361/201833346

Tunnelling dominates the reactions of hydrogen atoms with unsaturated alcohols and aldehydes in the dense medium

Authors: V. Zaverkin, T. Lamberts, M. N. Markmeyer, J. Kästner

Abstract: Hydrogen addition and abstraction reactions play an important role as surface reactions in the buildup of complex organic molecules in the dense interstellar medium. Addition reactions allow unsaturated bonds to be fully hydrogenated, while abstraction reactions recreate radicals that may undergo radical-radical recombination reactions. Previous experimental work has indicated that double and triple C--C bonds are easily hydrogenated, but aldehyde -C=O bonds are not. Here, we investigate a total of 29 reactions of the hydrogen atom with propynal, propargyl alcohol, propenal, allyl alcohol, and propanal by means of quantum chemical methods to quantify the reaction rate constants involved. First of all, our results are in good agreement with and can explain the observed experimental findings. The hydrogen addition to the aldehyde group, either on the C or O side, is indeed slow for all molecules considered. Abstraction of the H atom of the aldehyde group, on the other hand, is among the faster reactions. Furthermore, hydrogen addition to C--C double bonds is generally faster than to triple bonds. In both cases, addition on the terminal carbon atom that is not connected to other functional groups is easiest. Finally, we wish to stress that it is not possible to predict rate constants based solely on the type of reaction: the specific functional groups attached to a backbone play a crucial role and can lead to a spread of several orders of magnitude in the rate constant.

Submitted 15 June, 2018; originally announced June 2018.

Comments: Accepted for publication in A&A

Showing 1–14 of 14 results for author: Zaverkin, V