-
Adaptive energy reference for machine-learning models of the electronic density of states
Authors:
Wei Bin How,
Sanggyu Chong,
Federico Grasselli,
Kevin K. Huguenin-Dumittan,
Michele Ceriotti
Abstract:
The electronic density of states (DOS) provides information regarding the distribution of electronic states in a material, and can be used to approximate its optical and electronic properties and therefore guide computational material design. Given its usefulness and relative simplicity, it has been one of the first electronic properties used as target for machine-learning approaches going beyond…
▽ More
The electronic density of states (DOS) provides information regarding the distribution of electronic states in a material, and can be used to approximate its optical and electronic properties and therefore guide computational material design. Given its usefulness and relative simplicity, it has been one of the first electronic properties used as target for machine-learning approaches going beyond interatomic potentials. A subtle but important point, well-appreciated in the condensed matter community but usually overlooked in the construction of data-driven models, is that for bulk configurations the absolute energy reference of single-particle energy levels is ill-defined. Only energy differences matter, and quantities derived from the DOS are typically independent on the absolute alignment. We introduce an adaptive scheme that optimizes the energy reference of each structure as part of training, and show that it consistently improves the quality of ML models compared to traditional choices of energy reference, for different classes of materials and different model architectures. On a practical level, we trace the improved performance to the ability of this self-aligning scheme to match the most prominent features in the DOS. More broadly, we believe that this work highlights the importance of incorporating insights into the nature of the physical target into the definition of the architecture and of the appropriate figures of merit for machine-learning models, that translate in better transferability and overall performance.
△ Less
Submitted 2 July, 2024; v1 submitted 1 July, 2024;
originally announced July 2024.
-
Expanding Density-Correlation Machine Learning Representations for Anisotropic Coarse-Grained Particles
Authors:
Arthur Y. Lin,
Kevin K. Huguenin-Dumittan,
Yong-Cheol Cho,
Jigyasa Nigam,
Rose K. Cersonsky
Abstract:
Physics-based, atom-centered machine learning (ML) representations have been instrumental to the effective integration of ML within the atomistic simulation community. Many of these representations build off the idea of atoms as having spherical, or isotropic, interactions. In many communities, there is often a need to represent groups of atoms, either to increase the computational efficiency of s…
▽ More
Physics-based, atom-centered machine learning (ML) representations have been instrumental to the effective integration of ML within the atomistic simulation community. Many of these representations build off the idea of atoms as having spherical, or isotropic, interactions. In many communities, there is often a need to represent groups of atoms, either to increase the computational efficiency of simulation via coarse-graining or to understand molecular influences on system behavior. In such cases, atom-centered representations will have limited utility, as groups of atoms may not be well-approximated as spheres. In this work, we extend the popular Smooth Overlap of Atomic Positions (SOAP) ML representation for systems consisting of non-spherical anisotropic particles or clusters of atoms. We show the power of this anisotropic extension of SOAP, which we deem \AniSOAP, in accurately characterizing liquid crystal systems and predicting the energetics of Gay-Berne ellipsoids and coarse-grained benzene crystals. With our study of these prototypical anisotropic systems, we derive fundamental insights into how molecular shape influences mesoscale behavior and explain how to reincorporate important atom-atom interactions typically not captured by coarse-grained models. Moving forward, we propose \AniSOAP as a flexible, unified framework for coarse-graining in complex, multiscale simulation.
△ Less
Submitted 27 March, 2024;
originally announced March 2024.
-
Physics-inspired Equivariant Descriptors of Non-bonded Interactions
Authors:
Kevin K. Huguenin-Dumittan,
Philip Loche,
Ni Haoran,
Michele Ceriotti
Abstract:
One essential ingredient in many machine learning (ML) based methods for atomistic modeling of materials and molecules is the use of locality. While allowing better system-size scaling, this systematically neglects long-range (LR) effects, such as electrostatics or dispersion interaction. We present an extension of the long distance equivariant (LODE) framework that can handle diverse LR interacti…
▽ More
One essential ingredient in many machine learning (ML) based methods for atomistic modeling of materials and molecules is the use of locality. While allowing better system-size scaling, this systematically neglects long-range (LR) effects, such as electrostatics or dispersion interaction. We present an extension of the long distance equivariant (LODE) framework that can handle diverse LR interactions in a consistent way, and seamlessly integrates with preexisting methods by building new sets of atom centered features. We provide a direct physical interpretation of these using the multipole expansion, which allows for simpler and more efficient implementations. The framework is applied to simple toy systems as proof of concept, and a heterogeneous set of molecular dimers to push the method to its limits. By generalizing LODE to arbitrary asymptotic behaviors, we provide a coherent approach to treat arbitrary two- and many-body non-bonded interactions in the data-driven modeling of matter.
△ Less
Submitted 3 October, 2023; v1 submitted 25 August, 2023;
originally announced August 2023.
-
Completeness of Atomic Structure Representations
Authors:
Jigyasa Nigam,
Sergey N. Pozdnyakov,
Kevin K. Huguenin-Dumittan,
Michele Ceriotti
Abstract:
In this paper, we address the challenge of obtaining a comprehensive and symmetric representation of point particle groups, such as atoms in a molecule, which is crucial in physics and theoretical chemistry. The problem has become even more important with the widespread adoption of machine-learning techniques in science, as it underpins the capacity of models to accurately reproduce physical relat…
▽ More
In this paper, we address the challenge of obtaining a comprehensive and symmetric representation of point particle groups, such as atoms in a molecule, which is crucial in physics and theoretical chemistry. The problem has become even more important with the widespread adoption of machine-learning techniques in science, as it underpins the capacity of models to accurately reproduce physical relationships while being consistent with fundamental symmetries and conservation laws. However, some of the descriptors that are commonly used to represent point clouds -- most notably those based on discretized correlations of the neighbor density, that underpin most of the existing ML models of matter at the atomic scale -- are unable to distinguish between special arrangements of particles in three dimensions. This makes it impossible to machine learn their properties. Atom-density correlations are provably complete in the limit in which they simultaneously describe the mutual relationship between all atoms, which is impractical. We present a novel approach to construct descriptors of \emph{finite} correlations based on the relative arrangement of particle triplets, which can be employed to create symmetry-adapted models with universal approximation capabilities, which have the resolution of the neighbor discretization as the sole convergence parameter. Our strategy is demonstrated on a class of atomic arrangements that are specifically built to defy a broad class of conventional symmetric descriptors, showcasing its potential for addressing their limitations.
△ Less
Submitted 30 December, 2023; v1 submitted 28 February, 2023;
originally announced February 2023.
-
A smooth basis for atomistic machine learning
Authors:
Filippo Bigi,
Kevin Huguenin-Dumittan,
Michele Ceriotti,
David E. Manolopoulos
Abstract:
Machine learning frameworks based on correlations of interatomic positions begin with a discretized description of the density of other atoms in the neighbourhood of each atom in the system. Symmetry considerations support the use of spherical harmonics to expand the angular dependence of this density, but there is as yet no clear rationale to choose one radial basis over another. Here we investig…
▽ More
Machine learning frameworks based on correlations of interatomic positions begin with a discretized description of the density of other atoms in the neighbourhood of each atom in the system. Symmetry considerations support the use of spherical harmonics to expand the angular dependence of this density, but there is as yet no clear rationale to choose one radial basis over another. Here we investigate the basis that results from the solution of the Laplacian eigenvalue problem within a sphere around the atom of interest. We show that this generates the smoothest possible basis of a given size within the sphere, and that a tensor product of Laplacian eigenstates also provides the smoothest possible basis for expanding any higher-order correlation of the atomic density within the appropriate hypersphere. We consider several unsupervised metrics of the quality of a basis for a given dataset, and show that the Laplacian eigenstate basis has a performance that is much better than some widely used basis sets and is competitive with data-driven bases that numerically optimize each metric. In supervised machine learning tests, we find that the optimal function smoothness of the Laplacian eigenstates leads to comparable or better performance than can be obtained from a data-driven basis of a similar size that has been optimized to describe the atom-density correlation for the specific dataset. We conclude that the smoothness of the basis functions is a key and hitherto largely overlooked aspect of successful atomic density representations.
△ Less
Submitted 23 May, 2023; v1 submitted 5 September, 2022;
originally announced September 2022.