-
Efficient 3D Molecular Generation with Flow Matching and Scale Optimal Transport
Authors:
Ross Irwin,
Alessandro Tibo,
Jon Paul Janet,
Simon Olsson
Abstract:
Generative models for 3D drug design have gained prominence recently for their potential to design ligands directly within protein pockets. Current approaches, however, often suffer from very slow sampling times or generate molecules with poor chemical validity. Addressing these limitations, we propose Semla, a scalable E(3)-equivariant message passing architecture. We further introduce a molecula…
▽ More
Generative models for 3D drug design have gained prominence recently for their potential to design ligands directly within protein pockets. Current approaches, however, often suffer from very slow sampling times or generate molecules with poor chemical validity. Addressing these limitations, we propose Semla, a scalable E(3)-equivariant message passing architecture. We further introduce a molecular generation model, SemlaFlow, which is trained using flow matching along with scale optimal transport, a novel extension of equivariant optimal transport. Our model produces state-of-the-art results on benchmark datasets with just 100 sampling steps. Crucially, SemlaFlow samples high quality molecules with as few as 20 steps, corresponding to a two order-of-magnitude speed-up compared to state-of-the-art, without sacrificing performance. Furthermore, we highlight limitations of current evaluation methods for 3D generation and propose new benchmark metrics for unconditional molecular generators. Finally, using these new metrics, we compare our model's ability to generate high quality samples against current approaches and further demonstrate SemlaFlow's strong performance.
△ Less
Submitted 25 June, 2024; v1 submitted 11 June, 2024;
originally announced June 2024.
-
Masked Attention is All You Need for Graphs
Authors:
David Buterez,
Jon Paul Janet,
Dino Oglic,
Pietro Lio
Abstract:
Graph neural networks (GNNs) and variations of the message passing algorithm are the predominant means for learning on graphs, largely due to their flexibility, speed, and satisfactory performance. The design of powerful and general purpose GNNs, however, requires significant research efforts and often relies on handcrafted, carefully-chosen message passing operators. Motivated by this, we propose…
▽ More
Graph neural networks (GNNs) and variations of the message passing algorithm are the predominant means for learning on graphs, largely due to their flexibility, speed, and satisfactory performance. The design of powerful and general purpose GNNs, however, requires significant research efforts and often relies on handcrafted, carefully-chosen message passing operators. Motivated by this, we propose a remarkably simple alternative for learning on graphs that relies exclusively on attention. Graphs are represented as node or edge sets and their connectivity is enforced by masking the attention weight matrix, effectively creating custom attention patterns for each graph. Despite its simplicity, masked attention for graphs (MAG) has state-of-the-art performance on long-range tasks and outperforms strong message passing baselines and much more involved attention-based methods on over 55 node and graph-level tasks. We also show significantly better transfer learning capabilities compared to GNNs and comparable or better time and memory scaling. MAG has sub-linear memory scaling in the number of nodes or edges, enabling learning on dense graphs and future-proofing the approach.
△ Less
Submitted 16 February, 2024;
originally announced February 2024.
-
Graph Neural Networks with Adaptive Readouts
Authors:
David Buterez,
Jon Paul Janet,
Steven J. Kiddle,
Dino Oglic,
Pietro Liò
Abstract:
An effective aggregation of node features into a graph-level representation via readout functions is an essential step in numerous learning tasks involving graph neural networks. Typically, readouts are simple and non-adaptive functions designed such that the resulting hypothesis space is permutation invariant. Prior work on deep sets indicates that such readouts might require complex node embeddi…
▽ More
An effective aggregation of node features into a graph-level representation via readout functions is an essential step in numerous learning tasks involving graph neural networks. Typically, readouts are simple and non-adaptive functions designed such that the resulting hypothesis space is permutation invariant. Prior work on deep sets indicates that such readouts might require complex node embeddings that can be difficult to learn via standard neighborhood aggregation schemes. Motivated by this, we investigate the potential of adaptive readouts given by neural networks that do not necessarily give rise to permutation invariant hypothesis spaces. We argue that in some problems such as binding affinity prediction where molecules are typically presented in a canonical form it might be possible to relax the constraints on permutation invariance of the hypothesis space and learn a more effective model of the affinity by employing an adaptive readout function. Our empirical results demonstrate the effectiveness of neural readouts on more than 40 datasets spanning different domains and graph characteristics. Moreover, we observe a consistent improvement over standard readouts (i.e., sum, max, and mean) relative to the number of neighborhood aggregation iterations and different convolutional operators.
△ Less
Submitted 9 November, 2022;
originally announced November 2022.
-
Ligand additivity relationships enable efficient exploration of transition metal chemical space
Authors:
Naveen Arunachalam,
Stefan Gugler,
Michael G. Taylor,
Chenru Duan,
Aditya Nandy,
Jon Paul Janet,
Ralf Meyer,
Jonas Oldenstaedt,
Daniel B. K. Chu,
Heather J. Kulik
Abstract:
To accelerate exploration of chemical space, it is necessary to identify the compounds that will provide the most additional information or value. A large-scale analysis of mononuclear octahedral transition metal complexes deposited in an experimental database confirms an under-representation of lower-symmetry complexes. From a set of around 1000 previously studied Fe(II) complexes, we show that t…
▽ More
To accelerate exploration of chemical space, it is necessary to identify the compounds that will provide the most additional information or value. A large-scale analysis of mononuclear octahedral transition metal complexes deposited in an experimental database confirms an under-representation of lower-symmetry complexes. From a set of around 1000 previously studied Fe(II) complexes, we show that the theoretical space of synthetically accessible complexes formed from the relatively small number of unique ligands is significantly (ca. 816k) larger. For the properties of these complexes, we validate the concept of ligand additivity by inferring heteroleptic properties from a stoichiometric combination of homoleptic complexes. An improved interpolation scheme that incorporates information about cis and trans isomer effects predicts the adiabatic spin-splitting energy to around 2 kcal/mol and the HOMO level to less than 0.2 eV. We demonstrate a multi-stage strategy to discover leads from the 816k Fe(II) complexes within a targeted property region. We carry out a coarse interpolation from homoleptic complexes that we refine over a subspace of ligands based on the likelihood of generating complexes with targeted properties. We validate our approach on 9 new binary and ternary complexes predicted to be in a targeted zone of discovery, suggesting opportunities for efficient transition metal complex discovery.
△ Less
Submitted 12 September, 2022;
originally announced September 2022.
-
Representations and Strategies for Transferable Machine Learning Models in Chemical Discovery
Authors:
Daniel R. Harper,
Aditya Nandy,
Naveen Arunachalam,
Chenru Duan,
Jon Paul Janet,
Heather J. Kulik
Abstract:
Strategies for machine-learning(ML)-accelerated discovery that are general across materials composition spaces are essential, but demonstrations of ML have been primarily limited to narrow composition variations. By addressing the scarcity of data in promising regions of chemical space for challenging targets like open-shell transition-metal complexes, general representations and transferable ML m…
▽ More
Strategies for machine-learning(ML)-accelerated discovery that are general across materials composition spaces are essential, but demonstrations of ML have been primarily limited to narrow composition variations. By addressing the scarcity of data in promising regions of chemical space for challenging targets like open-shell transition-metal complexes, general representations and transferable ML models that leverage known relationships in existing data will accelerate discovery. Over a large set (ca. 1000) of isovalent transition-metal complexes, we quantify evident relationships for different properties (i.e., spin-splitting and ligand dissociation) between rows of the periodic table (i.e., 3d/4d metals and 2p/3p ligands). We demonstrate an extension to graph-based revised autocorrelation (RAC) representation (i.e., eRAC) that incorporates the effective nuclear charge alongside the nuclear charge heuristic that otherwise overestimates dissimilarity of isovalent complexes. To address the common challenge of discovery in a new space where data is limited, we introduce a transfer learning approach in which we seed models trained on a large amount of data from one row of the periodic table with a small number of data points from the additional row. We demonstrate the synergistic value of the eRACs alongside this transfer learning strategy to consistently improve model performance. Analysis of these models highlights how the approach succeeds by reordering the distances between complexes to be more consistent with the periodic table, a property we expect to be broadly useful for other materials domains.
△ Less
Submitted 20 June, 2021;
originally announced June 2021.
-
Recovering the flat-plane condition in electronic structure theory at semi-local DFT cost
Authors:
Akash Bajaj,
Jon Paul Janet,
Heather J. Kulik
Abstract:
The flat plane condition is the union of two exact constraints in electronic structure theory: i) energetic piecewise linearity with fractional electron removal or addition and ii) invariant energetics with change in electron spin in a half filled orbital. Semi-local density functional theory (DFT) fails to recover the flat plane, exhibiting convex fractional charge errors (FCE) and concave fracti…
▽ More
The flat plane condition is the union of two exact constraints in electronic structure theory: i) energetic piecewise linearity with fractional electron removal or addition and ii) invariant energetics with change in electron spin in a half filled orbital. Semi-local density functional theory (DFT) fails to recover the flat plane, exhibiting convex fractional charge errors (FCE) and concave fractional spin errors (FSE) that are related to delocalization and static correlation errors. We previously showed that DFT+U eliminates FCE but now demonstrate that, like other widely employed corrections (i.e., Hartree Fock exchange), it worsens FSE. To find an alternative strategy, we examine the shape of semi-local DFT deviations from the exact flat plane, and we find this shape to be remarkably consistent across ions and molecules. We introduce the jmDFT approach, wherein corrections are constructed from few-parameter, low- order, functional forms that fit the shape of semi-local DFT errors. We select one such physically intuitive form and incorporate it self-consistently to correct semi-local DFT. We demonstrate on model systems that this jmDFT approach represents the first easy-to-implement, no-overhead approach to recovering the flat plane from semi-local DFT.
△ Less
Submitted 6 October, 2017;
originally announced October 2017.
-
Resolving transition metal chemical space: feature selection for machine learning and structure-property relationships
Authors:
Jon Paul Janet,
Heather J. Kulik
Abstract:
Machine learning (ML) of quantum mechanical properties shows promise for accelerating chemical discovery. For transition metal chemistry where accurate calculations are computationally costly and available training data sets are small, the molecular representation becomes a critical ingredient in ML model predictive accuracy. We introduce a series of revised autocorrelation functions (RACs) that e…
▽ More
Machine learning (ML) of quantum mechanical properties shows promise for accelerating chemical discovery. For transition metal chemistry where accurate calculations are computationally costly and available training data sets are small, the molecular representation becomes a critical ingredient in ML model predictive accuracy. We introduce a series of revised autocorrelation functions (RACs) that encode relationships between the heuristic atomic properties (e.g., size, connectivity, and electronegativity) on a molecular graph. We alter the starting point, scope, and nature of the quantities evaluated in standard ACs to make these RACs amenable to inorganic chemistry. On an organic molecule set, we first demonstrate superior standard AC performance to other presently-available topological descriptors for ML model training, with mean unsigned errors (MUEs) for atomization energies on set-aside test molecules as low as 6 kcal/mol. For inorganic chemistry, our RACs yield 1 kcal/mol ML MUEs on set-aside test molecules in spin-state splitting in comparison to 15-20x higher errors from feature sets that encode whole-molecule structural information. Systematic feature selection methods including univariate filtering, recursive feature elimination, and direct optimization (e.g., random forest and LASSO) are compared. Random-forest- or LASSO-selected subsets 4-5x smaller than RAC-155 produce sub- to 1-kcal/mol spin-splitting MUEs, with good transferability to metal-ligand bond length prediction (0.004-5 Å MUE) and redox potential on a smaller data set (0.2-0.3 eV MUE). Evaluation of feature selection results across property sets reveals the relative importance of local, electronic descriptors (e.g., electronegativity, atomic number) in spin-splitting and distal, steric effects in redox potential and bond lengths.
△ Less
Submitted 20 August, 2017;
originally announced August 2017.
-
Predicting Electronic Structure Properties of Transition Metal Complexes with Neural Networks
Authors:
Jon Paul Janet,
Heather J. Kulik
Abstract:
High-throughput computational screening has emerged as a critical component of materials discovery. Direct density functional theory (DFT) simulation of inorganic materials and molecular transition metal complexes is often used to describe subtle trends in inorganic bonding and spin-state ordering, but these calculations are computationally costly and properties are sensitive to the exchange-corre…
▽ More
High-throughput computational screening has emerged as a critical component of materials discovery. Direct density functional theory (DFT) simulation of inorganic materials and molecular transition metal complexes is often used to describe subtle trends in inorganic bonding and spin-state ordering, but these calculations are computationally costly and properties are sensitive to the exchange-correlation functional employed. To begin to overcome these challenges, we trained artificial neural networks (ANNs) to predict quantum-mechanically-derived properties, including spin-state ordering, sensitivity to Hartree-Fock exchange, and spin- state specific bond lengths in transition metal complexes. Our ANN is trained on a small set of inorganic-chemistry-appropriate empirical inputs that are both maximally transferable and do not require precise three-dimensional structural information for prediction. Using these descriptors, our ANN predicts spin-state splittings of single-site transition metal complexes (i.e., Cr-Ni) at arbitrary amounts of Hartree-Fock exchange to within 3 kcal/mol accuracy of DFT calculations. Our exchange-sensitivity ANN enables improved predictions on a diverse test set of experimentally-characterized transition metal complexes by extrapolation from semi-local DFT to hybrid DFT. The ANN also outperforms other machine learning models (i.e., support vector regression and kernel ridge regression), demonstrating particularly improved performance in transferability, as measured by prediction errors on the diverse test set. We establish the value of new uncertainty quantification tools to estimate ANN prediction uncertainty in computational chemistry, and we provide additional heuristics for identification of when a compound of interest is likely to be poorly predicted by the ANN.
△ Less
Submitted 19 February, 2017;
originally announced February 2017.