-
GeoMol: Torsional Geometric Generation of Molecular 3D Conformer Ensembles
Authors:
Octavian-Eugen Ganea,
Lagnajit Pattanaik,
Connor W. Coley,
Regina Barzilay,
Klavs F. Jensen,
William H. Green,
Tommi S. Jaakkola
Abstract:
Prediction of a molecule's 3D conformer ensemble from the molecular graph plays a key role in areas of cheminformatics and drug discovery. Existing generative models have several drawbacks, including failure to model important molecular geometry elements (e.g. torsion angles), separate optimization stages prone to error accumulation, and the need for structure fine-tuning based on approximate classical force-fields or computationally expensive methods such as metadynamics with approximate quantum mechanics calculations at each geometry. We propose GeoMol, an end-to-end, non-autoregressive, and SE(3)-invariant machine learning approach to generate distributions of low-energy molecular 3D conformers. Leveraging the power of message passing neural networks (MPNNs) to capture local and global graph information, we predict local atomic 3D structures and torsion angles, avoiding unnecessary over-parameterization of the geometric degrees of freedom (e.g. one angle per non-terminal bond). Such local predictions suffice both for the training loss computation and for the full deterministic conformer assembly (at test time). We devise a non-adversarial optimal-transport-based loss function to promote diverse conformer generation. GeoMol predominantly outperforms popular open-source, commercial, or state-of-the-art machine learning (ML) models, while achieving significant speed-ups. We expect such differentiable 3D structure generators to significantly impact molecular modeling and related applications.
Submitted 8 June, 2021;
originally announced June 2021.
-
Message Passing Networks for Molecules with Tetrahedral Chirality
Authors:
Lagnajit Pattanaik,
Octavian-Eugen Ganea,
Ian Coley,
Klavs F. Jensen,
William H. Green,
Connor W. Coley
Abstract:
Molecules with identical graph connectivity can exhibit different physical and biological properties if they exhibit stereochemistry, a spatial structural characteristic. However, modern neural architectures designed for learning structure-property relationships from molecular structures treat molecules as graph-structured data and therefore are invariant to stereochemistry. Here, we develop two custom aggregation functions for message passing neural networks to learn properties of molecules with tetrahedral chirality, one common form of stereochemistry. We evaluate performance on synthetic data as well as a newly proposed protein-ligand docking dataset with relevance to drug discovery. Results show modest improvements over a baseline sum aggregator, highlighting opportunities for further architecture development.
Submitted 4 December, 2020; v1 submitted 23 November, 2020;
originally announced December 2020.
-
Autonomous discovery in the chemical sciences part II: Outlook
Authors:
Connor W. Coley,
Natalie S. Eyke,
Klavs F. Jensen
Abstract:
This two-part review examines how automation has contributed to different aspects of discovery in the chemical sciences. In this second part, we reflect on a selection of exemplary studies. It is increasingly important to articulate what the role of automation and computation has been in the scientific process and how that has or has not accelerated discovery. One can argue that even the best automated systems have yet to "discover" despite being incredibly useful as laboratory assistants. We must carefully consider how they have been and can be applied to future problems of chemical discovery in order to effectively design and interact with future autonomous platforms.
The majority of this article defines a large set of open research directions, including improving our ability to work with complex data, build empirical models, automate both physical and computational experiments for validation, select experiments, and evaluate whether we are making progress toward the ultimate goal of autonomous discovery. Addressing these practical and methodological challenges will greatly advance the extent to which autonomous systems can make meaningful discoveries.
Submitted 30 March, 2020;
originally announced March 2020.
-
Autonomous discovery in the chemical sciences part I: Progress
Authors:
Connor W. Coley,
Natalie S. Eyke,
Klavs F. Jensen
Abstract:
This two-part review examines how automation has contributed to different aspects of discovery in the chemical sciences. In this first part, we describe a classification for discoveries of physical matter (molecules, materials, devices), processes, and models and how they are unified as search problems. We then introduce a set of questions and considerations relevant to assessing the extent of autonomy. Finally, we describe many case studies of discoveries accelerated by or resulting from computer assistance and automation from the domains of synthetic chemistry, drug discovery, inorganic chemistry, and materials science. These illustrate how rapid advancements in hardware automation and machine learning continue to transform the nature of experimentation and modelling.
Part two reflects on these case studies and identifies a set of open challenges for the field.
Submitted 30 March, 2020;
originally announced March 2020.
-
Microfluidics for Chemical Synthesis: Flow Chemistry
Authors:
Klavs F. Jensen
Abstract:
Klavs F. Jensen is Warren K. Lewis Professor in Chemical Engineering and Materials Science and Engineering at the Massachusetts Institute of Technology. Here he describes the use of microfluidics for chemical synthesis, from the early demonstration examples to the current efforts with automated droplet microfluidic screening and optimization techniques.
Submitted 10 January, 2018;
originally announced February 2018.
-
Direct Observation of Early-stage Quantum Dot Growth Mechanisms with High-temperature Ab Initio Molecular Dynamics
Authors:
Lisi Xie,
Qing Zhao,
Klavs F. Jensen,
Heather J. Kulik
Abstract:
Colloidal quantum dots (QDs) exhibit highly desirable size- and shape-dependent properties for applications from electronic devices to imaging. Indium phosphide QDs have emerged as a primary candidate to replace the more toxic CdSe QDs, but production of InP QDs with the desired properties lags behind other QD materials due to a poor understanding of how to tune the growth process. Using high-temperature ab initio molecular dynamics (AIMD) simulations, we report the first direct observation of the early-stage intermediates and subsequent formation of an InP cluster from separated indium and phosphorus precursors. In our simulations, indium agglomeration precedes formation of In-P bonds. We observe a predominantly intercomplex pathway in which In-P bonds form between one set of precursor copies while the carboxylate ligand of a second indium precursor in the agglomerated indium abstracts a ligand from the phosphorus precursor. This process produces an indium-rich cluster with structural properties comparable to those in bulk zinc-blende InP crystals. Minimum energy pathway characterization of the AIMD-sampled reaction events confirms these observations and identifies that In-carboxylate dissociation energetics solely determine the barrier along the In-P bond formation pathway, which is lower for intercomplex (13 kcal/mol) than intracomplex (21 kcal/mol) mechanisms. The phosphorus precursor chemistry, on the other hand, controls the thermodynamics of the reaction. Our observations of the differing roles of precursors in controlling QD formation strongly suggest that the challenges thus far encountered in InP QD synthesis optimization may be attributed to an overlooked need for a cooperative tuning strategy that simultaneously addresses the chemistry of both indium and phosphorus precursors.
Submitted 28 December, 2015;
originally announced December 2015.