-
Twins in rotational spectroscopy: Does a rotational spectrum uniquely identify a molecule?
Authors:
Marcus Schwarting,
Nathan A. Seifert,
Michael J. Davis,
Ben Blaiszik,
Ian Foster,
Kirill Prozument
Abstract:
Rotational spectroscopy is the most accurate method for determining structures of molecules in the gas phase. It is often assumed that a rotational spectrum is a unique "fingerprint" of a molecule. The availability of large molecular databases and the development of artificial intelligence methods for spectroscopy make the testing of this assumption timely. In this paper, we pose the determination of molecular structures from rotational spectra as an inverse problem. Within this framework, we adopt a funnel-based approach to search for molecular twins, that is, two or more molecules that have similar rotational spectra but distinctly different molecular structures. We demonstrate that there are twins within standard levels of computational accuracy by generating rotational constants for many molecules from several large molecular databases, indicating that the inverse problem is ill-posed. However, some twins can be distinguished by increasing the accuracy of the theoretical methods or by performing additional experiments.
Submitted 5 April, 2024;
originally announced April 2024.
-
14 Examples of How LLMs Can Transform Materials Science and Chemistry: A Reflection on a Large Language Model Hackathon
Authors:
Kevin Maik Jablonka,
Qianxiang Ai,
Alexander Al-Feghali,
Shruti Badhwar,
Joshua D. Bocarsly,
Andres M Bran,
Stefan Bringuier,
L. Catherine Brinson,
Kamal Choudhary,
Defne Circi,
Sam Cox,
Wibe A. de Jong,
Matthew L. Evans,
Nicolas Gastellu,
Jerome Genzling,
María Victoria Gil,
Ankur K. Gupta,
Zhi Hong,
Alishba Imran,
Sabine Kruschwitz,
Anne Labarre,
Jakub Lála,
Tao Liu,
Steven Ma,
Sauradeep Majumdar, et al. (28 additional authors not shown)
Abstract:
Large-language models (LLMs) such as GPT-4 caught the interest of many scientists. Recent studies suggested that these models could be useful in chemistry and materials science. To explore these possibilities, we organized a hackathon.
This article chronicles the projects built as part of this hackathon. Participants employed LLMs for various applications, including predicting properties of molecules and materials, designing novel interfaces for tools, extracting knowledge from unstructured data, and developing new educational applications.
The diverse topics and the fact that working prototypes could be generated in less than two days highlight that LLMs will profoundly impact the future of our fields. The rich collection of ideas and projects also indicates that the applications of LLMs are not limited to materials science and chemistry but offer potential benefits to a wide range of scientific disciplines.
Submitted 14 July, 2023; v1 submitted 9 June, 2023;
originally announced June 2023.
-
HydroNet: Benchmark Tasks for Preserving Intermolecular Interactions and Structural Motifs in Predictive and Generative Models for Molecular Data
Authors:
Sutanay Choudhury,
Jenna A. Bilbrey,
Logan Ward,
Sotiris S. Xantheas,
Ian Foster,
Joseph P. Heindel,
Ben Blaiszik,
Marcus E. Schwarting
Abstract:
Intermolecular and long-range interactions are central to phenomena as diverse as gene regulation, topological states of quantum materials, electrolyte transport in batteries, and the universal solvation properties of water. We present a set of challenge problems for preserving intermolecular interactions and structural motifs in machine-learning approaches to chemical problems, through the use of a recently published dataset of 4.95 million water clusters held together by hydrogen bonding interactions and resulting in longer range structural patterns. The dataset provides spatial coordinates as well as two types of graph representations, to accommodate a variety of machine-learning practices.
Submitted 30 November, 2020;
originally announced December 2020.
-
A Data Ecosystem to Support Machine Learning in Materials Science
Authors:
Ben Blaiszik,
Logan Ward,
Marcus Schwarting,
Jonathon Gaff,
Ryan Chard,
Daniel Pike,
Kyle Chard,
Ian Foster
Abstract:
Facilitating the application of machine learning to materials science problems will require enhancing the data ecosystem to enable discovery and collection of data from many sources, automated dissemination of new data across the ecosystem, and the connecting of data with materials-specific machine learning models. Here, we present two projects, the Materials Data Facility (MDF) and the Data and Learning Hub for Science (DLHub), that address these needs. We use examples to show how MDF and DLHub capabilities can be leveraged to link data with machine learning models and how users can access those capabilities through web and programmatic interfaces.
Submitted 20 July, 2019; v1 submitted 23 April, 2019;
originally announced April 2019.