-
Extracting Polymer Nanocomposite Samples from Full-Length Documents
Authors:
Ghazal Khalighinejad,
Defne Circi,
L. C. Brinson,
Bhuwan Dhingra
Abstract:
This paper investigates the use of large language models (LLMs) for extracting sample lists of polymer nanocomposites (PNCs) from full-length materials science research papers. The challenge lies in the complex nature of PNC samples, which have numerous attributes scattered throughout the text. The complexity of annotating detailed information on PNCs limits the availability of data, making conven…
▽ More
This paper investigates the use of large language models (LLMs) for extracting sample lists of polymer nanocomposites (PNCs) from full-length materials science research papers. The challenge lies in the complex nature of PNC samples, which have numerous attributes scattered throughout the text. The complexity of annotating detailed information on PNCs limits the availability of data, making conventional document-level relation extraction techniques impractical due to the challenge in creating comprehensive named entity span annotations. To address this, we introduce a new benchmark and an evaluation technique for this task and explore different prompting strategies in a zero-shot manner. We also incorporate self-consistency to improve the performance. Our findings show that even advanced LLMs struggle to extract all of the samples from an article. Finally, we analyze the errors encountered in this process, categorizing them into three main challenges, and discuss potential strategies for future research to overcome them.
△ Less
Submitted 29 February, 2024;
originally announced March 2024.
-
Uncertainty Quantification of Bandgaps in Acoustic Metamaterials with Stochastic Geometric Defects and Material Properties
Authors:
Han Zhang,
Rayehe Karimi Mahabadi,
Cynthia Rudin,
Johann Guilleminot,
L. Catherine Brinson
Abstract:
This paper studies the utility of techniques within uncertainty quantification, namely spectral projection and polynomial chaos expansion, in reducing sampling needs for characterizing acoustic metamaterial dispersion band responses given stochastic material properties and geometric defects. A novel method of encoding geometric defects in an interpretable, resolution independent is showcased in th…
▽ More
This paper studies the utility of techniques within uncertainty quantification, namely spectral projection and polynomial chaos expansion, in reducing sampling needs for characterizing acoustic metamaterial dispersion band responses given stochastic material properties and geometric defects. A novel method of encoding geometric defects in an interpretable, resolution independent is showcased in the formation of input space probability distributions. Orders of magnitude sampling reductions down to $\sim10^0$ and $\sim10^1$ are achieved in the 1D and 7D input space scenarios respectively while maintaining accurate output space probability distributions through combining Monte Carlo, quadrature rule, and sparse grid sampling with surrogate model fitting.
△ Less
Submitted 19 October, 2023;
originally announced October 2023.
-
14 Examples of How LLMs Can Transform Materials Science and Chemistry: A Reflection on a Large Language Model Hackathon
Authors:
Kevin Maik Jablonka,
Qianxiang Ai,
Alexander Al-Feghali,
Shruti Badhwar,
Joshua D. Bocarsly,
Andres M Bran,
Stefan Bringuier,
L. Catherine Brinson,
Kamal Choudhary,
Defne Circi,
Sam Cox,
Wibe A. de Jong,
Matthew L. Evans,
Nicolas Gastellu,
Jerome Genzling,
María Victoria Gil,
Ankur K. Gupta,
Zhi Hong,
Alishba Imran,
Sabine Kruschwitz,
Anne Labarre,
Jakub Lála,
Tao Liu,
Steven Ma,
Sauradeep Majumdar
, et al. (28 additional authors not shown)
Abstract:
Large-language models (LLMs) such as GPT-4 caught the interest of many scientists. Recent studies suggested that these models could be useful in chemistry and materials science. To explore these possibilities, we organized a hackathon.
This article chronicles the projects built as part of this hackathon. Participants employed LLMs for various applications, including predicting properties of mole…
▽ More
Large-language models (LLMs) such as GPT-4 caught the interest of many scientists. Recent studies suggested that these models could be useful in chemistry and materials science. To explore these possibilities, we organized a hackathon.
This article chronicles the projects built as part of this hackathon. Participants employed LLMs for various applications, including predicting properties of molecules and materials, designing novel interfaces for tools, extracting knowledge from unstructured data, and develo** new educational applications.
The diverse topics and the fact that working prototypes could be generated in less than two days highlight that LLMs will profoundly impact the future of our fields. The rich collection of ideas and projects also indicates that the applications of LLMs are not limited to materials science and chemistry but offer potential benefits to a wide range of scientific disciplines.
△ Less
Submitted 14 July, 2023; v1 submitted 9 June, 2023;
originally announced June 2023.
-
FAIR for AI: An interdisciplinary and international community building perspective
Authors:
E. A. Huerta,
Ben Blaiszik,
L. Catherine Brinson,
Kristofer E. Bouchard,
Daniel Diaz,
Caterina Doglioni,
Javier M. Duarte,
Murali Emani,
Ian Foster,
Geoffrey Fox,
Philip Harris,
Lukas Heinrich,
Shantenu Jha,
Daniel S. Katz,
Volodymyr Kindratenko,
Christine R. Kirkpatrick,
Kati Lassila-Perini,
Ravi K. Madduri,
Mark S. Neubauer,
Fotis E. Psomopoulos,
Avik Roy,
Oliver Rübel,
Zhizhen Zhao,
Ruike Zhu
Abstract:
A foundational set of findable, accessible, interoperable, and reusable (FAIR) principles were proposed in 2016 as prerequisites for proper data management and stewardship, with the goal of enabling the reusability of scholarly data. The principles were also meant to apply to other digital assets, at a high level, and over time, the FAIR guiding principles have been re-interpreted or extended to i…
▽ More
A foundational set of findable, accessible, interoperable, and reusable (FAIR) principles were proposed in 2016 as prerequisites for proper data management and stewardship, with the goal of enabling the reusability of scholarly data. The principles were also meant to apply to other digital assets, at a high level, and over time, the FAIR guiding principles have been re-interpreted or extended to include the software, tools, algorithms, and workflows that produce data. FAIR principles are now being adapted in the context of AI models and datasets. Here, we present the perspectives, vision, and experiences of researchers from different countries, disciplines, and backgrounds who are leading the definition and adoption of FAIR principles in their communities of practice, and discuss outcomes that may result from pursuing and incentivizing FAIR AI research. The material for this report builds on the FAIR for AI Workshop held at Argonne National Laboratory on June 7, 2022.
△ Less
Submitted 1 August, 2023; v1 submitted 30 September, 2022;
originally announced October 2022.
-
How to See Hidden Patterns in Metamaterials with Interpretable Machine Learning
Authors:
Zhi Chen,
Alexander Ogren,
Chiara Daraio,
L. Catherine Brinson,
Cynthia Rudin
Abstract:
Machine learning models can assist with metamaterials design by approximating computationally expensive simulators or solving inverse design problems. However, past work has usually relied on black box deep neural networks, whose reasoning processes are opaque and require enormous datasets that are expensive to obtain. In this work, we develop two novel machine learning approaches to metamaterials…
▽ More
Machine learning models can assist with metamaterials design by approximating computationally expensive simulators or solving inverse design problems. However, past work has usually relied on black box deep neural networks, whose reasoning processes are opaque and require enormous datasets that are expensive to obtain. In this work, we develop two novel machine learning approaches to metamaterials discovery that have neither of these disadvantages. These approaches, called shape-frequency features and unit-cell templates, can discover 2D metamaterials with user-specified frequency band gaps. Our approaches provide logical rule-based conditions on metamaterial unit-cells that allow for interpretable reasoning processes, and generalize well across design spaces of different resolutions. The templates also provide design flexibility where users can almost freely design the fine resolution features of a unit-cell without affecting the user's desired band gap.
△ Less
Submitted 1 October, 2022; v1 submitted 10 November, 2021;
originally announced November 2021.
-
Data-Centric Mixed-Variable Bayesian Optimization For Materials Design
Authors:
Akshay Iyer,
Yichi Zhang,
Aditya Prasad,
Siyu Tao,
Yixing Wang,
Linda Schadler,
L Catherine Brinson,
Wei Chen
Abstract:
Materials design can be cast as an optimization problem with the goal of achieving desired properties, by varying material composition, microstructure morphology, and processing conditions. Existence of both qualitative and quantitative material design variables leads to disjointed regions in property space, making the search for optimal design challenging. Limited availability of experimental dat…
▽ More
Materials design can be cast as an optimization problem with the goal of achieving desired properties, by varying material composition, microstructure morphology, and processing conditions. Existence of both qualitative and quantitative material design variables leads to disjointed regions in property space, making the search for optimal design challenging. Limited availability of experimental data and the high cost of simulations magnify the challenge. This situation calls for design methodologies that can extract useful information from existing data and guide the search for optimal designs efficiently. To this end, we present a data-centric, mixed-variable Bayesian Optimization framework that integrates data from literature, experiments, and simulations for knowledge discovery and computational materials design. Our framework pivots around the Latent Variable Gaussian Process (LVGP), a novel Gaussian Process technique which projects qualitative variables on a continuous latent space for covariance formulation, as the surrogate model to quantify "lack of data" uncertainty. Expected improvement, an acquisition criterion that balances exploration and exploitation, helps navigate a complex, nonlinear design space to locate the optimum design. The proposed framework is tested through a case study which seeks to concurrently identify the optimal composition and morphology for insulating polymer nanocomposites. We also present an extension of mixed-variable Bayesian Optimization for multiple objectives to identify the Pareto Frontier within tens of iterations. These findings project Bayesian Optimization as a powerful tool for design of engineered material systems.
△ Less
Submitted 4 July, 2019;
originally announced July 2019.
-
A Transfer Learning Approach for Microstructure Reconstruction and Structure-property Predictions
Authors:
Xiaolin Li,
Yichi Zhang,
He Zhao,
Craig Burkhart,
L Catherine Brinson,
Wei Chen
Abstract:
Stochastic microstructure reconstruction has become an indispensable part of computational materials science, but ongoing developments are specific to particular material systems. In this paper, we address this generality problem by presenting a transfer learning-based approach for microstructure reconstruction and structure-property predictions that is applicable to a wide range of material syste…
▽ More
Stochastic microstructure reconstruction has become an indispensable part of computational materials science, but ongoing developments are specific to particular material systems. In this paper, we address this generality problem by presenting a transfer learning-based approach for microstructure reconstruction and structure-property predictions that is applicable to a wide range of material systems. The proposed approach incorporates an encoder-decoder process and feature-matching optimization using a deep convolutional network. For microstructure reconstruction, model pruning is implemented in order to study the correlation between the microstructural features and hierarchical layers within the deep convolutional network. Knowledge obtained in model pruning is then leveraged in the development of a structure-property predictive model to determine the network architecture and initialization conditions. The generality of the approach is demonstrated numerically for a wide range of material microstructures with geometrical characteristics of varying complexity. Unlike previous approaches that only apply to specific material systems or require a significant amount of prior knowledge in model selection and hyper-parameter tuning, the present approach provides an off-the-shelf solution to handle complex microstructures, and has the potential of expediting the discovery of new materials.
△ Less
Submitted 7 May, 2018;
originally announced May 2018.