-
Machine Learning Materials Properties with Accurate Predictions, Uncertainty Estimates, Domain Guidance, and Persistent Online Accessibility
Authors:
Ryan Jacobs,
Lane E. Schultz,
Aristana Scourtas,
KJ Schmidt,
Owen Price-Skelly,
Will Engler,
Ian Foster,
Ben Blaiszik,
Paul M. Voyles,
Dane Morgan
Abstract:
One compelling vision of the future of materials discovery and design involves the use of machine learning (ML) models to predict materials properties and then rapidly find materials tailored for specific applications. However, realizing this vision requires both providing detailed uncertainty quantification (model prediction errors and domain of applicability) and making models readily usable. At present, it is common practice in the community to assess ML model performance only in terms of prediction accuracy (e.g., mean absolute error), while neglecting detailed uncertainty quantification and robust model accessibility and usability. Here, we demonstrate a practical method for realizing both uncertainty and accessibility features with a large set of models. We develop random forest ML models for 33 materials properties spanning an array of data sources (computational and experimental) and property types (electrical, mechanical, thermodynamic, etc.). All models have calibrated ensemble error bars to quantify prediction uncertainty and domain of applicability guidance enabled by kernel-density-estimate-based feature distance measures. All data and models are publicly hosted on the Garden-AI infrastructure, which provides an easy-to-use, persistent interface for model dissemination that permits models to be invoked with only a few lines of Python code. We demonstrate the power of this approach by using our models to conduct a fully ML-based materials discovery exercise to search for new stable, highly active perovskite oxide catalyst materials.
Submitted 21 June, 2024;
originally announced June 2024.
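
The abstract above mentions calibrated ensemble error bars. As a rough illustration of what calibration can mean, the sketch below rescales raw ensemble standard deviations by a single factor so that the scaled residuals have unit mean square. This one-parameter scaling, the function name `calibration_scale`, and the toy numbers are assumptions made for illustration; the abstract does not specify the calibration procedure the authors use.

```python
import math

def calibration_scale(residuals, sigmas):
    """Return the factor s that makes mean((r / (s * sigma))**2) equal to 1.

    This is the maximum-likelihood estimate of a single multiplicative
    correction if residuals are modelled as zero-mean Gaussians with
    standard deviation s * sigma (an assumption of this sketch).
    """
    ratios = [(r / s) ** 2 for r, s in zip(residuals, sigmas)]
    return math.sqrt(sum(ratios) / len(ratios))

# Hypothetical validation residuals and uncalibrated ensemble standard deviations.
residuals = [0.30, -0.45, 0.10, 0.80, -0.25]
sigmas = [0.10, 0.20, 0.05, 0.30, 0.15]

scale = calibration_scale(residuals, sigmas)
calibrated = [scale * s for s in sigmas]

# After rescaling, the mean squared standardized residual is 1 by construction.
mean_z2 = sum((r / c) ** 2 for r, c in zip(residuals, calibrated)) / len(residuals)
print(f"scale = {scale:.3f}, mean z^2 = {mean_z2:.3f}")
```

A scale above 1 indicates that the raw ensemble spread understates the observed errors on the validation data.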
-
Ultra-fast Oxygen Conduction in Sillén Oxychlorides
Authors:
Jun Meng,
Md Sariful Sheikh,
Lane E. Schultz,
William O. Nachlas,
Jian Liu,
Maciej P. Polak,
Ryan Jacobs,
Dane Morgan
Abstract:
Oxygen ion conductors are crucial for enhancing the efficiency of various clean energy technologies, including fuel cells, batteries, electrolyzers, membranes, sensors, and more. In this study, LaBi2O4Cl is identified as an ultra-fast oxygen conductor from the MBi2O4X (M=rare-earth element, X=halogen element) family, discovered by a structure-similarity analysis of >60k oxygen-containing compounds. Ab initio studies reveal that LaBi2O4Cl has an ultra-low migration barrier of 0.1 eV for oxygen vacancy, significantly lower than 0.6-0.8 eV for interstitial oxygen. Frenkel pairs are the dominant defects in intrinsic LaBi2O4Cl, facilitating notable oxygen diffusion primarily through vacancies at higher temperatures. LaBi2O4Cl with extrinsic oxygen vacancies (2.8%) exhibits a conductivity of 0.3 S/cm at 25 °C, maintains a 0.1 eV diffusion barrier up to 1100 °C, and transitions from extrinsic to mixed extrinsic and intrinsic behavior as the Frenkel pair concentration increases at higher temperatures. Experimental results on synthesized LaBi2O4Cl and Sr-doped LaBi2O4Cl demonstrate comparable or higher oxygen conductivity than YSZ and LSGM below 400 °C, with lower activation energies. Further experimental optimization of LaBi2O4Cl, including aliovalent doping and microstructure refinement, could significantly enhance its performance and efficiency, facilitating fast oxygen conduction approaching room temperature.
Submitted 11 June, 2024;
originally announced June 2024.
-
Determining Domain of Machine Learning Models using Kernel Density Estimates: Applications in Materials Property Prediction
Authors:
Lane E. Schultz,
Yiqi Wang,
Ryan Jacobs,
Dane Morgan
Abstract:
Knowledge of the domain of applicability of a machine learning model is essential to ensuring accurate and reliable model predictions. In this work, we develop a new approach for assessing model domain and demonstrate that our approach provides accurate and meaningful designation of in-domain versus out-of-domain when applied across multiple model types and material property data sets. Our approach assesses the distance between a test data point and the training data in feature space by using kernel density estimation and shows that this distance provides an effective tool for domain determination. We show that chemical groups considered unrelated based on established chemical knowledge exhibit significant dissimilarities by our measure. We also show that high measures of dissimilarity are associated with poor model performance (i.e., high residual magnitudes) and poor estimates of model uncertainty (i.e., unreliable uncertainty estimation). Automated tools are provided to enable researchers to establish acceptable dissimilarity thresholds to identify whether new predictions of their own machine learning models are in-domain versus out-of-domain.
Submitted 28 May, 2024;
originally announced June 2024.
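
To make the idea of a kernel-density-based distance in feature space concrete, here is a minimal sketch. It assumes an isotropic Gaussian kernel with a fixed bandwidth and defines dissimilarity as one minus the test-point density divided by the highest density found at any training point. The bandwidth, the normalization, the function names, and the toy data are all illustrative assumptions, not the definitions used in the paper.

```python
import math

def gaussian_kde_density(point, train, bandwidth):
    """Mean of isotropic Gaussian kernels centred on the training points."""
    dim = len(point)
    norm = (2.0 * math.pi * bandwidth ** 2) ** (dim / 2.0)
    total = 0.0
    for t in train:
        sq_dist = sum((p - q) ** 2 for p, q in zip(point, t))
        total += math.exp(-0.5 * sq_dist / bandwidth ** 2)
    return total / (len(train) * norm)

def dissimilarity(point, train, bandwidth=0.5):
    """Return a value in [0, 1]: near 0 inside dense training regions, near 1 far away."""
    reference = max(gaussian_kde_density(t, train, bandwidth) for t in train)
    ratio = gaussian_kde_density(point, train, bandwidth) / reference
    return max(0.0, 1.0 - ratio)  # clamp in case the test point is denser than any training point

# Hypothetical two-dimensional training features clustered around the origin.
train = [(0.0, 0.0), (0.1, -0.1), (-0.2, 0.1), (0.05, 0.2), (-0.1, -0.15)]

near = dissimilarity((0.0, 0.05), train)  # inside the training cloud
far = dissimilarity((5.0, 5.0), train)    # far from all training data
print(f"near = {near:.4f}, far = {far:.4f}")
```

A threshold on this value could then separate in-domain from out-of-domain predictions; where to place that threshold is a modelling decision that this sketch does not address.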
-
Accelerating Ensemble Error Bar Prediction with Single Models Fits
Authors:
Vidit Agrawal,
Shixin Zhang,
Lane E. Schultz,
Dane Morgan
Abstract:
Ensemble models can be used to estimate prediction uncertainties in machine learning models. However, an ensemble of N models is approximately N times more computationally demanding compared to a single model when it is used for inference. In this work, we explore fitting a single model to predicted ensemble error bar data, which allows us to estimate uncertainties without the need for a full ensemble. Our approach is based on three models: Model A for predictive accuracy, Model $A_{E}$ for traditional ensemble-based error bar prediction, and Model B, fit to data from Model $A_{E}$, to be used for predicting the values of $A_{E}$ but with only one model evaluation. Model B leverages synthetic data augmentation to estimate error bars efficiently. This approach offers a highly flexible method of uncertainty quantification that can approximate that of ensemble methods but only requires a single extra model evaluation over Model A during inference. We assess this approach on a set of problems in materials science.
Submitted 15 April, 2024;
originally announced April 2024.
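
The three-model scheme in the abstract above (Model A, Model $A_{E}$, Model B) can be sketched on a toy one-dimensional problem. Everything specific here is an assumption chosen to keep the example self-contained: the linear base model, the bootstrap ensemble of 50 members, the evenly spaced synthetic inputs, and the nearest-neighbour lookup standing in for Model B. The abstract does not state which model types or augmentation strategy the authors use.

```python
import random
import statistics

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

rng = random.Random(0)
xs = [i / 10 for i in range(30)]
ys = [2.0 * x + 1.0 + rng.gauss(0.0, 0.3) for x in xs]

# Model A: a single fit used for the property prediction itself.
model_a = fit_line(xs, ys)

# Model A_E: a bootstrap ensemble whose prediction spread is the error bar.
ensemble = []
for _ in range(50):
    idx = [rng.randrange(len(xs)) for _ in xs]
    ensemble.append(fit_line([xs[i] for i in idx], [ys[i] for i in idx]))

def ensemble_error_bar(x):
    """Standard deviation of the ensemble predictions at x (50 model evaluations)."""
    return statistics.stdev(slope * x + intercept for slope, intercept in ensemble)

# Synthetic augmentation: label inputs inside and beyond the training range with A_E.
synthetic_x = [-1.0 + 5.0 * k / 199 for k in range(200)]
synthetic_sigma = [ensemble_error_bar(x) for x in synthetic_x]

def model_b(x):
    """Model B stand-in: nearest synthetic point lookup, i.e. one cheap evaluation."""
    j = min(range(len(synthetic_x)), key=lambda k: abs(synthetic_x[k] - x))
    return synthetic_sigma[j]

for x in (0.5, 1.45, 3.7):
    print(f"x = {x}: ensemble = {ensemble_error_bar(x):.4f}, model B = {model_b(x):.4f}")
```

At inference time only `model_a` and `model_b` need to be evaluated, which is the cost saving the abstract describes; the ensemble is needed only to generate the training targets for Model B.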
-
Machine Learning Prediction of Critical Cooling Rate for Metallic Glasses From Expanded Datasets and Elemental Features
Authors:
Benjamin T. Afflerbach,
Carter Francis,
Lane E. Schultz,
Janine Spethson,
Vanessa Meschke,
Elliot Strand,
Logan Ward,
John H. Perepezko,
Dan Thoma,
Paul M. Voyles,
Izabela Szlufarska,
Dane Morgan
Abstract:
We use a random forest model to predict the critical cooling rate (RC) for glass formation of various alloys from features of their constituent elements. The random forest model was trained on a database that integrates multiple sources of direct and indirect RC data for metallic glasses to expand the directly measured RC database of fewer than 100 values to a training set of over 2,000 values. The model error on 5-fold cross validation is 0.66 orders of magnitude in K/s. The error on leave-one-group-out cross validation on alloy system groups is 0.59 log units in K/s when the target alloy constituents appear more than 500 times in training data. Using this model, we make predictions for the set of compositions with melt-spun glasses in the database, and for the full set of quaternary alloys that have constituents which appear more than 500 times in training data. These predictions identify a number of potential new bulk metallic glass (BMG) systems for future study, but the model is most useful for identification of alloy systems likely to contain good glass formers, rather than detailed discovery of bulk glass composition regions within known glassy systems.
Submitted 24 May, 2023;
originally announced May 2023.
-
Molecular Dynamic Characteristic Temperatures for Predicting Metallic Glass Forming Ability
Authors:
Lane E. Schultz,
Benjamin Afflerbach,
Izabela Szlufarska,
Dane Morgan
Abstract:
We explore the use of characteristic temperatures derived from molecular dynamics to predict aspects of metallic Glass Forming Ability (GFA). Temperatures derived from cooling curves of self-diffusion, viscosity, and energy were used as features for machine learning models of GFA. Multiple target and model combinations with these features were explored. First, we use the logarithm of critical casting thickness, $\log_{10}(D_{max})$, as the target and train regression models on 21 compositions. Application of 3-fold cross-validation on the 21 $\log_{10}(D_{max})$ alloys showed only weak correlation between the model predictions and the target values. Second, the GFA of alloys was quantified by melt-spinning or suction casting amorphization behavior, with alloys that showed crystalline phases after synthesis classified as Poor GFA and those with pure amorphous phases as Good GFA. Binary GFA classification was then modeled using decision tree-based methods (random forest and gradient boosting models) and assessed with nested cross-validation. The maximum F1 score for the precision-recall with Good Glass Forming Ability as the positive class was $0.82 \pm 0.01$ for the best model type. We also compared using simple functions of characteristic temperatures as features in place of the temperatures themselves and found no statistically significant difference in predictive abilities. Although the predictive ability of the models developed here is modest, this work demonstrates clearly that one can use molecular dynamics simulations and machine learning to predict metal glass forming ability.
Submitted 27 September, 2021;
originally announced September 2021.
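
The abstract above reports a maximum F1 score along the precision-recall curve. As a worked illustration of that metric, the sketch below sweeps the decision threshold over the predicted scores and keeps the best F1, with label 1 as the positive class. The labels, scores, and function names are hypothetical and are not taken from the paper.

```python
def f1_at_threshold(labels, scores, threshold):
    """F1 for the positive class (label 1) when predicting positive if score >= threshold."""
    tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
    fn = sum(1 for y, s in zip(labels, scores) if s < threshold and y == 1)
    if tp == 0:
        return 0.0  # precision and recall are both zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2.0 * precision * recall / (precision + recall)

def max_f1(labels, scores):
    """Largest F1 obtained by using each distinct score as the threshold."""
    return max(f1_at_threshold(labels, scores, t) for t in sorted(set(scores)))

# Hypothetical classifier output: 1 = Good GFA, 0 = Poor GFA.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

best = max_f1(labels, scores)
print(f"max F1 = {best:.3f}")
```

For this toy data the best threshold is 0.6, where precision is 3/4 and recall is 1, giving F1 = 6/7.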
-
Exploration of Characteristic Temperature Contributions to Metallic Glass Forming Ability
Authors:
Lane E. Schultz,
Benjamin Afflerbach,
Carter Francis,
Paul M. Voyles,
Izabela Szlufarska,
Dane Morgan
Abstract:
Various combinations of characteristic temperatures, such as the glass transition temperature, liquidus temperature, and crystallization temperature, have been proposed as predictors of the glass forming ability of metal alloys. We have used statistical approaches from machine learning to systematically explore a wide range of possible characteristic temperature functions for predicting glass forming ability in the form of critical casting diameter, $D_{max}$. Both linear and non-linear models were used to learn on the largest database of $D_{max}$ values to date, consisting of 747 compositions. We find that no combination of temperatures for features offers a better prediction of $D_{max}$ in a machine learning model than the temperatures themselves, and that regression models suffer from poor performance on standard machine learning metrics like root mean square error (minimum value of $3.3 \pm 0.1$ mm for data with a standard deviation of 4.8 mm). Examination of the errors vs. database size suggests that a larger database may improve results, although a database significantly larger than that used here would likely be required. Shifting the focus from regression to categorization, models learning from characteristic temperatures can be used to weakly distinguish glasses likely to be above vs. below our database's median $D_{max}$ value of 4.0 mm, with a mean F1 score of $0.77 \pm 0.02$ for this categorization. The overall weak results on predicting $D_{max}$ suggest that critical cooling rate might be a better target for machine learning model prediction.
Submitted 29 July, 2021;
originally announced July 2021.
-
Microalloying effect in ternary Al-Sm-X (X=Ag, Au, Cu) metallic glasses studied by ab initio molecular dynamics
Authors:
J. Xi,
G. Bokas,
L. E. Schultz,
M. Gao,
L. Zhao,
Y. Shen,
J. H. Perepezko,
D. Morgan,
I. Szlufarska
Abstract:
The icosahedral-like polyhedral fraction (ICO-like fraction) has been studied as a criterion for predicting the glass-forming ability of bulk ternary metallic glasses, Al90Sm8X2 (X = Al (binary), Cu, Ag, Au), using ab initio molecular dynamics (AIMD) simulations. We found that the ICO-like fraction can be determined with adequate precision to explore correlations with AIMD simulations. We then demonstrated that ICO-like fraction correlates with the critical cooling rate, which is a widely used intrinsic measure of glass forming ability. These results suggest that the ICO-like fraction from AIMD simulations may offer a useful guide for searching and screening for good glass formers.
Submitted 5 August, 2020;
originally announced August 2020.