-
Identification of high-reliability regions of machine learning predictions in materials science using transparent conducting oxides and perovskites as examples
Authors:
Evan M. Askanazi,
Emanuel A. Lazar,
Ilya Grinberg
Abstract:
Progress in the application of machine learning (ML) methods to materials design is hindered by the lack of understanding of the reliability of ML predictions, in particular for the application of ML to the small data sets often found in materials science. Using ML prediction for transparent conducting oxide formation energy and band gap, dilute solute diffusion, and perovskite formation energy, band gap, and lattice parameter as examples, we demonstrate that (1) analysis of ML results by construction of a convex hull in feature space that encloses accurately predicted systems can be used to identify regions in feature space for which ML predictions are highly reliable; (2) analysis of the systems enclosed by the convex hull can be used to extract physical understanding; and (3) materials that satisfy all well-known chemical and physical principles that make a material physically reasonable are likely to be similar and show strong relationships between the properties of interest and the standard features used in ML. We also show that, similar to the composition-structure-property relationships, inclusion in the ML training data set of materials from classes with different chemical properties will not be beneficial and will slightly decrease the accuracy of ML prediction, and that reliable results will likely be obtained by an ML model for narrow classes of similar materials even when the model shows large errors on a data set consisting of several classes of materials. Our work suggests that analysis of the error distributions of ML predictions will be beneficial for the further development of the application of ML methods in materials science.
Submitted 5 April, 2023;
originally announced April 2023.
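The convex-hull idea in the abstract above can be illustrated with a small sketch. This is not the authors' implementation: it assumes a two-dimensional feature space, a hypothetical error tolerance `tol`, and made-up feature and error values, and it uses the standard monotone-chain hull algorithm. Real feature spaces would typically have more dimensions and need a general-purpose hull routine.

```python
def cross(o, a, b):
    """z-component of (a - o) x (b - o); positive for a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Monotone-chain convex hull of 2D points, counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def inside_hull(hull, p):
    """True if p lies inside or on a counter-clockwise convex polygon."""
    n = len(hull)
    if n < 3:
        return False
    return all(cross(hull[i], hull[(i + 1) % n], p) >= 0 for i in range(n))

# Hypothetical data: (feature_1, feature_2) per system and the absolute
# ML prediction error for each system.
features = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1), (5, 5)]
errors = [0.05, 0.04, 0.06, 0.03, 0.02, 0.90]
tol = 0.1  # assumed threshold for "accurately predicted"

# Enclose only the accurately predicted systems.
accurate = [f for f, e in zip(features, errors) if e <= tol]
hull = convex_hull(accurate)

# A new system is flagged as lying in the high-reliability region
# if its feature vector falls inside the hull.
print(inside_hull(hull, (1.0, 0.5)))
print(inside_hull(hull, (3.0, 3.0)))
```

In this toy setting the hull is built from the well-predicted systems only, so the poorly predicted outlier does not enlarge the region treated as reliable.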
-
Distance-based Analysis of Machine Learning Prediction Reliability for Datasets in Materials Science and Other Fields
Authors:
Evan Askanazi,
Ilya Grinberg
Abstract:
Despite successful use in a wide variety of disciplines for data analysis and prediction, machine learning (ML) methods suffer from a lack of understanding of the reliability of predictions due to the lack of transparency and black-box nature of ML models. In materials science and other fields, typical ML model results include a significant number of low-quality predictions. This problem is known to be particularly acute for target systems which differ significantly from the data used for ML model training. However, to date, a general method for characterization of the difference between the predicted and training systems has not been available. Here, we show that a simple metric based on Euclidean feature space distance and sampling density allows effective separation of the accurately predicted data points from data points with poor prediction accuracy. We show that the metric effectiveness is enhanced by the decorrelation of the features using Gram-Schmidt orthogonalization. To demonstrate the generality of the method, we apply it to support vector regression models for various small data sets in materials science and other fields. Our method is computationally simple, can be used with any ML method, and enables analysis of the sources of the ML prediction errors. Therefore, it is suitable for use as a standard technique for the estimation of ML prediction reliability for small data sets and as a tool for data set design.
Submitted 3 April, 2023;
originally announced April 2023.
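A rough sketch of the two ingredients named in the abstract above, Gram-Schmidt decorrelation of features and a Euclidean distance to the training data, might look like the following. The abstract does not give the exact metric, so the mean distance to the `k` nearest training points is used here purely as an assumed stand-in; the function names and data are hypothetical, and the orthogonalized features are left unnormalized.

```python
import math

def fit_gram_schmidt(rows):
    """Learn column means and Gram-Schmidt projection coefficients
    from training rows (one feature vector per row)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    cols = [[r[j] - means[j] for r in rows] for j in range(d)]
    ortho, coeffs = [], []
    for j in range(d):
        u = cols[j][:]
        c_j = []
        for i in range(j):
            denom = sum(a * a for a in ortho[i])
            num = sum(a * b for a, b in zip(cols[j], ortho[i]))
            c = num / denom if denom else 0.0
            c_j.append(c)
            # Remove the component of column j along orthogonal column i.
            u = [a - c * b for a, b in zip(u, ortho[i])]
        ortho.append(u)
        coeffs.append(c_j)
    return means, coeffs

def transform(point, means, coeffs):
    """Apply the learned centering and decorrelation to one feature vector."""
    z = []
    for j, x in enumerate(point):
        val = x - means[j]
        for i, c in enumerate(coeffs[j]):
            val -= c * z[i]
        z.append(val)
    return z

def knn_distance(point, train, k=3):
    """Assumed reliability score: mean Euclidean distance to the k nearest
    training points (larger means farther from well-sampled regions)."""
    dists = sorted(math.dist(point, t) for t in train)
    k = min(k, len(dists))
    return sum(dists[:k]) / k

# Hypothetical training features with two strongly correlated columns.
train = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.1), (3.0, 2.9)]
means, coeffs = fit_gram_schmidt(train)
train_z = [transform(r, means, coeffs) for r in train]

near = transform((1.5, 1.5), means, coeffs)
far = transform((10.0, -5.0), means, coeffs)
print(knn_distance(near, train_z), knn_distance(far, train_z))
```

Storing the projection coefficients lets the same linear transform be applied to unseen points, so training and test systems are compared in the same decorrelated coordinates.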
-
Measurement of the Parity-Odd Angular Distribution of Gamma Rays From Polarized Neutron Capture on $^{35}$Cl
Authors:
N. Fomin,
R. Alarcon,
L. Alonzi,
E. Askanazi,
S. Baeßler,
S. Balascuta,
L. Barrón-Palos,
A. Barzilov,
D. Blyth,
J. D. Bowman,
N. Birge,
J. R. Calarco,
T. E. Chupp,
V. Cianciolo,
C. E. Coppola,
C. B. Crawford,
K. Craycraft,
D. Evans,
C. Fieseler,
E. Frlež,
J. Fry,
I. Garishvili,
M. T. W. Gericke,
R. C. Gillis,
K. B. Grammer
, et al. (39 additional authors not shown)
Abstract:
We report a measurement of two energy-weighted gamma cascade angular distributions from polarized slow neutron capture on the ${}^{35}$Cl nucleus, one parity-odd correlation proportional to $\vec{s_{n}} \cdot \vec{k_γ}$ and one parity-even correlation proportional to $\vec{s_{n}} \cdot \vec{k_{n}} \times \vec{k_γ}$. A parity violating asymmetry can appear in this reaction due to the weak nucleon-nucleon (NN) interaction which mixes opposite parity S and P-wave levels in the excited compound $^{36}$Cl nucleus formed upon slow neutron capture. If parity-violating (PV) and parity-conserving (PC) terms both exist, the measured differential cross section can be related to them via $\frac{dσ}{dΩ}\propto1+A_{γ, PV}\cosθ+A_{γ,PC}\sinθ$. The PV and PC asymmetries for energy-weighted gamma cascade angular distributions for polarized slow neutron capture on $^{35}$Cl averaged over the neutron energies from 2.27~meV to 9.53~meV were measured to be $A_{γ,PV}=(-23.9\pm0.7)\times 10^{-6}$ and $A_{γ,PC}=(0.1\pm0.7)\times 10^{-6}$. These results are consistent with previous experimental results. Systematic errors were quantified and shown to be small compared to the statistical error. These asymmetries in the angular distributions of the gamma rays emitted from the capture of polarized neutrons in $^{35}$Cl were used to verify the operation and data analysis procedures for the NPDGamma experiment which measured the parity-odd asymmetry in the angular distribution of gammas from polarized slow neutron capture on protons.
Submitted 22 July, 2022;
originally announced July 2022.
-
First Observation of $P$-odd $γ$ Asymmetry in Polarized Neutron Capture on Hydrogen
Authors:
D. Blyth,
J. Fry,
N. Fomin,
R. Alarcon,
L. Alonzi,
E. Askanazi,
S. Baeßler,
S. Balascuta,
L. Barrón-Palos,
A. Barzilov,
J. D. Bowman,
N. Birge,
J. R. Calarco,
T. E. Chupp,
V. Cianciolo,
C. E. Coppola,
C. B. Crawford,
K. Craycraft,
D. Evans,
C. Fieseler,
E. Frlež,
I. Garishvili,
M. T. W. Gericke,
R. C. Gillis,
K. B. Grammer
, et al. (39 additional authors not shown)
Abstract:
We report the first observation of the parity-violating 2.2 MeV gamma-ray asymmetry $A^{np}_γ$ in neutron-proton capture using polarized cold neutrons incident on a liquid parahydrogen target at the Spallation Neutron Source at Oak Ridge National Laboratory. $A^{np}_γ$ isolates the $ΔI=1$, \mbox{$^{3}S_{1}\rightarrow {^{3}P_{1}}$} component of the weak nucleon-nucleon interaction, which is dominated by pion exchange and can be directly related to a single coupling constant in either the DDH meson exchange model or pionless EFT. We measured $A^{np}_γ= [-3.0 \pm 1.4 (stat) \pm 0.2 (sys)]\times 10^{-8}$, which implies a DDH weak $πNN$ coupling of $h_π^{1} = [2.6 \pm 1.2(stat) \pm 0.2(sys)] \times 10^{-7}$ and a pionless EFT constant of $C^{^{3}S_{1}\rightarrow ^{3}P_{1}}/C_{0}=[-7.4 \pm 3.5 (stat) \pm 0.5 (sys)] \times 10^{-11}$ MeV$^{-1}$. We describe the experiment, data analysis, systematic uncertainties, and the implications of the result.
Submitted 14 December, 2018; v1 submitted 26 July, 2018;
originally announced July 2018.
-
Bernstein Polynomials based Probabilistic Interpretation of Quark Hadron Duality
Authors:
Evan Askanazi,
Simonetta Liuti
Abstract:
It is now widely recognized that large Bjorken $x$ data play an important role in global analyses of Parton Distribution Functions (PDFs) even at collider energies, through perturbative QCD evolution. For values of the scale of the reaction, $Q^2$, in the multi-GeV region the structure functions at large $x$ present resonance structure. Notwithstanding, these data can be incorporated in the analyses by using quark-hadron duality or approximate scaling of the structure function data averaged over their resonance structure. Several averaging methods have been proposed using either the PDFs Mellin moments, or their truncated moments. We propose an alternative method using Bernstein polynomials integrals, or Bernstein moments. Bernstein moments render a smooth form of the structure function in the resonance region. Furthermore, being based on a different averaging criterion than the methods adopted so far, they provide a new framework for understanding the possible mechanisms giving origin to the phenomenon of quark-hadron duality.
Submitted 14 November, 2017; v1 submitted 6 October, 2017;
originally announced October 2017.
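As a rough numerical illustration of averaging with Bernstein polynomials, the sketch below integrates a function against a normalized Bernstein polynomial on [0, 1]. The normalization factor (n + 1), the midpoint-rule integration, and the toy function `toy_f` are assumptions made for this example; the exact definition of the Bernstein moments used in the paper is not given in the abstract.

```python
from math import comb

def bernstein_weight(n, k, x):
    """Normalized Bernstein polynomial (n+1) * C(n,k) * x^k * (1-x)^(n-k).
    It is non-negative on [0, 1] and integrates to 1, so it acts like a
    probability density peaked near x = k / n."""
    return (n + 1) * comb(n, k) * x**k * (1 - x) ** (n - k)

def bernstein_moment(f, n, k, steps=20000):
    """Midpoint-rule estimate of the integral of f(x) * weight over [0, 1]."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += f(x) * bernstein_weight(n, k, x)
    return total * h

def toy_f(x):
    """Hypothetical smooth stand-in for a structure function."""
    return x**0.5 * (1 - x) ** 3

# Weighted averages of toy_f over windows centered at increasing x.
for k in range(0, 7, 2):
    print(k, bernstein_moment(toy_f, 6, k))
```

Because each weight is a smooth bump of finite width, the resulting averages vary smoothly with k even if the integrand has local structure.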
-
Exploring Nucleon Structure with the Self-Organizing Maps Algorithm
Authors:
Evan M. Askanazi,
Katherine A. Holcomb,
Simonetta Liuti
Abstract:
We discuss the application of an alternative type of neural network, the Self-Organizing Map, to extract parton distribution functions from various hard scattering processes.
Submitted 10 November, 2014;
originally announced November 2014.
-
Self-Organizing Maps Parametrization of Deep Inelastic Structure Functions with Error Determination
Authors:
Evan Askanazi,
Katherine Holcomb,
Simonetta Liuti
Abstract:
We present and discuss a new method to extract parton distribution functions from hard scattering processes based on an alternative type of neural network, the Self-Organizing Map. Quantitative results, including a detailed treatment of uncertainties, are presented within a next-to-leading-order analysis of inclusive electron-proton deep inelastic scattering data.
Submitted 26 September, 2013;
originally announced September 2013.
-
Self-Organizing Maps Algorithm for Parton Distribution Functions Extraction
Authors:
S. Liuti,
K. Holcomb,
E. Askanazi
Abstract:
We describe a new method to extract parton distribution functions from hard scattering processes based on Self-Organizing Maps. The extension to a larger and more complex class of soft matrix elements, including generalized parton distributions, is also discussed.
Submitted 10 December, 2011;
originally announced December 2011.
-
Exact Statistical Mechanical Investigation of a Finite Model Protein in its environment: A Small System Paradigm
Authors:
P. D. Gujrati,
Bradley P. Lambeth Jr,
Andrea Corsi,
Evan Askanazi
Abstract:
We consider a general incompressible finite model protein of size M in its environment, which we represent by a semiflexible copolymer consisting of amino acid residues classified into only two species (H and P, see text) following Lau and Dill. We allow various interactions between chemically unbonded residues in a given sequence and the solvent (water), and exactly enumerate the number of conformations W(E) as a function of the energy E on an infinite lattice under two different conditions: (i) we allow conformations that are restricted to be compact (known as Hamilton walk conformations), and (ii) we allow unrestricted conformations that can also be non-compact. It is easily demonstrated using plausible arguments that our model does not possess any energy gap even though it is supposed to exhibit a sharp folding transition in the thermodynamic limit. The enumeration allows us to investigate exactly the effects of energetics on the native state(s), and the effect of small size on protein thermodynamics and, in particular, on the differences between the microcanonical and canonical ensembles. We find that the canonical entropy is much larger than the microcanonical entropy for finite systems. We investigate the property of self-averaging and conclude that small proteins do not self-average. We also present results that (i) provide some understanding of the energy landscape, and (ii) shed light on the free energy landscape at different temperatures.
Submitted 29 August, 2007; v1 submitted 28 August, 2007;
originally announced August 2007.