-
Training OOD Detectors in their Natural Habitats
Authors:
Julian Katz-Samuels,
Julia Nakhleh,
Robert Nowak,
Yixuan Li
Abstract:
Out-of-distribution (OOD) detection is important for machine learning models deployed in the wild. Recent methods use auxiliary outlier data to regularize the model for improved OOD detection. However, these approaches make a strong distributional assumption that the auxiliary outlier data is completely separable from the in-distribution (ID) data. In this paper, we propose a novel framework that leverages wild mixture data, which naturally consists of both ID and OOD samples. Such wild data is abundant and arises freely upon deploying machine learning classifiers in their natural habitats. Our key idea is to formulate a constrained optimization problem and to show how to tractably solve it. Our learning objective maximizes the OOD detection rate, subject to constraints on the classification error of ID data and on the OOD error rate of ID examples. We extensively evaluate our approach on common OOD detection tasks and demonstrate superior performance.
Submitted 28 June, 2022; v1 submitted 7 February, 2022;
originally announced February 2022.
-
Identifying Entangled Physics Relationships through Sparse Matrix Decomposition to Inform Plasma Fusion Design
Authors:
M. Giselle Fernández-Godino,
Michael J. Grosskopf,
Julia B. Nakhleh,
Brandon M. Wilson,
John Kline,
Gowri Srinivasan
Abstract:
A sustainable burn platform through inertial confinement fusion (ICF) has been an ongoing challenge for over 50 years. Mitigating engineering limitations and improving the current design involves an understanding of the complex coupling of physical processes. While sophisticated simulation codes are used to model ICF implosions, these tools contain necessary numerical approximations but miss physical processes that limit predictive capability. Identification of relationships between controllable design inputs to ICF experiments and measurable outcomes (e.g. yield, shape) from performed experiments can help guide the future design of experiments and development of simulation codes, to potentially improve the accuracy of the computational models used to simulate ICF experiments. We use sparse matrix decomposition methods to identify clusters of a few related design variables. Sparse principal component analysis (SPCA) identifies groupings that are related to the physical origin of the variables (laser, hohlraum, and capsule). A variable importance analysis finds that in addition to variables highly correlated with neutron yield such as picket power and laser energy, variables that represent a dramatic change of the ICF design such as number of pulse steps are also very important. The obtained sparse components are then used to train a random forest (RF) surrogate for predicting total yield. The RF performance on the training and testing data is comparable to the performance of the RF surrogate trained using all design variables considered. This work is intended to inform design changes in future ICF experiments by augmenting expert intuition and simulation results.
Submitted 28 October, 2020;
originally announced October 2020.
-
Exploring Sensitivity of ICF Outputs to Design Parameters in Experiments Using Machine Learning
Authors:
Julia B. Nakhleh,
M. Giselle Fernández-Godino,
Michael J. Grosskopf,
Brandon M. Wilson,
John Kline,
Gowri Srinivasan
Abstract:
Building a sustainable burn platform in inertial confinement fusion (ICF) requires an understanding of the complex coupling of physical processes and the effects that key experimental design changes have on implosion performance. While simulation codes are used to model ICF implosions, incomplete physics and the need for approximations deteriorate their predictive capability. Identification of relationships between controllable design inputs and measurable outcomes can help guide the future design of experiments and development of simulation codes, which can potentially improve the accuracy of the computational models used to simulate ICF implosions. In this paper, we leverage developments in machine learning (ML) and methods for ML feature importance/sensitivity analysis to identify complex relationships in ways that are difficult to process using expert judgment alone. We present work using random forest (RF) regression for prediction of yield, velocity, and other experimental outcomes given a suite of design parameters, along with an assessment of important relationships and uncertainties in the prediction model. We show that RF models are capable of learning and predicting on ICF experimental data with high accuracy, and we extract feature importance metrics that provide insight into the physical significance of different controllable design inputs for various ICF design configurations. These results can be used to augment expert intuition and simulation results for optimal design of future ICF experiments.
Submitted 1 September, 2021; v1 submitted 8 October, 2020;
originally announced October 2020.