-
OMNIINPUT: A Model-centric Evaluation Framework through Output Distribution
Authors:
Weitang Liu,
Ying Wai Li,
Tianle Wang,
Yi-Zhuang You,
**gbo Shang
Abstract:
We propose a novel model-centric evaluation framework, OmniInput, to evaluate the quality of an AI/ML model's predictions on all possible inputs (including human-unrecognizable ones), which is crucial for AI safety and reliability. Unlike traditional data-centric evaluation based on pre-defined test sets, the test set in OmniInput is self-constructed by the model itself and the model quality is ev…
▽ More
We propose a novel model-centric evaluation framework, OmniInput, to evaluate the quality of an AI/ML model's predictions on all possible inputs (including human-unrecognizable ones), which is crucial for AI safety and reliability. Unlike traditional data-centric evaluation based on pre-defined test sets, the test set in OmniInput is self-constructed by the model itself and the model quality is evaluated by investigating its output distribution. We employ an efficient sampler to obtain representative inputs and the output distribution of the trained model, which, after selective annotation, can be used to estimate the model's precision and recall at different output values and a comprehensive precision-recall curve. Our experiments demonstrate that OmniInput enables a more fine-grained comparison between models, especially when their performance is almost the same on pre-defined datasets, leading to new findings and insights for how to train more robust, generalizable models.
△ Less
Submitted 5 December, 2023;
originally announced December 2023.
-
Performance Analysis of CP2K Code for Ab Initio Molecular Dynamics
Authors:
Dewi Yokelson,
Nikolay V. Tkachenko,
Robert Robey,
Ying Wai Li,
Pavel A. Dub
Abstract:
Using a realistic molecular catalyst system, we conduct scaling studies of ab initio molecular dynamics simulations using the CP2K code on both Intel Xeon CPU and NVIDIA V100 GPU architectures. We explore using process placement and affinity to gain additional performance improvements. We also use statistical methods to understand performance changes in spite of the variability in runtime for each…
▽ More
Using a realistic molecular catalyst system, we conduct scaling studies of ab initio molecular dynamics simulations using the CP2K code on both Intel Xeon CPU and NVIDIA V100 GPU architectures. We explore using process placement and affinity to gain additional performance improvements. We also use statistical methods to understand performance changes in spite of the variability in runtime for each molecular dynamics timestep. We found ideal conditions for CPU runs included at least four MPI ranks per node, bound evenly across each socket, and fully utilizing processing cores with one OpenMP thread per core, no benefit was shown from reserving cores for the system. The CPU-only simulations scaled at 70% or more of the ideal scaling up to 10 compute nodes, after which the returns began to diminish more quickly. Simulations on a single 40-core node with two NVIDIA V100 GPUs for acceleration achieved over 3.7x speedup compared to the fastest single 36-core node CPU-only version, and showed 13% speedup over the fastest time we achieved across five CPU-only nodes.
△ Less
Submitted 9 September, 2021;
originally announced September 2021.
-
A scalable constructive algorithm for the optimization of neural network architectures
Authors:
Massimiliano Lupo Pasini,
Junqi Yin,
Ying Wai Li,
Markus Eisenbach
Abstract:
We propose a new scalable method to optimize the architecture of an artificial neural network. The proposed algorithm, called Greedy Search for Neural Network Architecture, aims to determine a neural network with minimal number of layers that is at least as performant as neural networks of the same structure identified by other hyperparameter search algorithms in terms of accuracy and computationa…
▽ More
We propose a new scalable method to optimize the architecture of an artificial neural network. The proposed algorithm, called Greedy Search for Neural Network Architecture, aims to determine a neural network with minimal number of layers that is at least as performant as neural networks of the same structure identified by other hyperparameter search algorithms in terms of accuracy and computational cost. Numerical results performed on benchmark datasets show that, for these datasets, our method outperforms state-of-the-art hyperparameter optimization algorithms in terms of attainable predictive performance by the selected neural network architecture, and time-to-solution for the hyperparameter optimization to complete.
△ Less
Submitted 21 April, 2021; v1 submitted 7 September, 2019;
originally announced September 2019.