-
Liouville Flow Importance Sampler
Authors:
Yifeng Tian,
Nishant Panda,
Yen Ting Lin
Abstract:
We present the Liouville Flow Importance Sampler (LFIS), an innovative flow-based model for generating samples from unnormalized density functions. LFIS learns a time-dependent velocity field that deterministically transports samples from a simple initial distribution to a complex target distribution, guided by a prescribed path of annealed distributions. The training of LFIS utilizes a unique method that enforces the structure of a derived partial differential equation to neural networks modeling velocity fields. By considering the neural velocity field as an importance sampler, sample weights can be computed through accumulating errors along the sample trajectories driven by neural velocity fields, ensuring unbiased and consistent estimation of statistical quantities. We demonstrate the effectiveness of LFIS through its application to a range of benchmark problems, on many of which LFIS achieved state-of-the-art performance.
Submitted 9 June, 2024; v1 submitted 3 May, 2024;
originally announced May 2024.
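The abstract above describes computing sample weights so that statistics under the target distribution can be estimated from transported samples. As a rough illustration of that general idea only, here is a minimal sketch of generic self-normalized importance sampling in Python. It is not the LFIS algorithm: the Gaussian target, the Gaussian proposal, and all function names are assumptions chosen for the example, and the self-normalized estimator shown is consistent but not unbiased at finite sample size.

```python
import math
import random


def self_normalized_estimate(values, log_weights):
    """Weighted average of `values` using normalized importance weights."""
    m = max(log_weights)  # subtract the max before exponentiating, for stability
    weights = [math.exp(lw - m) for lw in log_weights]
    total = sum(weights)
    return sum(w * v for w, v in zip(weights, values)) / total


def log_target_unnormalized(x):
    # Hypothetical target: Gaussian with mean 1 and unit variance,
    # known only up to a normalizing constant.
    return -0.5 * (x - 1.0) ** 2


def log_proposal(x, scale=2.0):
    # Hypothetical proposal: zero-mean Gaussian with standard deviation `scale`.
    return -0.5 * (x / scale) ** 2 - math.log(scale * math.sqrt(2.0 * math.pi))


rng = random.Random(0)
samples = [rng.gauss(0.0, 2.0) for _ in range(20000)]
log_weights = [log_target_unnormalized(x) - log_proposal(x) for x in samples]

# Estimate the mean of the target distribution (its true value is 1).
estimate = self_normalized_estimate(samples, log_weights)
print(round(estimate, 2))
```

Working with log-weights and subtracting the maximum before exponentiating avoids overflow when the weights span many orders of magnitude.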
-
Semi-supervised Learning of Pushforwards For Domain Translation & Adaptation
Authors:
Nishant Panda,
Natalie Klein,
Dominic Yang,
Patrick Gasda,
Diane Oyen
Abstract:
Given two probability densities on related data spaces, we seek a map pushing one density to the other while satisfying application-dependent constraints. For maps to have utility in a broad application space (including domain translation, domain adaptation, and generative modeling), the map must be available to apply on out-of-sample data points and should correspond to a probabilistic model over the two spaces. Unfortunately, existing approaches, which are primarily based on optimal transport, do not address these needs. In this paper, we introduce a novel pushforward map learning algorithm that utilizes normalizing flows to parameterize the map. We first re-formulate the classical optimal transport problem to be map-focused and propose a learning algorithm to select from all possible maps under the constraint that the map minimizes a probability distance and application-specific regularizers; thus, our method can be seen as solving a modified optimal transport problem. Once the map is learned, it can be used to map samples from a source domain to a target domain. In addition, because the map is parameterized as a composition of normalizing flows, it models the empirical distributions over the two data spaces and allows both sampling and likelihood evaluation for both data sets. We compare our method (parOT) to related optimal transport approaches in the context of domain adaptation and domain translation on benchmark data sets. Finally, to illustrate the impact of our work on applied problems, we apply parOT to a real scientific application: spectral calibration for high-dimensional measurements from two vastly different environments.
Submitted 17 April, 2023;
originally announced April 2023.
-
Generative structured normalizing flow Gaussian processes applied to spectroscopic data
Authors:
Natalie Klein,
Nishant Panda,
Patrick Gasda,
Diane Oyen
Abstract:
In this work, we propose a novel generative model for mapping inputs to structured, high-dimensional outputs using structured conditional normalizing flows and Gaussian process regression. The model is motivated by the need to characterize uncertainty in the input/output relationship when making inferences on new data. In particular, in the physical sciences, limited training data may not adequately characterize future observed data; it is critical that models adequately indicate uncertainty, particularly when they may be asked to extrapolate. In our proposed model, structured conditional normalizing flows provide parsimonious latent representations that relate to the inputs through a Gaussian process, providing exact likelihood calculations and uncertainty that naturally increases away from the training data inputs. We demonstrate the methodology on laser-induced breakdown spectroscopy data from the ChemCam instrument onboard the Mars rover Curiosity. ChemCam was designed to recover the chemical composition of rock and soil samples by measuring the spectral properties of plasma atomic emissions induced by a laser pulse. We show that our model can generate realistic spectra conditional on a given chemical composition and that we can use the model to perform uncertainty quantification of chemical compositions for new observed spectra. Based on our results, we anticipate that our proposed modeling approach may be useful in other scientific domains with high-dimensional, complex structure where it is important to quantify predictive uncertainty.
Submitted 14 December, 2022;
originally announced December 2022.
-
BB-ML: Basic Block Performance Prediction using Machine Learning Techniques
Authors:
Hamdy Abdelkhalik,
Shamminuj Aktar,
Yehia Arafa,
Atanu Barai,
Gopinath Chennupati,
Nandakishore Santhi,
Nishant Panda,
Nirmal Prajapati,
Nazmul Haque Turja,
Stephan Eidenbenz,
Abdel-Hameed Badawy
Abstract:
Recent years have seen the adoption of Machine Learning (ML) techniques to predict the performance of large-scale applications, mostly at a coarse level. In contrast, we propose to use ML techniques for performance prediction at a much finer granularity, namely at the Basic Block (BB) level. Basic blocks are single-entry, single-exit code blocks that compilers use to break down a large code into manageable pieces. We extrapolate the basic block execution counts of GPU applications and use them for predicting the performance for large input sizes from the counts of smaller input sizes. We train a Poisson Neural Network (PNN) model using random input values as well as the lowest input values of the application to learn the relationship between inputs and basic block counts. Experimental results show that the model can accurately predict the basic block execution counts of 16 GPU benchmarks. We achieve an accuracy of 93.5% in extrapolating the basic block counts for large input sets when trained on smaller input sets and an accuracy of 97.7% in predicting basic block counts on random instances. In a case study, we apply the ML model to CUDA GPU benchmarks for performance prediction across a spectrum of applications. We use a variety of metrics for evaluation, including global memory requests and the active cycles of tensor cores, ALU, and FMA units. Results demonstrate the model's capability of predicting the performance of large datasets with an average error rate of 0.85% and 0.17% for global and shared memory requests, respectively. Additionally, to address the utilization of the main functional units in Ampere architecture GPUs, we calculate the active cycles for tensor cores, ALU, FMA, and FP64 units and achieve an average error of 2.3% and 10.66% for ALU and FMA units while the maximum observed error across all tested applications and units reaches 18.5%.
Submitted 11 November, 2023; v1 submitted 15 February, 2022;
originally announced February 2022.
-
Neural density estimation and uncertainty quantification for laser induced breakdown spectroscopy spectra
Authors:
Katiana Kontolati,
Natalie Klein,
Nishant Panda,
Diane Oyen
Abstract:
Constructing probability densities for inference in high-dimensional spectral data is often intractable. In this work, we use normalizing flows on structured spectral latent spaces to estimate such densities, enabling downstream inference tasks. In addition, we evaluate a method for uncertainty quantification when predicting unobserved state vectors associated with each spectrum. We demonstrate the capability of this approach on laser-induced breakdown spectroscopy data collected by the ChemCam instrument on the Mars rover Curiosity. Using our approach, we are able to generate realistic spectral samples and to accurately predict state vectors with associated well-calibrated uncertainties. We anticipate that this methodology will enable efficient probabilistic modeling of spectral data, leading to potential advances in several areas, including out-of-distribution detection and sensitivity analysis.
Submitted 16 August, 2021;
originally announced August 2021.
-
StressNet: Deep Learning to Predict Stress With Fracture Propagation in Brittle Materials
Authors:
Yinan Wang,
Diane Oyen,
Weihong Guo,
Anishi Mehta,
Cory Braker Scott,
Nishant Panda,
M. Giselle Fernández-Godino,
Gowri Srinivasan,
Xiaowei Yue
Abstract:
Catastrophic failure in brittle materials is often due to the rapid growth and coalescence of cracks aided by high internal stresses. Hence, accurate prediction of maximum internal stress is critical to predicting time to failure and improving the fracture resistance and reliability of materials. Existing high-fidelity methods, such as the Finite-Discrete Element Model (FDEM), are limited by their high computational cost. Therefore, to reduce computational cost while preserving accuracy, a novel deep learning model, "StressNet," is proposed to predict the entire sequence of maximum internal stress based on fracture propagation and the initial stress data. More specifically, the Temporal Independent Convolutional Neural Network (TI-CNN) is designed to capture the spatial features of fractures like fracture path and spall regions, and the Bidirectional Long Short-term Memory (Bi-LSTM) Network is adapted to capture the temporal features. By fusing these features, the evolution in time of the maximum internal stress can be accurately predicted. Moreover, an adaptive loss function is designed by dynamically integrating the Mean Squared Error (MSE) and the Mean Absolute Percentage Error (MAPE), to reflect the fluctuations in maximum internal stress. After training, the proposed model is able to compute accurate multi-step predictions of maximum internal stress in approximately 20 seconds, as compared to the FDEM run time of 4 hours, with an average MAPE of 2% relative to test data.
Submitted 20 November, 2020;
originally announced November 2020.
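The StressNet abstract above mentions a loss that integrates the Mean Squared Error (MSE) and the Mean Absolute Percentage Error (MAPE). The abstract does not say how the two terms are weighted dynamically, so the sketch below only shows a fixed convex combination in Python; the `alpha` weight, the `eps` guard, and the function names are assumptions for illustration, not the paper's formulation.

```python
def mse(y_true, y_pred):
    """Mean squared error over paired sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred, eps=1e-8):
    """Mean absolute percentage error, in percent.

    `eps` guards against division by zero when a true value is 0.
    """
    total = sum(abs(t - p) / max(abs(t), eps) for t, p in zip(y_true, y_pred))
    return 100.0 * total / len(y_true)


def combined_loss(y_true, y_pred, alpha):
    """Convex combination of MSE and MAPE.

    `alpha` in [0, 1] is a hypothetical weight; a dynamic scheme would
    update it during training by some rule not described here.
    """
    return alpha * mse(y_true, y_pred) + (1.0 - alpha) * mape(y_true, y_pred)


y_true = [100.0, 200.0]
y_pred = [110.0, 180.0]
print(combined_loss(y_true, y_pred, alpha=0.5))
```

MSE scales with the square of the target magnitude while MAPE is scale-free, so the two terms can differ by orders of magnitude; any weighting scheme has to account for that.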
-
Estimating Failure in Brittle Materials using Graph Theory
Authors:
M. K. Mudunuru,
N. Panda,
S. Karra,
G. Srinivasan,
V. T. Chau,
E. Rougier,
A. Hunter,
H. S. Viswanathan
Abstract:
In brittle fracture applications, failure paths, regions where failure occurs, and damage statistics are some of the key quantities of interest (QoI). High-fidelity models for brittle failure that accurately predict these QoI exist but are highly computationally intensive, making them infeasible to incorporate in upscaling and uncertainty quantification frameworks. The goal of this paper is to provide a fast heuristic to reasonably estimate quantities such as failure path and damage in the process of brittle failure. Towards this goal, we first present a method to predict failure paths under tensile loading conditions and low-strain rates. The method uses a $k$-nearest neighbors algorithm built on fracture process zone theory, and identifies the set of all possible pre-existing cracks that are likely to join early to form a large crack. The method then identifies the zone of failure and failure paths using weighted graph algorithms. We compare these failure paths to those computed with a high-fidelity model called the Hybrid Optimization Software Simulation Suite (HOSS). A probabilistic evolution model for average damage in a system is also developed that is trained using 150 HOSS simulations and tested on 40 simulations. A non-parametric approach based on confidence intervals is used to determine the damage evolution over time along the dominant failure path. For upscaling, damage is the key QoI needed as an input by the continuum models. This needs to be informed accurately by the surrogate models for calculating effective moduli at continuum-scale. We show that for the proposed average damage evolution model, the prediction accuracy on the test data is more than 90%. In terms of the computational time, the proposed models are $\approx \mathcal{O}(10^6)$ times faster compared to high-fidelity HOSS.
Submitted 30 July, 2018;
originally announced July 2018.
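The abstract above combines a $k$-nearest neighbors step with weighted graph algorithms to identify failure paths. The following Python sketch shows one generic way such a pipeline could look: connect each point to its $k$ nearest neighbors and then find a shortest path with Dijkstra's algorithm. The point coordinates, the use of Euclidean distance as edge weight, and the function names are all assumptions; the paper's fracture-process-zone criteria are not reproduced here.

```python
import heapq
import math


def knn_graph(points, k):
    """Undirected graph linking each point to its k nearest neighbors.

    Edge weights are Euclidean distances (an assumed choice).
    """
    graph = {i: {} for i in range(len(points))}
    for i, p in enumerate(points):
        dists = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        for d, j in dists[:k]:
            graph[i][j] = d
            graph[j][i] = d
    return graph


def shortest_path(graph, source, target):
    """Dijkstra's algorithm; returns (path, length) or (None, inf)."""
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    visited = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in visited:
            continue
        visited.add(u)
        if u == target:
            break
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if target not in dist:
        return None, math.inf
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1], dist[target]


# Hypothetical crack-center coordinates along a line.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
graph = knn_graph(points, k=1)
path, length = shortest_path(graph, 0, 3)
print(path, length)
```

In a setting like the one described, the edge weights would more plausibly encode how likely two cracks are to coalesce rather than plain distance.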