-
SAGIPS: A Scalable Asynchronous Generative Inverse Problem Solver
Authors:
Daniel Lersch,
Malachi Schram,
Zhenyu Dai,
Kishansingh Rajput,
Xingfu Wu,
N. Sato,
J. Taylor Childers
Abstract:
Large-scale deep learning algorithms for solving inverse problems have become an essential part of modern research and industrial applications. The complexity of the underlying inverse problem often poses challenges to the algorithm and requires the proper utilization of high-performance computing systems. Because of their design, most deep learning algorithms require custom parallelization techniques in order to be resource efficient while showing reasonable convergence. In this paper we introduce SAGIPS, a Scalable Asynchronous Generative Inverse Problem Solver for high-performance computing systems. We present a workflow that utilizes a parallelization approach where the gradients of the generator network are updated in an asynchronous ring-all-reduce fashion. Experiments with a scientific proxy application demonstrate that SAGIPS shows near linear weak scaling, together with a convergence quality that is comparable to traditional methods. The approach presented here allows leveraging GANs across multiple GPUs, promising advancements in solving complex inverse problems at scale.
Submitted 11 June, 2024;
originally announced July 2024.
-
Robust Errant Beam Prognostics with Conditional Modeling for Particle Accelerators
Authors:
Kishansingh Rajput,
Malachi Schram,
Willem Blokland,
Yasir Alanazi,
Pradeep Ramuhalli,
Alexander Zhukov,
Charles Peters,
Ricardo Vilalta
Abstract:
Particle accelerators are complex and comprise thousands of components, with many pieces of equipment running at their peak power. Consequently, particle accelerators can fault and abort operations for numerous reasons. These faults impact the availability of particle accelerators during scheduled run-time and hamper the efficiency and the overall science output. To avoid these faults, we apply anomaly detection techniques to predict any unusual behavior and perform preemptive actions to improve the total availability of particle accelerators. Semi-supervised Machine Learning (ML) based anomaly detection approaches such as autoencoders and variational autoencoders are often used for such tasks. However, supervised ML techniques such as Siamese Neural Network (SNN) models can outperform unsupervised or semi-supervised approaches for anomaly detection by leveraging the label information. One of the challenges specific to anomaly detection for particle accelerators is the data's variability due to system configuration changes. To address this challenge, we employ Conditional Siamese Neural Network (CSNN) models and Conditional Variational Auto Encoder (CVAE) models to predict errant beam pulses at the Spallation Neutron Source (SNS) under different system configuration conditions and compare their performance. We demonstrate that CSNN outperforms CVAE in our application.
Submitted 19 February, 2024; v1 submitted 22 November, 2023;
originally announced December 2023.
-
Dataset for Investigating Anomalies in Compute Clusters
Authors:
Diana McSpadden,
Yasir Alanazi,
Bryan Hess,
Laura Hild,
Mark Jones,
Yiyang Lub,
Ahmed Mohammed,
Wesley Moore,
Jie Ren,
Malachi Schram,
Evgenia Smirni
Abstract:
The dataset was collected for 332 compute nodes throughout May 19 - 23, 2023. May 19 - 22 characterizes normal compute cluster behavior, while May 23 includes an anomalous event. The dataset includes eight CPU, 11 disk, 47 memory, and 22 Slurm metrics. It represents five distinct hardware configurations and contains over one million records, totaling more than 180GB of raw data.
Submitted 31 October, 2023;
originally announced November 2023.
-
Semi-Supervised Learning of Dynamical Systems with Neural Ordinary Differential Equations: A Teacher-Student Model Approach
Authors:
Yu Wang,
Yuxuan Yin,
Karthik Somayaji Nanjangud Suryanarayana,
Jan Drgona,
Malachi Schram,
Mahantesh Halappanavar,
Frank Liu,
Peng Li
Abstract:
Modeling dynamical systems is crucial for a wide range of tasks, but it remains challenging due to complex nonlinear dynamics, limited observations, or lack of prior knowledge. Recently, data-driven approaches such as Neural Ordinary Differential Equations (NODE) have shown promising results by leveraging the expressive power of neural networks to model unknown dynamics. However, these approaches often suffer from limited labeled training data, leading to poor generalization and suboptimal predictions. On the other hand, semi-supervised algorithms can utilize abundant unlabeled data and have demonstrated good performance in classification and regression tasks. We propose TS-NODE, the first semi-supervised approach to modeling dynamical systems with NODE. TS-NODE explores cheaply generated synthetic pseudo rollouts to broaden exploration in the state space and to tackle the challenges brought by lack of ground-truth system data under a teacher-student model. TS-NODE employs a unified optimization framework that corrects the teacher model based on the student's feedback while mitigating the potential false system dynamics present in pseudo rollouts. TS-NODE demonstrates significant performance improvements over a baseline Neural ODE model on multiple dynamical system modeling tasks.
Submitted 19 October, 2023;
originally announced October 2023.
-
Uncertainty Aware Deep Learning for Particle Accelerators
Authors:
Kishansingh Rajput,
Malachi Schram,
Karthik Somayaji
Abstract:
Standard deep learning models for classification and regression applications are ideal for capturing complex system dynamics. However, their predictions can be arbitrarily inaccurate when the input samples are not similar to the training data. Distance-aware uncertainty estimation can be used to detect these scenarios and provide a level of confidence associated with the predictions. In this paper, we present results from using Deep Gaussian Process Approximation (DGPA) methods for errant beam prediction at the Spallation Neutron Source (SNS) accelerator (classification), and we provide an uncertainty-aware surrogate model for the Fermi National Accelerator Lab (FNAL) Booster Accelerator Complex (regression).
Submitted 25 September, 2023;
originally announced September 2023.
-
Extreme Risk Mitigation in Reinforcement Learning using Extreme Value Theory
Authors:
Karthik Somayaji NS,
Yu Wang,
Malachi Schram,
Jan Drgona,
Mahantesh Halappanavar,
Frank Liu,
Peng Li
Abstract:
Risk-sensitive reinforcement learning (RL) has garnered significant attention in recent years due to the growing interest in deploying RL agents in real-world scenarios. A critical aspect of risk awareness involves modeling highly rare risk events (rewards) that could potentially lead to catastrophic outcomes. These infrequent occurrences present a formidable challenge for data-driven methods aiming to capture such risky events accurately. While risk-aware RL techniques do exist, their level of risk aversion heavily relies on the precision of the state-action value function estimation when modeling these rare occurrences. Our work proposes to enhance the resilience of RL agents when faced with very rare and risky events by focusing on refining the predictions of the extreme values predicted by the state-action value function distribution. To achieve this, we formulate the extreme values of the state-action value function distribution as parameterized distributions, drawing inspiration from the principles of extreme value theory (EVT). This approach effectively addresses the issue of infrequent occurrence by leveraging EVT-based parameterization. Importantly, we theoretically demonstrate the advantages of employing these parameterized distributions in contrast to other risk-averse algorithms. Our evaluations show that the proposed method outperforms other risk-averse RL algorithms on a diverse range of benchmark tasks, each encompassing distinct risk scenarios.
Submitted 24 August, 2023;
originally announced August 2023.
-
A comparison of machine learning surrogate models of street-scale flooding in Norfolk, Virginia
Authors:
Diana McSpadden,
Steven Goldenberg,
Binata Roy,
Malachi Schram,
Jonathan L. Goodall,
Heather Richter
Abstract:
Low-lying coastal cities, exemplified by Norfolk, Virginia, face the challenge of street flooding caused by rainfall and tides, which strain transportation and sewer systems and can lead to property damage. While high-fidelity, physics-based simulations provide accurate predictions of urban pluvial flooding, their computational complexity renders them unsuitable for real-time applications. Using data from Norfolk rainfall events between 2016 and 2018, this study compares the performance of a previous surrogate model based on a random forest algorithm with two deep learning models: Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU). This investigation underscores the importance of using a model architecture that supports the communication of prediction uncertainty and the effective integration of relevant, multi-modal features.
Submitted 26 July, 2023;
originally announced July 2023.
-
Artificial Intelligence for the Electron Ion Collider (AI4EIC)
Authors:
C. Allaire,
R. Ammendola,
E. -C. Aschenauer,
M. Balandat,
M. Battaglieri,
J. Bernauer,
M. Bondì,
N. Branson,
T. Britton,
A. Butter,
I. Chahrour,
P. Chatagnon,
E. Cisbani,
E. W. Cline,
S. Dash,
C. Dean,
W. Deconinck,
A. Deshpande,
M. Diefenthaler,
R. Ent,
C. Fanelli,
M. Finger,
M. Finger, Jr.,
E. Fol,
S. Furletov
, et al. (70 additional authors not shown)
Abstract:
The Electron-Ion Collider (EIC), a state-of-the-art facility for studying the strong force, is expected to begin commissioning its first experiments in 2028. This is an opportune time for artificial intelligence (AI) to be included from the start at this facility and in all phases that lead up to the experiments. The second annual workshop organized by the AI4EIC working group, which recently took place, centered on exploring all current and prospective application areas of AI for the EIC. This workshop is not only beneficial for the EIC, but also provides valuable insights for the newly established ePIC collaboration at EIC. This paper summarizes the different activities and R&D projects covered across the sessions of the workshop and provides an overview of the goals, approaches and strategies regarding AI/ML in the EIC community, as well as cutting-edge techniques currently studied in other experiments.
Submitted 17 July, 2023;
originally announced July 2023.
-
Distance Preserving Machine Learning for Uncertainty Aware Accelerator Capacitance Predictions
Authors:
Steven Goldenberg,
Malachi Schram,
Kishansingh Rajput,
Thomas Britton,
Chris Pappas,
Dan Lu,
Jared Walden,
Majdi I. Radaideh,
Sarah Cousineau,
Sudarshan Harave
Abstract:
Providing accurate uncertainty estimations is essential for producing reliable machine learning models, especially in safety-critical applications such as accelerator systems. Gaussian process models are generally regarded as the gold standard method for this task, but they can struggle with large, high-dimensional datasets. Combining deep neural networks with Gaussian process approximation techniques has shown promising results, but dimensionality reduction through standard deep neural network layers is not guaranteed to maintain the distance information necessary for Gaussian process models. We build on previous work by comparing the use of the singular value decomposition against a spectral-normalized dense layer as a feature extractor for a deep neural Gaussian process approximation model and apply it to a capacitance prediction problem for the High Voltage Converter Modulators in the Oak Ridge Spallation Neutron Source. Our model shows improved distance preservation and predicts in-distribution capacitance values with less than 1% error.
Submitted 5 July, 2023;
originally announced July 2023.
-
Multi-module based CVAE to predict HVCM faults in the SNS accelerator
Authors:
Yasir Alanazi,
Malachi Schram,
Kishansingh Rajput,
Steven Goldenberg,
Lasitha Vidyaratne,
Chris Pappas,
Majdi I. Radaideh,
Dan Lu,
Pradeep Ramuhalli,
Sarah Cousineau
Abstract:
We present a multi-module framework based on a Conditional Variational Autoencoder (CVAE) to detect anomalies in the power signals coming from multiple High Voltage Converter Modulators (HVCMs). We condition the model with the specific modulator type to capture different representations of the normal waveforms and to improve the sensitivity of the model to identify a specific type of fault when we have limited samples for a given module type. We studied several neural network (NN) architectures for our CVAE model and evaluated the model performance by looking at their loss landscape for stability and generalization. Our results for the Spallation Neutron Source (SNS) experimental data show that the trained model generalizes well to detecting multiple fault types for several HVCM module types. The results of this study can be used to improve the HVCM reliability and overall SNS uptime.
Submitted 20 April, 2023;
originally announced April 2023.
-
Machine Learning in Nuclear Physics
Authors:
Amber Boehnlein,
Markus Diefenthaler,
Cristiano Fanelli,
Morten Hjorth-Jensen,
Tanja Horn,
Michelle P. Kuchera,
Dean Lee,
Witold Nazarewicz,
Kostas Orginos,
Peter Ostroumov,
Long-Gang Pang,
Alan Poon,
Nobuo Sato,
Malachi Schram,
Alexander Scheinker,
Michael S. Smith,
Xin-Nian Wang,
Veronique Ziegler
Abstract:
Advances in machine learning methods provide tools that have broad applicability in scientific research. These techniques are being applied across the diversity of nuclear physics research topics, leading to advances that will facilitate scientific discoveries and societal applications.
This Review gives a snapshot of nuclear physics research which has been transformed by machine learning techniques.
Submitted 2 May, 2022; v1 submitted 4 December, 2021;
originally announced December 2021.
-
Uncertainty aware anomaly detection to predict errant beam pulses in the SNS accelerator
Authors:
Willem Blokland,
Pradeep Ramuhalli,
Charles Peters,
Yigit Yucesan,
Alexander Zhukov,
Malachi Schram,
Kishansingh Rajput,
Torri Jeske
Abstract:
High-power particle accelerators are complex machines with thousands of pieces of equipment that are frequently running at the cutting edge of technology. In order to improve the day-to-day operations and maximize the delivery of the science, new analytical techniques are being explored for anomaly detection, classification, and prognostications. As such, we describe the application of an uncertainty aware Machine Learning method, the Siamese neural network model, to predict upcoming errant beam pulses using the data from a single monitoring device. By predicting the upcoming failure, we can stop the accelerator before damage occurs. We describe the accelerator operation, related Machine Learning research, the prediction performance required to abort beam while maintaining operations, the monitoring device and its data, and the Siamese method and its results. These results show that the researched method can be applied to improve accelerator operations.
Submitted 22 October, 2021;
originally announced October 2021.
-
Direct Classification of Type 2 Diabetes From Retinal Fundus Images in a Population-based Sample From The Maastricht Study
Authors:
Friso G. Heslinga,
Josien P. W. Pluim,
A. J. H. M. Houben,
Miranda T. Schram,
Ronald M. A. Henry,
Coen D. A. Stehouwer,
Marleen J. van Greevenbroek,
Tos T. J. M. Berendschot,
Mitko Veta
Abstract:
Type 2 Diabetes (T2D) is a chronic metabolic disorder that can lead to blindness and cardiovascular disease. Information about early stage T2D might be present in retinal fundus images, but to what extent these images can be used for a screening setting is still unknown. In this study, deep neural networks were employed to differentiate between fundus images from individuals with and without T2D. We investigated three methods to achieve high classification performance, measured by the area under the receiver operating curve (ROC-AUC). A multi-target learning approach to simultaneously output retinal biomarkers as well as T2D works best (AUC = 0.746 [$\pm$0.001]). Furthermore, the classification performance can be improved when images with high prediction uncertainty are referred to a specialist. We also show that the combination of images of the left and right eye per individual can further improve the classification performance (AUC = 0.758 [$\pm$0.003]), using a simple averaging approach. The results are promising, suggesting the feasibility of screening for T2D from retinal fundus images.
Submitted 22 November, 2019;
originally announced November 2019.
-
Deep Learning on Operational Facility Data Related to Large-Scale Distributed Area Scientific Workflows
Authors:
Alok Singh,
Eric Stephan,
Malachi Schram,
Ilkay Altintas
Abstract:
Distributed computing platforms provide a robust mechanism to perform large-scale computations by splitting the task and data among multiple locations, possibly located thousands of miles apart geographically. Although such distribution of resources can lead to benefits, it also comes with its associated problems such as rampant duplication of file transfers increasing congestion, long job completion times, unexpected site crashing, suboptimal data transfer rates, unpredictable reliability in a time range, and suboptimal usage of storage elements. In addition, each sub-system becomes a potential failure node that can trigger system wide disruptions. In this vision paper, we outline our approach to leveraging Deep Learning algorithms to discover solutions to unique problems that arise in a system with computational infrastructure that is spread over a wide area. The presented vision, motivated by a real scientific use case from Belle II experiments, is to develop multilayer neural networks to tackle forecasting, anomaly detection and optimization challenges in a complex and distributed data movement environment. Through this vision based on Deep Learning principles, we aim to achieve reduced congestion events, faster file transfer rates, and enhanced site reliability.
Submitted 20 April, 2018; v1 submitted 17 April, 2018;
originally announced April 2018.