
Showing 1–50 of 68 results for author: Balaprakash, P

  1. arXiv:2406.12909  [pdf, other]

    cs.LG physics.comp-ph

    Scalable Training of Graph Foundation Models for Atomistic Materials Modeling: A Case Study with HydraGNN

    Authors: Massimiliano Lupo Pasini, Jong Youl Choi, Kshitij Mehta, Pei Zhang, David Rogers, Jonghyun Bae, Khaled Z. Ibrahim, Ashwin M. Aji, Karl W. Schulz, Jorda Polo, Prasanna Balaprakash

    Abstract: We present our work on developing and training scalable graph foundation models (GFM) using HydraGNN, a multi-headed graph convolutional neural network architecture. HydraGNN expands the boundaries of graph neural network (GNN) in both training scale and data diversity. It abstracts over message passing algorithms, allowing both reproduction of and comparison across algorithmic innovations that de…

    Submitted 28 June, 2024; v1 submitted 12 June, 2024; originally announced June 2024.

    Comments: 16 pages, 13 figures

    MSC Class: 68T07; 68T09 ACM Class: C.2.4; I.2.11

  2. arXiv:2405.15780  [pdf, other]

    cs.CV cs.LG

    Sequence Length Scaling in Vision Transformers for Scientific Images on Frontier

    Authors: Aristeidis Tsaris, Chengming Zhang, Xiao Wang, Junqi Yin, Siyan Liu, Moetasim Ashfaq, Ming Fan, Jong Youl Choi, Mohamed Wahib, Dan Lu, Prasanna Balaprakash, Feiyi Wang

    Abstract: Vision Transformers (ViTs) are pivotal for foundational models in scientific imagery, including Earth science applications, due to their capability to process large sequence lengths. While transformers for text have inspired scaling sequence lengths in ViTs, adapting these for ViTs introduces unique challenges. We develop distributed sequence parallelism for ViTs, enabling them to handle up to…

    Submitted 17 April, 2024; originally announced May 2024.

  3. arXiv:2405.10389  [pdf, other]

    eess.SY cs.LG

    Physics-Informed Heterogeneous Graph Neural Networks for DC Blocker Placement

    Authors: Hongwei Jin, Prasanna Balaprakash, Allen Zou, Pieter Ghysels, Aditi S. Krishnapriyan, Adam Mate, Arthur Barnes, Russell Bent

    Abstract: The threat of geomagnetic disturbances (GMDs) to the reliable operation of the bulk energy system has spurred the development of effective strategies for mitigating their impacts. One such approach involves placing transformer neutral blocking devices, which interrupt the path of geomagnetically induced currents (GICs) to limit their impact. The high cost of these devices and the sparsity of trans…

    Submitted 16 May, 2024; originally announced May 2024.

    Comments: Paper is accepted by PSCC 2024

  4. arXiv:2405.06133  [pdf, other]

    cs.DC

    Advancing Anomaly Detection in Computational Workflows with Active Learning

    Authors: Krishnan Raghavan, George Papadimitriou, Hongwei Jin, Anirban Mandal, Mariam Kiran, Prasanna Balaprakash, Ewa Deelman

    Abstract: A computational workflow, also known as workflow, consists of tasks that are executed in a certain order to attain a specific computational campaign. Computational workflows are commonly employed in science domains, such as physics, chemistry, genomics, to complete large-scale experiments in distributed and heterogeneous computing environments. However, running computations at such a large scale m…

    Submitted 9 May, 2024; originally announced May 2024.

  5. arXiv:2404.14712  [pdf, other]

    physics.ao-ph cs.AI cs.DC eess.IV physics.geo-ph

    ORBIT: Oak Ridge Base Foundation Model for Earth System Predictability

    Authors: Xiao Wang, Aristeidis Tsaris, Siyan Liu, Jong-Youl Choi, Ming Fan, Wei Zhang, Junqi Yin, Moetasim Ashfaq, Dan Lu, Prasanna Balaprakash

    Abstract: Earth system predictability is challenged by the complexity of environmental dynamics and the multitude of variables involved. Current AI foundation models, although advanced by leveraging large and heterogeneous data, are often constrained by their size and data integration, limiting their effectiveness in addressing the full range of Earth system prediction challenges. To overcome these limitati…

    Submitted 22 April, 2024; originally announced April 2024.

  6. arXiv:2404.10689  [pdf, other]

    cs.LG eess.SP

    Network architecture search of X-ray based scientific applications

    Authors: Adarsha Balaji, Ramyad Hadidi, Gregory Kollmer, Mohammed E. Fouda, Prasanna Balaprakash

    Abstract: X-ray and electron diffraction-based microscopy use Bragg peak detection and ptychography to perform 3-D imaging at an atomic resolution. Typically, these techniques are implemented using computationally complex tasks such as a Pseudo-Voigt function or solving a complex inverse problem. Recently, the use of deep neural networks has improved the existing state-of-the-art approaches. However, the de…

    Submitted 16 April, 2024; originally announced April 2024.

  7. arXiv:2404.09703  [pdf, other]

    cs.LG stat.ML

    AI Competitions and Benchmarks: Dataset Development

    Authors: Romain Egele, Julio C. S. Jacques Junior, Jan N. van Rijn, Isabelle Guyon, Xavier Baró, Albert Clapés, Prasanna Balaprakash, Sergio Escalera, Thomas Moeslund, Jun Wan

    Abstract: Machine learning is now used in many applications thanks to its ability to predict, generate, or discover patterns from large quantities of data. However, the process of collecting and transforming data for practical use is intricate. Even in today's digital era, where substantial data is generated daily, it is uncommon for it to be readily usable; most often, it necessitates meticulous manual dat…

    Submitted 15 April, 2024; originally announced April 2024.

    Comments: Preprint version of the 3rd Chapter of the book: Competitions and Benchmarks, the science behind the contests (https://sites.google.com/chalearn.org/book/home)

  8. arXiv:2404.05768  [pdf, other]

    cs.LG physics.ao-ph stat.ML

    Streamlining Ocean Dynamics Modeling with Fourier Neural Operators: A Multiobjective Hyperparameter and Architecture Optimization Approach

    Authors: Yixuan Sun, Ololade Sowunmi, Romain Egele, Sri Hari Krishna Narayanan, Luke Van Roekel, Prasanna Balaprakash

    Abstract: Training an effective deep learning model to learn ocean processes involves careful choices of various hyperparameters. We leverage the advanced search algorithms for multiobjective optimization in DeepHyper, a scalable hyperparameter optimization software, to streamline the development of neural networks tailored for ocean modeling. The focus is on optimizing Fourier neural operators (FNOs), a da…

    Submitted 10 April, 2024; v1 submitted 7 April, 2024; originally announced April 2024.

  9. arXiv:2404.04111  [pdf, other]

    cs.LG

    The Unreasonable Effectiveness Of Early Discarding After One Epoch In Neural Network Hyperparameter Optimization

    Authors: Romain Egele, Felix Mohr, Tom Viering, Prasanna Balaprakash

    Abstract: To reach high performance with deep learning, hyperparameter optimization (HPO) is essential. This process is usually time-consuming due to costly evaluations of neural networks. Early discarding techniques limit the resources granted to unpromising candidates by observing the empirical learning curves and canceling neural network training as soon as the lack of competitiveness of a candidate beco…

    Submitted 5 April, 2024; originally announced April 2024.

  10. arXiv:2402.09222  [pdf, other]

    cs.PF

    Integrating ytopt and libEnsemble to Autotune OpenMC

    Authors: Xingfu Wu, John R. Tramm, Jeffrey Larson, John-Luke Navarro, Prasanna Balaprakash, Brice Videau, Michael Kruse, Paul Hovland, Valerie Taylor, Mary Hall

    Abstract: ytopt is a Python machine-learning-based autotuning software package developed within the ECP PROTEAS-TUNE project. The ytopt software adopts an asynchronous search framework that consists of sampling a small number of input parameter configurations and progressively fitting a surrogate model over the input-output space until exhausting the user-defined maximum number of evaluations or the wall-cl…

    Submitted 14 February, 2024; originally announced February 2024.

  11. Transfer-Learning-Based Autotuning Using Gaussian Copula

    Authors: Thomas Randall, Jaehoon Koo, Brice Videau, Michael Kruse, Xingfu Wu, Paul Hovland, Mary Hall, Rong Ge, Prasanna Balaprakash

    Abstract: As diverse high-performance computing (HPC) systems are built, many opportunities arise for applications to solve larger problems than ever before. Given the significantly increased complexity of these HPC systems and application tuning, empirical performance tuning, such as autotuning, has emerged as a promising approach in recent years. Despite its effectiveness, autotuning is often a computatio…

    Submitted 9 January, 2024; originally announced January 2024.

    Comments: 13 pages, 5 figures, 7 tables, the definitive version of this work is published in the Proceedings of the ACM International Conference on Supercomputing 2023, available at https://dl.acm.org/doi/10.1145/3577193.3593712

    ACM Class: I.2.4; G.3; D.2.8

    Journal ref: Proceedings of the 37th International Conference on Supercomputing (2023) 37-49

  12. arXiv:2312.12705  [pdf, other]

    cs.DC cs.AI

    Optimizing Distributed Training on Frontier for Large Language Models

    Authors: Sajal Dash, Isaac Lyngaas, Junqi Yin, Xiao Wang, Romain Egele, Guojing Cong, Feiyi Wang, Prasanna Balaprakash

    Abstract: Large language models (LLMs) have demonstrated remarkable success as foundational models, benefiting various downstream applications through fine-tuning. Recent studies on loss scaling have demonstrated the superior performance of larger LLMs compared to their smaller counterparts. Nevertheless, training LLMs with billions of parameters poses significant challenges and requires considerable comput…

    Submitted 21 December, 2023; v1 submitted 19 December, 2023; originally announced December 2023.

    Comments: Edited the abstract to better communicate the scope of the work

  13. arXiv:2310.04610  [pdf, other]

    cs.AI cs.LG

    DeepSpeed4Science Initiative: Enabling Large-Scale Scientific Discovery through Sophisticated AI System Technologies

    Authors: Shuaiwen Leon Song, Bonnie Kruft, Minjia Zhang, Conglong Li, Shiyang Chen, Chengming Zhang, Masahiro Tanaka, Xiaoxia Wu, Jeff Rasley, Ammar Ahmad Awan, Connor Holmes, Martin Cai, Adam Ghanem, Zhongzhu Zhou, Yuxiong He, Pete Luferenko, Divya Kumar, Jonathan Weyn, Ruixiong Zhang, Sylwester Klocek, Volodymyr Vragov, Mohammed AlQuraishi, Gustaf Ahdritz, Christina Floristean, Cristina Negri, et al. (67 additional authors not shown)

    Abstract: In the upcoming decade, deep learning may revolutionize the natural sciences, enhancing our capacity to model and predict natural occurrences. This could herald a new era of scientific exploration, bringing significant advancements across sectors from drug development to renewable energy. To answer this call, we present DeepSpeed4Science initiative (deepspeed4science.ai) which aims to build unique…

    Submitted 11 October, 2023; v1 submitted 6 October, 2023; originally announced October 2023.

  14. arXiv:2310.01247  [pdf, other]

    cs.LG cs.DC

    Self-supervised Learning for Anomaly Detection in Computational Workflows

    Authors: Hongwei Jin, Krishnan Raghavan, George Papadimitriou, Cong Wang, Anirban Mandal, Ewa Deelman, Prasanna Balaprakash

    Abstract: Anomaly detection is the task of identifying abnormal behavior of a system. Anomaly detection in computational workflows is of special interest because of its wide implications in various domains such as cybersecurity, finance, and social networks. However, anomaly detection in computational workflows (often modeled as graphs) is a relatively unexplored problem and poses distinct challenges. For i…

    Submitted 2 October, 2023; originally announced October 2023.

  15. arXiv:2309.14936  [pdf, other]

    cs.LG cs.DC

    Parallel Multi-Objective Hyperparameter Optimization with Uniform Normalization and Bounded Objectives

    Authors: Romain Egele, Tyler Chang, Yixuan Sun, Venkatram Vishwanath, Prasanna Balaprakash

    Abstract: Machine learning (ML) methods offer a wide range of configurable hyperparameters that have a significant influence on their performance. While accuracy is a commonly used performance objective, in many settings, it is not sufficient. Optimizing the ML models with respect to multiple objectives such as accuracy, confidence, fairness, calibration, privacy, latency, and memory consumption is becoming…

    Submitted 26 September, 2023; originally announced September 2023.

    Comments: Preprint with appendices

  16. arXiv:2309.07103  [pdf, other]

    cs.SE cs.AI cs.DC cs.PL

    Comparing Llama-2 and GPT-3 LLMs for HPC kernels generation

    Authors: Pedro Valero-Lara, Alexis Huante, Mustafa Al Lail, William F. Godoy, Keita Teranishi, Prasanna Balaprakash, Jeffrey S. Vetter

    Abstract: We evaluate the use of the open-source Llama-2 model for generating well-known, high-performance computing kernels (e.g., AXPY, GEMV, GEMM) on different parallel programming models and languages (e.g., C++: OpenMP, OpenMP Offload, OpenACC, CUDA, HIP; Fortran: OpenMP, OpenMP Offload, OpenACC; Python: numpy, Numba, pyCUDA, cuPy; and Julia: Threads, CUDA.jl, AMDGPU.jl). We built upon our previous wor…

    Submitted 11 September, 2023; originally announced September 2023.

    Comments: Accepted at LCPC 2023, The 36th International Workshop on Languages and Compilers for Parallel Computing http://www.lcpcworkshop.org/LCPC23/ . 13 pages, 5 figures, 1 table

  17. arXiv:2308.04539  [pdf, other]

    cs.LG cs.AI cs.NE

    Improving Performance in Continual Learning Tasks using Bio-Inspired Architectures

    Authors: Sandeep Madireddy, Angel Yanguas-Gil, Prasanna Balaprakash

    Abstract: The ability to learn continuously from an incoming data stream without catastrophic forgetting is critical to designing intelligent systems. Many approaches to continual learning rely on stochastic gradient descent and its variants that employ global error updates, and hence need to adopt strategies such as memory buffers or replay to circumvent its stability, greed, and short-term memory limitati…

    Submitted 8 August, 2023; originally announced August 2023.

  18. arXiv:2307.15422  [pdf, other]

    cs.LG

    Is One Epoch All You Need For Multi-Fidelity Hyperparameter Optimization?

    Authors: Romain Egele, Isabelle Guyon, Yixuan Sun, Prasanna Balaprakash

    Abstract: Hyperparameter optimization (HPO) is crucial for fine-tuning machine learning models but can be computationally expensive. To reduce costs, Multi-fidelity HPO (MF-HPO) leverages intermediate accuracy levels in the learning process and discards low-performing models early on. We compared various representative MF-HPO methods against a simple baseline on classical benchmark data. The baseline involv…

    Submitted 26 September, 2023; v1 submitted 28 July, 2023; originally announced July 2023.

    Comments: 5 pages, with extended appendices

  19. arXiv:2307.10438  [pdf]

    cs.LG physics.chem-ph q-bio.BM

    Uncertainty Quantification for Molecular Property Predictions with Graph Neural Architecture Search

    Authors: Shengli Jiang, Shiyi Qin, Reid C. Van Lehn, Prasanna Balaprakash, Victor M. Zavala

    Abstract: Graph Neural Networks (GNNs) have emerged as a prominent class of data-driven methods for molecular property prediction. However, a key limitation of typical GNN models is their inability to quantify uncertainties in the predictions. This capability is crucial for ensuring the trustworthy use and deployment of models in downstream tasks. To that end, we introduce AutoGNNUQ, an automated uncertaint…

    Submitted 28 June, 2024; v1 submitted 19 July, 2023; originally announced July 2023.

  20. arXiv:2306.15121  [pdf, other]

    cs.AI cs.ET cs.PL

    Evaluation of OpenAI Codex for HPC Parallel Programming Models Kernel Generation

    Authors: William F. Godoy, Pedro Valero-Lara, Keita Teranishi, Prasanna Balaprakash, Jeffrey S. Vetter

    Abstract: We evaluate AI-assisted generative capabilities on fundamental numerical kernels in high-performance computing (HPC), including AXPY, GEMV, GEMM, SpMV, Jacobi Stencil, and CG. We test the generated kernel codes for a variety of language-supported programming models, including (1) C++ (e.g., OpenMP [including offload], OpenACC, Kokkos, SyCL, CUDA, and HIP), (2) Fortran (e.g., OpenMP [including offl…

    Submitted 26 June, 2023; originally announced June 2023.

    Comments: Accepted at the Sixteenth International Workshop on Parallel Programming Models and Systems Software for High-End Computing (P2S2), 2023 to be held in conjunction with ICPP 2023: The 52nd International Conference on Parallel Processing. 10 pages, 6 figures, 5 tables

  21. arXiv:2306.09930  [pdf, other]

    cs.DC

    Flow-Bench: A Dataset for Computational Workflow Anomaly Detection

    Authors: George Papadimitriou, Hongwei Jin, Cong Wang, Rajiv Mayani, Krishnan Raghavan, Anirban Mandal, Prasanna Balaprakash, Ewa Deelman

    Abstract: A computational workflow, also known as workflow, consists of tasks that must be executed in a specific order to attain a specific goal. Often, in fields such as biology, chemistry, physics, and data science, among others, these workflows are complex and are executed in large-scale, distributed, and heterogeneous computing environments prone to failures and performance degradation. Therefore, anom…

    Submitted 13 June, 2024; v1 submitted 16 June, 2023; originally announced June 2023.

    Comments: Work under review, updated with more workflow data

  22. arXiv:2305.12030  [pdf, other]

    cs.LG cs.AI math.OC

    Learning Continually on a Sequence of Graphs -- The Dynamical System Way

    Authors: Krishnan Raghavan, Prasanna Balaprakash

    Abstract: Continual learning (CL) is a field concerned with learning a series of inter-related tasks, with the tasks typically defined in the sense of either regression or classification. In recent years, CL has been studied extensively when these tasks are defined using Euclidean data -- data, such as images, that can be described by a set of vectors in an n-dimensional real space. However, the literature is…

    Submitted 19 May, 2023; originally announced May 2023.

  23. arXiv:2303.16869  [pdf, other]

    cs.CE cs.LG math.NA

    Application of probabilistic modeling and automated machine learning framework for high-dimensional stress field

    Authors: Lele Luan, Nesar Ramachandra, Sandipp Krishnan Ravi, Anindya Bhaduri, Piyush Pandita, Prasanna Balaprakash, Mihai Anitescu, Changjie Sun, Liping Wang

    Abstract: Modern computational methods, involving highly sophisticated mathematical formulations, enable several tasks like modeling complex physical phenomenon, predicting key properties and design optimization. The higher fidelity in these computer models makes it computationally intensive to query them hundreds of times for optimization and one usually relies on a simplified model albeit at the cost of l…

    Submitted 11 April, 2023; v1 submitted 15 March, 2023; originally announced March 2023.

    Comments: 17 pages, 16 figures, IDETC Conference Submission

  24. arXiv:2303.16245  [pdf, other]

    cs.DC cs.LG cs.PF

    ytopt: Autotuning Scientific Applications for Energy Efficiency at Large Scales

    Authors: Xingfu Wu, Prasanna Balaprakash, Michael Kruse, Jaehoon Koo, Brice Videau, Paul Hovland, Valerie Taylor, Brad Geltz, Siddhartha Jana, Mary Hall

    Abstract: As we enter the exascale computing era, efficiently utilizing power and optimizing the performance of scientific applications under power and energy constraints has become critical and challenging. We propose a low-overhead autotuning framework to autotune performance and energy for various hybrid MPI/OpenMP scientific applications at large scales and to explore the tradeoffs between application r…

    Submitted 28 March, 2023; originally announced March 2023.

    Journal ref: to be published in CUG2023

  25. arXiv:2302.09748  [pdf, other]

    cs.LG math.DS

    Quantifying uncertainty for deep learning based forecasting and flow-reconstruction using neural architecture search ensembles

    Authors: Romit Maulik, Romain Egele, Krishnan Raghavan, Prasanna Balaprakash

    Abstract: Classical problems in computational physics such as data-driven forecasting and signal reconstruction from sparse sensors have recently seen an explosion in deep neural network (DNN) based algorithmic approaches. However, most DNN models do not provide uncertainty estimates, which are crucial for establishing the trustworthiness of these techniques in downstream decision making tasks and scenarios…

    Submitted 19 February, 2023; originally announced February 2023.

  26. arXiv:2302.01887  [pdf, other]

    cs.LG

    Analyzing the impact of climate change on critical infrastructure from the scientific literature: A weakly supervised NLP approach

    Authors: Tanwi Mallick, Joshua David Bergerson, Duane R. Verner, John K Hutchison, Leslie-Anne Levy, Prasanna Balaprakash

    Abstract: Natural language processing (NLP) is a promising approach for analyzing large volumes of climate-change and infrastructure-related scientific literature. However, best-in-practice NLP techniques require large collections of relevant documents (corpus). Furthermore, NLP techniques using machine learning and deep learning techniques require labels grouping the articles based on user-defined criteria…

    Submitted 5 February, 2023; v1 submitted 3 February, 2023; originally announced February 2023.

  27. arXiv:2210.04083  [pdf, other]

    cs.LG cs.AI

    Unified Probabilistic Neural Architecture and Weight Ensembling Improves Model Robustness

    Authors: Sumegha Premchandar, Sandeep Madireddy, Sanket Jantre, Prasanna Balaprakash

    Abstract: Robust machine learning models with accurately calibrated uncertainties are crucial for safety-critical applications. Probabilistic machine learning and especially the Bayesian formalism provide a systematic framework to incorporate robustness through the distributional estimates and reason about uncertainty. Recent works have shown that approximate inference approaches that take the weight space…

    Submitted 8 October, 2022; originally announced October 2022.

  28. HPC Storage Service Autotuning Using Variational-Autoencoder-Guided Asynchronous Bayesian Optimization

    Authors: Matthieu Dorier, Romain Egele, Prasanna Balaprakash, Jaehoon Koo, Sandeep Madireddy, Srinivasan Ramesh, Allen D. Malony, Rob Ross

    Abstract: Distributed data storage services tailored to specific applications have grown popular in the high-performance computing (HPC) community as a way to address I/O and storage challenges. These services offer a variety of specific interfaces, semantics, and data representations. They also expose many tuning parameters, making it difficult for their users to find the best configuration for a given wor…

    Submitted 3 October, 2022; originally announced October 2022.

    Comments: Accepted at IEEE Cluster 2022

  29. arXiv:2209.13123  [pdf, other]

    cs.LG

    Explainable Graph Pyramid Autoformer for Long-Term Traffic Forecasting

    Authors: Weiheng Zhong, Tanwi Mallick, Hadi Meidani, Jane Macfarlane, Prasanna Balaprakash

    Abstract: Accurate traffic forecasting is vital to an intelligent transportation system. Although many deep learning models have achieved state-of-art performance for short-term traffic forecasting of up to 1 hour, long-term traffic forecasting that spans multiple hours remains a major challenge. Moreover, most of the existing deep learning traffic forecasting models are black box, presenting additional cha…

    Submitted 26 September, 2022; originally announced September 2022.

  30. arXiv:2207.00479  [pdf, other]

    cs.LG

    Asynchronous Decentralized Bayesian Optimization for Large Scale Hyperparameter Optimization

    Authors: Romain Egele, Isabelle Guyon, Venkatram Vishwanath, Prasanna Balaprakash

    Abstract: Bayesian optimization (BO) is a promising approach for hyperparameter optimization of deep neural networks (DNNs), where each model training can take minutes to hours. In BO, a computationally cheap surrogate model is employed to learn the relationship between parameter configurations and their performance such as accuracy. Parallel BO methods often adopt single manager/multiple workers strategies…

    Submitted 26 September, 2023; v1 submitted 1 July, 2022; originally announced July 2022.

  31. arXiv:2206.05165  [pdf, other]

    cs.LG cs.AI

    Multifidelity Reinforcement Learning with Control Variates

    Authors: Sami Khairy, Prasanna Balaprakash

    Abstract: In many computational science and engineering applications, the output of a system of interest corresponding to a given input can be queried at different levels of fidelity with different costs. Typically, low-fidelity data is cheap and abundant, while high-fidelity data is expensive and scarce. In this work we study the reinforcement learning (RL) problem in the presence of multiple environments…

    Submitted 10 June, 2022; originally announced June 2022.

    Comments: Preprint. Under review

  32. arXiv:2206.00794  [pdf, other]

    stat.ML cs.LG math.ST

    Sequential Bayesian Neural Subnetwork Ensembles

    Authors: Sanket Jantre, Sandeep Madireddy, Shrijita Bhattacharya, Tapabrata Maiti, Prasanna Balaprakash

    Abstract: Deep neural network ensembles that appeal to model diversity have been used successfully to improve predictive performance and model robustness in several applications. Whereas, it has recently been shown that sparse subnetworks of dense models can match the performance of their dense counterparts and increase their robustness while effectively decreasing the model complexity. However, most ensemb…

    Submitted 1 June, 2022; originally announced June 2022.

  33. arXiv:2204.08180  [pdf, other]

    cs.DC cs.PF

    A Taxonomy of Error Sources in HPC I/O Machine Learning Models

    Authors: Mihailo Isakov, Mikaela Currier, Eliakin del Rosario, Sandeep Madireddy, Prasanna Balaprakash, Philip Carns, Robert B. Ross, Glenn K. Lockwood, Michel A. Kinsy

    Abstract: I/O efficiency is crucial to productivity in scientific computing, but the increasing complexity of the system and the applications makes it difficult for practitioners to understand and optimize I/O behavior at scale. Data-driven machine learning-based I/O throughput models offer a solution: they can be used to identify bottlenecks, automate I/O tuning, or optimize job scheduling with minimal hum…

    Submitted 18 April, 2022; originally announced April 2022.

    Report number: STAM01

  34. arXiv:2204.01618  [pdf, other]

    cs.LG

    Deep-Ensemble-Based Uncertainty Quantification in Spatiotemporal Graph Neural Networks for Traffic Forecasting

    Authors: Tanwi Mallick, Prasanna Balaprakash, Jane Macfarlane

    Abstract: Deep-learning-based data-driven forecasting methods have produced impressive results for traffic forecasting. A major limitation of these methods, however, is that they provide forecasts without estimates of uncertainty, which are critical for real-time deployments. We focus on a diffusion convolutional recurrent neural network (DCRNN), a state-of-the-art method for short-term traffic forecasting.…

    Submitted 5 April, 2022; v1 submitted 4 April, 2022; originally announced April 2022.

  35. Stabilized Neural Ordinary Differential Equations for Long-Time Forecasting of Dynamical Systems

    Authors: Alec J. Linot, Joshua W. Burby, Qi Tang, Prasanna Balaprakash, Michael D. Graham, Romit Maulik

    Abstract: In data-driven modeling of spatiotemporal phenomena careful consideration often needs to be made in capturing the dynamics of the high wavenumbers. This problem becomes especially challenging when the system of interest exhibits shocks or chaotic dynamics. We present a data-driven modeling method that accurately captures shocks and chaotic dynamics by proposing a novel architecture, stabilized neu…

    Submitted 3 October, 2022; v1 submitted 29 March, 2022; originally announced March 2022.

  36. arXiv:2203.02592  [pdf, other]

    stat.ML cs.LG stat.ME

    Sparsity-Inducing Categorical Prior Improves Robustness of the Information Bottleneck

    Authors: Anirban Samaddar, Sandeep Madireddy, Prasanna Balaprakash, Tapabrata Maiti, Gustavo de los Campos, Ian Fischer

    Abstract: The information bottleneck framework provides a systematic approach to learning representations that compress nuisance information in the input and extract semantically meaningful information about predictions. However, the choice of a prior distribution that fixes the dimensionality across all the data can restrict the flexibility of this approach for learning robust representations. We present a…

    Submitted 27 October, 2022; v1 submitted 4 March, 2022; originally announced March 2022.

  37. arXiv:2202.11170  [pdf, other]

    cs.LG physics.flu-dyn

    Multi-fidelity reinforcement learning framework for shape optimization

    Authors: Sahil Bhola, Suraj Pawar, Prasanna Balaprakash, Romit Maulik

    Abstract: Deep reinforcement learning (DRL) is a promising outer-loop intelligence paradigm which can deploy problem solving strategies for complex tasks. Consequently, DRL has been utilized for several scientific applications, specifically in cases where classical optimization or control methods are limited. One key limitation of conventional DRL methods is their episode-hungry nature which proves to be a…

    Submitted 22 February, 2022; originally announced February 2022.

  38. arXiv:2112.09792  [pdf, other]

    cs.LG

    A data-centric weak supervised learning for highway traffic incident detection

    Authors: Yixuan Sun, Tanwi Mallick, Prasanna Balaprakash, Jane Macfarlane

    Abstract: Using the data from loop detector sensors for near-real-time detection of traffic incidents in highways is crucial to averting major traffic congestion. While recent supervised machine learning methods offer solutions to incident detection by leveraging human-labeled incident data, the false alarm rate is often too high to be used in practice. Specifically, the inconsistency in the human labeling…

    Submitted 2 August, 2022; v1 submitted 17 December, 2021; originally announced December 2021.

  39. arXiv:2111.10489  [pdf, other]

    math.OC cs.LG

    Modeling Design and Control Problems Involving Neural Network Surrogates

    Authors: Dominic Yang, Prasanna Balaprakash, Sven Leyffer

    Abstract: We consider nonlinear optimization problems that involve surrogate models represented by neural networks. We demonstrate first how to directly embed neural network evaluation into optimization models, highlight a difficulty with this approach that can prevent convergence, and then characterize stationarity of such models. We then present two alternative formulations of these problems in the specif…

    Submitted 19 November, 2021; originally announced November 2021.

    Comments: 24 Pages, 11 Figures

  40. arXiv:2110.13511  [pdf, other]

    cs.LG

    AutoDEUQ: Automated Deep Ensemble with Uncertainty Quantification

    Authors: Romain Egele, Romit Maulik, Krishnan Raghavan, Bethany Lusch, Isabelle Guyon, Prasanna Balaprakash

    Abstract: Deep neural networks are powerful predictors for a variety of tasks. However, they do not capture uncertainty directly. Using neural network ensembles to quantify uncertainty is competitive with approaches based on Bayesian neural networks while benefiting from better computational scalability. However, building ensembles of neural networks is a challenging task because, in addition to choosing th…

    Submitted 4 July, 2022; v1 submitted 26 October, 2021; originally announced October 2021.

  41. arXiv:2109.14053  [pdf, other]

    physics.app-ph cond-mat.mtrl-sci cs.AI cs.CV

    AutoPhaseNN: Unsupervised Physics-aware Deep Learning of 3D Nanoscale Bragg Coherent Diffraction Imaging

    Authors: Yudong Yao, Henry Chan, Subramanian Sankaranarayanan, Prasanna Balaprakash, Ross J. Harder, Mathew J. Cherukara

    Abstract: The problem of phase retrieval, or the algorithmic recovery of lost phase information from measured intensity alone, underlies various imaging methods from astronomy to nanoscale imaging. Traditional methods of phase retrieval are iterative in nature, and are therefore computationally expensive and time consuming. More recently, deep learning (DL) models have been developed to either provide learn…

    Submitted 4 April, 2022; v1 submitted 28 September, 2021; originally announced September 2021.

    MSC Class: 68T07; 00A79

  42. arXiv:2109.14035  [pdf, other]

    cs.LG cs.AI math.DS math.OC

    Formalizing the Generalization-Forgetting Trade-off in Continual Learning

    Authors: Krishnan Raghavan, Prasanna Balaprakash

    Abstract: We formulate the continual learning (CL) problem via dynamic programming and model the trade-off between catastrophic forgetting and generalization as a two-player sequential game. In this approach, player 1 maximizes the cost due to lack of generalization whereas player 2 minimizes the cost due to catastrophic forgetting. We show theoretically that a balance point between the two players exists f…

    Submitted 24 February, 2022; v1 submitted 28 September, 2021; originally announced September 2021.

  43. arXiv:2105.04555  [pdf]

    cs.PL cs.AI cs.DC cs.LG cs.PF

    Customized Monte Carlo Tree Search for LLVM/Polly's Composable Loop Optimization Transformations

    Authors: Jaehoon Koo, Prasanna Balaprakash, Michael Kruse, Xingfu Wu, Paul Hovland, Mary Hall

    Abstract: Polly is the LLVM project's polyhedral loop nest optimizer. Recently, user-directed loop transformation pragmas were proposed based on LLVM/Clang and Polly. The search space exposed by the transformation pragmas is a tree, wherein each node represents a specific combination of loop transformations that can be applied to the code resulting from the parent node's loop transformations. We have develo…

    Submitted 10 May, 2021; originally announced May 2021.

  44. arXiv:2104.13242  [pdf, other]

    cs.LG cs.PF

    Autotuning PolyBench Benchmarks with LLVM Clang/Polly Loop Optimization Pragmas Using Bayesian Optimization (extended version)

    Authors: Xingfu Wu, Michael Kruse, Prasanna Balaprakash, Hal Finkel, Paul Hovland, Valerie Taylor, Mary Hall

    Abstract: In this paper, we develop a ytopt autotuning framework that leverages Bayesian optimization to explore the parameter space search and compare four different supervised learning methods within Bayesian optimization and evaluate their effectiveness. We select six of the most complex PolyBench benchmarks and apply the newly developed LLVM Clang/Polly loop optimization pragmas to the benchmarks to opt…

    Submitted 27 April, 2021; originally announced April 2021.

    Comments: Submitted to CCPE journal. arXiv admin note: substantial text overlap with arXiv:2010.08040

  45. arXiv:2102.08351  [pdf, ps, other]

    math.OC cs.LG

    Learning Symbolic Expressions: Mixed-Integer Formulations, Cuts, and Heuristics

    Authors: Jongeun Kim, Sven Leyffer, Prasanna Balaprakash

    Abstract: In this paper we consider the problem of learning a regression function without assuming its functional form. This problem is referred to as symbolic regression. An expression tree is typically used to represent a solution function, which is determined by assigning operators and operands to the nodes. The symbolic regression problem can be formulated as a nonconvex mixed-integer nonlinear program…

    Submitted 24 February, 2021; v1 submitted 16 February, 2021; originally announced February 2021.

  46. arXiv:2101.00464  [pdf, other]

    cs.NI cs.AI eess.SY

    Data-Driven Random Access Optimization in Multi-Cell IoT Networks with NOMA

    Authors: Sami Khairy, Prasanna Balaprakash, Lin X. Cai, H. Vincent Poor

    Abstract: Non-orthogonal multiple access (NOMA) is a key technology to enable massive machine type communications (mMTC) in 5G networks and beyond. In this paper, NOMA is applied to improve the random access efficiency in high-density spatially-distributed multi-cell wireless IoT networks, where IoT devices contend for accessing the shared wireless channel using an adaptive p-persistent slotted Aloha protoc…

    Submitted 11 March, 2021; v1 submitted 2 January, 2021; originally announced January 2021.

    Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  47. arXiv:2010.16358  [pdf, other]

    cs.LG cs.NE stat.ML

    AgEBO-Tabular: Joint Neural Architecture and Hyperparameter Search with Autotuned Data-Parallel Training for Tabular Data

    Authors: Romain Egele, Prasanna Balaprakash, Venkatram Vishwanath, Isabelle Guyon, Zhengying Liu

    Abstract: Developing high-performing predictive models for large tabular data sets is a challenging task. The state-of-the-art methods are based on expert-developed model ensembles from different supervised learning methods. Recently, automated machine learning (AutoML) is emerging as a promising approach to automate predictive model development. Neural architecture search (NAS) is an AutoML approach that g…

    Submitted 26 October, 2021; v1 submitted 30 October, 2020; originally announced October 2020.

  48. arXiv:2010.08040  [pdf, other]

    cs.PF cs.LG cs.PL

    Autotuning PolyBench Benchmarks with LLVM Clang/Polly Loop Optimization Pragmas Using Bayesian Optimization

    Authors: Xingfu Wu, Michael Kruse, Prasanna Balaprakash, Hal Finkel, Paul Hovland, Valerie Taylor, Mary Hall

    Abstract: An autotuning is an approach that explores a search space of possible implementations/configurations of a kernel or an application by selecting and evaluating a subset of implementations/configurations on a target platform and/or use models to identify a high performance implementation/configuration. In this paper, we develop an autotuning framework that leverages Bayesian optimization to explore…

    Submitted 15 October, 2020; originally announced October 2020.

    Comments: to be published in the 11th International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS20)

  49. arXiv:2008.12767  [pdf, other]

    cs.LG cs.NI eess.SP stat.ML

    Dynamic Graph Neural Network for Traffic Forecasting in Wide Area Networks

    Authors: Tanwi Mallick, Mariam Kiran, Bashir Mohammed, Prasanna Balaprakash

    Abstract: Wide area networking infrastructures (WANs), particularly science and research WANs, are the backbone for moving large volumes of scientific data between experimental facilities and data centers. With demands growing at exponential rates, these networks are struggling to cope with large data volumes, real-time responses, and overall network performance. Network operators are increasingly looking f…

    Submitted 28 August, 2020; originally announced August 2020.

    Comments: 10 Pages, 11 Figures

  50. arXiv:2008.12187  [pdf, other]

    cs.LG q-bio.BM stat.ML

    Graph Neural Network Architecture Search for Molecular Property Prediction

    Authors: Shengli Jiang, Prasanna Balaprakash

    Abstract: Predicting the properties of a molecule from its structure is a challenging task. Recently, deep learning methods have improved the state of the art for this task because of their ability to learn useful features from the given data. By treating molecule structure as graphs, where atoms and bonds are modeled as nodes and edges, graph neural networks (GNNs) have been widely used to predict molecula…

    Submitted 27 August, 2020; originally announced August 2020.