Search | arXiv e-print repository

arXiv:2406.00133 [pdf, other]

Streamflow Prediction with Uncertainty Quantification for Water Management: A Constrained Reasoning and Learning Approach

Authors: Mohammed Amine Gharsallaoui, Bhupinderjeet Singh, Supriya Savalkar, Aryan Deshwal, Yan Yan, Ananth Kalyanaraman, Kirti Rajagopalan, Janardhan Rao Doppa

Abstract: Predicting the spatiotemporal variation in streamflow along with uncertainty quantification enables decision-making for sustainable management of scarce water resources. Process-based hydrological models (aka physics-based models) are based on physical laws, but using simplifying assumptions which can lead to poor accuracy. Data-driven approaches offer a powerful alternative, but they require large amount of training data and tend to produce predictions that are inconsistent with physical laws. This paper studies a constrained reasoning and learning (CRL) approach where physical laws represented as logical constraints are integrated as a layer in the deep neural network. To address small data setting, we develop a theoretically-grounded training approach to improve the generalization accuracy of deep models. For uncertainty quantification, we combine the synergistic strengths of Gaussian processes (GPs) and deep temporal models (i.e., deep models for time-series forecasting) by passing the learned latent representation as input to a standard distance-based kernel. Experiments on multiple real-world datasets demonstrate the effectiveness of both CRL and GP with deep kernel approaches over strong baseline methods.

Submitted 31 May, 2024; originally announced June 2024.

arXiv:2401.10522 [pdf]

FARe: Fault-Aware GNN Training on ReRAM-based PIM Accelerators

Authors: Pratyush Dhingra, Chukwufumnanya Ogbogu, Biresh Kumar Joardar, Janardhan Rao Doppa, Ananth Kalyanaraman, Partha Pratim Pande

Abstract: Resistive random-access memory (ReRAM)-based processing-in-memory (PIM) architecture is an attractive solution for training Graph Neural Networks (GNNs) on edge platforms. However, the immature fabrication process and limited write endurance of ReRAMs make them prone to hardware faults, thereby limiting their widespread adoption for GNN training. Further, the existing fault-tolerant solutions prove inadequate for effectively training GNNs in the presence of faults. In this paper, we propose a fault-aware framework referred to as FARe that mitigates the effect of faults during GNN training. FARe outperforms existing approaches in terms of both accuracy and timing overhead. Experimental results demonstrate that FARe framework can restore GNN test accuracy by 47.6% on faulty ReRAM hardware with a ~1% timing overhead compared to the fault-free counterpart.

Submitted 19 January, 2024; originally announced January 2024.

Comments: This paper has been accepted to the conference DATE (Design, Automation and Test in Europe) - 2024

ACM Class: B.8.1

arXiv:2311.10201 [pdf, other]

doi 10.1145/3650200.3656621

Fused Breadth-First Probabilistic Traversals on Distributed GPU Systems

Authors: Reece Neff, Mostafa Eghbali Zarch, Marco Minutoli, Mahantesh Halappanavar, Antonino Tumeo, Ananth Kalyanaraman, Michela Becchi

Abstract: Probabilistic breadth-first traversals (BPTs) are used in many network science and graph machine learning applications. In this paper, we are motivated by the application of BPTs in stochastic diffusion-based graph problems such as influence maximization. These applications heavily rely on BPTs to implement a Monte-Carlo sampling step for their approximations. Given the large sampling complexity, stochasticity of the diffusion process, and the inherent irregularity in real-world graph topologies, efficiently parallelizing these BPTs remains significantly challenging. In this paper, we present a new algorithm to fuse massive number of concurrently executing BPTs with random starts on the input graph. Our algorithm is designed to fuse BPTs by combining separate traversals into a unified frontier on distributed multi-GPU systems. To show the general applicability of the fused BPT technique, we have incorporated it into two state-of-the-art influence maximization parallel implementations (gIM and Ripples). Our experiments on up to 4K nodes of the OLCF Frontier supercomputer ($32,768$ GPUs and $196$K CPU cores) show strong scaling behavior, and that fused BPTs can improve the performance of these implementations up to 34$\times$ (for gIM) and ~360$\times$ (for Ripples).

Submitted 16 November, 2023; originally announced November 2023.

Comments: 12 pages, 11 figures
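
As a rough illustration of the building block named in the abstract above, the following is a minimal sketch of a single, unfused probabilistic breadth-first traversal on one CPU thread. It is not the authors' fused multi-GPU algorithm. The adjacency format, the per-edge probabilities, and the function name probabilistic_bfs are assumptions made for this example only.

```python
import random
from collections import deque


def probabilistic_bfs(adj, source, rng):
    """Run one probabilistic BFS from `source`.

    adj maps a vertex to a list of (neighbor, p) pairs, where p is the
    (assumed) probability that the edge is traversed. Returns the set of
    vertices reached in this random trial.
    """
    visited = {source}
    frontier = deque([source])
    while frontier:
        u = frontier.popleft()
        for v, p in adj.get(u, []):
            # Each outgoing edge of u is tested at most once, because u is
            # dequeued exactly once.
            if v not in visited and rng.random() < p:
                visited.add(v)
                frontier.append(v)
    return visited


if __name__ == "__main__":
    graph = {0: [(1, 0.5), (2, 0.5)], 1: [(3, 0.5)], 2: [(3, 0.5)]}
    # Monte-Carlo sampling repeats the traversal many times from random starts.
    rng = random.Random(7)
    sizes = [len(probabilistic_bfs(graph, 0, rng)) for _ in range(1000)]
    print(sum(sizes) / len(sizes))
```

A Monte-Carlo sampling step of the kind the abstract mentions would run many such traversals independently; the fused approach described there instead combines them into a unified frontier.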

arXiv:2311.03388 [pdf, other]

Attention-based Models for Snow-Water Equivalent Prediction

Authors: Krishu K. Thapa, Bhupinderjeet Singh, Supriya Savalkar, Alan Fern, Kirti Rajagopalan, Ananth Kalyanaraman

Abstract: Snow Water-Equivalent (SWE) -- the amount of water available if snowpack is melted -- is a key decision variable used by water management agencies to make irrigation, flood control, power generation and drought management decisions. SWE values vary spatiotemporally -- affected by weather, topography and other environmental factors. While daily SWE can be measured by Snow Telemetry (SNOTEL) stations with requisite instrumentation, such stations are spatially sparse requiring interpolation techniques to create spatiotemporally complete data. While recent efforts have explored machine learning (ML) for SWE prediction, a number of recent ML advances have yet to be considered. The main contribution of this paper is to explore one such ML advance, attention mechanisms, for SWE prediction. Our hypothesis is that attention has a unique ability to capture and exploit correlations that may exist across locations or the temporal spectrum (or both). We present a generic attention-based modeling framework for SWE prediction and adapt it to capture spatial attention and temporal attention. Our experimental results on 323 SNOTEL stations in the Western U.S. demonstrate that our attention-based models outperform other machine learning approaches. We also provide key results highlighting the differences between spatial and temporal attention in this context and a roadmap toward deployment for generating spatially-complete SWE maps.

Submitted 3 November, 2023; originally announced November 2023.

Comments: 7 pages, To be published in Proceedings of The Thirty-Sixth Annual Conference on Innovative Applications of Artificial Intelligence (IAAI-24)

ACM Class: I.2

arXiv:2208.00613 [pdf, other]

HBMax: Optimizing Memory Efficiency for Parallel Influence Maximization on Multicore Architectures

Authors: Xinyu Chen, Marco Minutoli, Jiannan Tian, Mahantesh Halappanavar, Ananth Kalyanaraman, Dingwen Tao

Abstract: Influence maximization aims to select k most-influential vertices or seeds in a network, where influence is defined by a given diffusion process. Although computing optimal seed set is NP-Hard, efficient approximation algorithms exist. However, even state-of-the-art parallel implementations are limited by a sampling step that incurs large memory footprints. This in turn limits the problem size reach and approximation quality. In this work, we study the memory footprint of the sampling process collecting reverse reachability information in the IMM (Influence Maximization via Martingales) algorithm over large real-world social networks. We present a memory-efficient optimization approach (called HBMax) based on Ripples, a state-of-the-art multi-threaded parallel influence maximization solution. Our approach, HBMax, uses a portion of the reverse reachable (RR) sets collected by the algorithm to learn the characteristics of the graph. Then, it compresses the intermediate reverse reachability information with Huffman coding or bitmap coding, and queries on the partially decoded data, or directly on the compressed data to preserve the memory savings obtained through compression. Considering a NUMA architecture, we scale up our solution on 64 CPU cores and reduce the memory footprint by up to 82.1% with average 6.3% speedup (encoding overhead is offset by performance gain from memory reduction) without loss of accuracy. For the largest tested graph Twitter7 (with 1.4 billion edges), HBMax achieves 5.9X compression ratio and 2.2X speedup.

Submitted 4 August, 2022; v1 submitted 1 August, 2022; originally announced August 2022.

Comments: 13 pages, 6 figures, 8 tables, accepted by PACT '22

arXiv:2106.13397 [pdf, other]

Pheno-Mapper: An Interactive Toolbox for the Visual Exploration of Phenomics Data

Authors: Youjia Zhou, Methun Kamruzzaman, Patrick Schnable, Bala Krishnamoorthy, Ananth Kalyanaraman, Bei Wang

Abstract: High-throughput technologies to collect field data have made observations possible at scale in several branches of life sciences. The data collected can range from the molecular level (genotypes) to physiological (phenotypic traits) and environmental observations (e.g., weather, soil conditions). These vast swathes of data, collectively referred to as phenomics data, represent a treasure trove of key scientific knowledge on the dynamics of the underlying biological system. However, extracting information and insights from these complex datasets remains a significant challenge owing to their multidimensionality and lack of prior knowledge about their complex structure. In this paper, we present Pheno-Mapper, an interactive toolbox for the exploratory analysis and visualization of large-scale phenomics data. Our approach uses the mapper framework to perform a topological analysis of the data, and subsequently render visual representations with built-in data analysis and machine learning capabilities. We demonstrate the utility of this new tool on real-world plant (e.g., maize) phenomics datasets. In comparison to existing approaches, the main advantage of Pheno-Mapper is that it provides rich, interactive capabilities in the exploratory analysis of phenomics data, and it integrates visual analytics with data analysis and machine learning in an easily extensible way.
In particular, Pheno-Mapper allows the interactive selection of subpopulations guided by a topological summary of the data and applies data mining and machine learning to these selected subpopulations for in-depth exploration.

Submitted 6 July, 2021; v1 submitted 24 June, 2021; originally announced June 2021.

Comments: This is a preprint version. For a published version, please refer to ACM DOI: 10.1145/3459930.3469511

arXiv:1904.08553 [pdf, other]

doi 10.1145/3341161.3342877

A Fast and Efficient Incremental Approach toward Dynamic Community Detection

Authors: Neda Zarayeneh, Ananth Kalyanaraman

Abstract: Community detection is a discovery tool used by network scientists to analyze the structure of real-world networks. It seeks to identify natural divisions that may exist in the input networks that partition the vertices into coherent modules (or communities). While this problem space is rich with efficient algorithms and software, most of this literature caters to the static use-case where the underlying network does not change. However, many emerging real-world use-cases give rise to a need to incorporate dynamic graphs as inputs. In this paper, we present a fast and efficient incremental approach toward dynamic community detection. The key contribution is a generic technique called $Δ-screening$, which examines the most recent batch of changes made to an input graph and selects a subset of vertices to reevaluate for potential community (re)assignment. This technique can be incorporated into any of the community detection methods that use modularity as its objective function for clustering. For demonstration purposes, we incorporated the technique into two well-known community detection tools. Our experiments demonstrate that our new incremental approach is able to generate performance speedups without compromising on the output quality (despite its heuristic nature). For instance, on a real-world network with 63M temporal edges (over 12 time steps), our approach was able to complete in 1056 seconds, yielding a 3x speedup over a baseline implementation.
In addition to demonstrating the performance benefits, we also show how to use our approach to delineate appropriate intervals of temporal resolutions at which to analyze an input network.

Submitted 19 April, 2019; v1 submitted 17 April, 2019; originally announced April 2019.

arXiv:1712.10197 [pdf, other]

Interesting Paths in the Mapper

Authors: Ananth Kalyanaraman, Methun Kamruzzaman, Bala Krishnamoorthy

Abstract: The Mapper produces a compact summary of high dimensional data as a simplicial complex. We study the problem of quantifying the interestingness of subpopulations in a Mapper, which appear as long paths, flares, or loops. First, we create a weighted directed graph G using the 1-skeleton of the Mapper. We use the average values at the vertices of a target function to direct edges (from low to high). The difference between the average values at vertices (high-low) is set as the edge's weight. Covariation of the remaining h functions (independent variables) is captured by a h-bit binary signature assigned to the edge. An interesting path in G is a directed path whose edges all have the same signature. We define the interestingness score of such a path as a sum of its edge weights multiplied by a nonlinear function of their ranks in the path. Second, we study three optimization problems on this graph G. In the problem Max-IP, we seek an interesting path in G with the maximum interestingness score. We show that Max-IP is NP-complete. For the special case when G is a directed acyclic graph (DAG), we show that Max-IP can be solved in polynomial time - in O(mnd_i) where d_i is the maximum indegree of a vertex in G. In the more general problem IP, the goal is to find a collection of edge-disjoint interesting paths such that the overall sum of their interestingness scores is maximized.
We also study a variant of IP termed k-IP, where the goal is to identify a collection of edge-disjoint interesting paths each with k edges, and their total interestingness score is maximized. While k-IP can be solved in polynomial time for k <= 2, we show k-IP is NP-complete for k >= 3 even when G is a DAG. We develop polynomial time heuristics for IP and k-IP on DAGs.

Submitted 10 April, 2018; v1 submitted 29 December, 2017; originally announced December 2017.

Comments: NP-completeness of k-IP shown only for DAGs now; connections to coboundary operations outlined

MSC Class: 05C85; 68Q25; 62H30; 55U99 ACM Class: G.2.2; F.2.2
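
The definitions in the abstract above (same-signature directed paths, edge weights scaled by a function of rank) can be made concrete with a small dynamic program for the DAG case. This is a sketch under stated assumptions, not the paper's algorithm: the rank of an edge is assumed to be its 1-based position along the path, the default rank_fn is an arbitrary illustrative choice, and the function name and edge-tuple format are hypothetical.

```python
import math


def max_interesting_path_score(n, edges, rank_fn=lambda r: math.log(1 + r)):
    """Best score of a same-signature directed path in a DAG.

    n      : number of vertices, labeled 0..n-1
    edges  : list of (u, v, weight, signature) tuples; the graph must be acyclic
    rank_fn: assumed nonlinear function of an edge's 1-based position in the path

    The score of a path e_1..e_k is assumed to be sum(weight(e_i) * rank_fn(i)).
    Returns float("-inf") when the graph has no edges.
    """
    best = float("-inf")
    prev = {}  # (vertex, signature) -> best score of a (k-1)-edge path ending there
    for k in range(1, n):  # a path in a DAG has at most n-1 edges
        cur = {}
        for u, v, w, sig in edges:
            if k == 1:
                base = 0.0  # the path starts with this edge
            elif (u, sig) in prev:
                base = prev[(u, sig)]  # extend a path with the same signature
            else:
                continue
            score = base + w * rank_fn(k)
            if score > cur.get((v, sig), float("-inf")):
                cur[(v, sig)] = score
        if not cur:
            break
        best = max(best, max(cur.values()))
        prev = cur
    return best
```

The table is layered by path length because, under the assumed scoring, the factor applied to an edge depends on its position. Each layer scans all edges once, so the sketch runs in O(mn) time for m edges and n vertices.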

arXiv:1707.04362 [pdf, other]

Hyppo-X: A Scalable Exploratory Framework for Analyzing Complex Phenomics Data

Authors: Methun Kamruzzaman, Ananth Kalyanaraman, Bala Krishnamoorthy, Stefan Hey, Patrick Schnable

Abstract: Phenomics is an emerging branch of modern biology that uses high throughput phenotyping tools to capture multiple environmental and phenotypic traits, often at massive spatial and temporal scales. The resulting high dimensional data represent a treasure trove of information for providing an in-depth understanding of how multiple factors interact and contribute to the overall growth and behavior of different genotypes. However, computational tools that can parse through such complex data and aid in extracting plausible hypotheses are currently lacking. In this paper, we present Hyppo-X, a new algorithmic approach to visually explore complex phenomics data and in the process characterize the role of environment on phenotypic traits. We model the problem as one of unsupervised structure discovery, and use emerging principles from algebraic topology and graph theory for discovering higher-order structures of complex phenomics data. We present an open source software which has interactive visualization capabilities to facilitate data navigation and hypothesis formulation. We test and evaluate Hyppo-X on two real-world plant (maize) data sets. Our results demonstrate the ability of our approach to delineate divergent subpopulation-level behavior. Notably, our approach shows how environmental factors could influence phenotypic behavior, and how that effect varies across different genotypes and different time scales.
To the best of our knowledge, this effort provides one of the first approaches to systematically formalize the problem of hypothesis extraction for phenomics data. Considering the infancy of the phenomics field, tools that help users explore complex data and extract plausible hypotheses in a data-guided manner will be critical to future advancements in the use of such data.

Submitted 5 June, 2019; v1 submitted 13 July, 2017; originally announced July 2017.

Comments: Substantially expanded from previous version. Now illustrating interesting flares and paths on two different data sets

MSC Class: 68U05; 55U10; 05C20 ACM Class: J.3; I.3.5; F.2.2

arXiv:1410.1237 [pdf, other]

Parallel Heuristics for Scalable Community Detection

Authors: Hao Lu, Mahantesh Halappanavar, Ananth Kalyanaraman

Abstract: Community detection has become a fundamental operation in numerous graph-theoretic applications. It is used to reveal natural divisions that exist within real world networks without imposing prior size or cardinality constraints on the set of communities. Despite its potential for application, there is only limited support for community detection on large-scale parallel computers, largely owing to the irregular and inherently sequential nature of the underlying heuristics. In this paper, we present parallelization heuristics for fast community detection using the Louvain method as the serial template. The Louvain method is an iterative heuristic for modularity optimization. Originally developed by Blondel et al. in 2008, the method has become increasingly popular owing to its ability to detect high modularity community partitions in a fast and memory-efficient manner. However, the method is also inherently sequential, thereby limiting its scalability. Here, we observe certain key properties of this method that present challenges for its parallelization, and consequently propose heuristics that are designed to break the sequential barrier. For evaluation purposes, we implemented our heuristics using OpenMP multithreading, and tested them over real world graphs derived from multiple application domains (e.g., internet, citation, biological).
Compared to the serial Louvain implementation, our parallel implementation is able to produce community outputs with a higher modularity for most of the inputs tested, in comparable number or fewer iterations, while providing absolute speedups of up to 16x using 32 threads.

Submitted 6 October, 2014; v1 submitted 5 October, 2014; originally announced October 2014.

Comments: Submitted to a journal
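
The abstract above names modularity as the objective the Louvain method optimizes but does not state the formula. As a hedged illustration, the sketch below computes the standard Newman modularity of a given partition for an undirected, unweighted graph; it only evaluates a partition and is neither the Louvain heuristic nor the parallel heuristics of the paper. The function name and input format are assumptions for this example.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman modularity Q of a partition of an undirected, unweighted graph.

    edges    : list of (u, v) pairs, each undirected edge listed once
    community: dict mapping every vertex to its community label

    Q = sum over communities c of  L_c / m - (d_c / (2m))^2,
    where m is the number of edges, L_c the number of edges inside c,
    and d_c the total degree of the vertices in c.
    """
    m = len(edges)
    if m == 0:
        return 0.0
    intra = defaultdict(int)   # L_c: edges with both endpoints in c
    degree = defaultdict(int)  # d_c: sum of vertex degrees in c
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree[cu] += 1
        degree[cv] += 1
        if cu == cv:
            intra[cu] += 1
    return sum(intra[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


if __name__ == "__main__":
    # Two triangles joined by a single bridge edge.
    two_triangles = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
    split = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
    print(modularity(two_triangles, split))
```

A modularity-optimizing heuristic would repeatedly move vertices between communities whenever the move increases this quantity.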

Showing 1–10 of 10 results for author: Kalyanaraman, A