Search | arXiv e-print repository

AGS-GNN: Attribute-guided Sampling for Graph Neural Networks

Authors: Siddhartha Shankar Das, S M Ferdous, Mahantesh M Halappanavar, Edoardo Serra, Alex Pothen

Abstract: We propose AGS-GNN, a novel attribute-guided sampling algorithm for Graph Neural Networks (GNNs) that exploits node features and connectivity structure of a graph while simultaneously adapting for both homophily and heterophily in graphs. (In homophilic graphs vertices of the same class are more likely to be connected, and vertices of different classes tend to be linked in heterophilic graphs.) Wh… ▽ More We propose AGS-GNN, a novel attribute-guided sampling algorithm for Graph Neural Networks (GNNs) that exploits node features and connectivity structure of a graph while simultaneously adapting for both homophily and heterophily in graphs. (In homophilic graphs vertices of the same class are more likely to be connected, and vertices of different classes tend to be linked in heterophilic graphs.) While GNNs have been successfully applied to homophilic graphs, their application to heterophilic graphs remains challenging. The best-performing GNNs for heterophilic graphs do not fit the sampling paradigm, suffer high computational costs, and are not inductive. We employ samplers based on feature-similarity and feature-diversity to select subsets of neighbors for a node, and adaptively capture information from homophilic and heterophilic neighborhoods using dual channels. Currently, AGS-GNN is the only algorithm that we know of that explicitly controls homophily in the sampled subgraph through similar and diverse neighborhood samples. For diverse neighborhood sampling, we employ submodularity, which was not used in this context prior to our work. The sampling distribution is pre-computed and highly parallel, achieving the desired scalability. Using an extensive dataset consisting of 35 small ($\le$ 100K nodes) and large (>100K nodes) homophilic and heterophilic graphs, we demonstrate the superiority of AGS-GNN compare to the current approaches in the literature. AGS-GNN achieves comparable test accuracy to the best-performing heterophilic GNNs, even outperforming methods using the entire graph for node classification. AGS-GNN also converges faster compared to methods that sample neighborhoods randomly, and can be incorporated into existing GNN models that employ node or graph sampling. △ Less

Submitted 24 May, 2024; originally announced May 2024.

Comments: The paper has been accepted to KDD'24 in the research track

arXiv:2404.03701 [pdf, other]

Predictive Analytics of Varieties of Potatoes

Authors: Fabiana Ferracina, Bala Krishnamoorthy, Mahantesh Halappanavar, Shengwei Hu, Vidyasagar Sathuvalli

Abstract: We explore the application of machine learning algorithms to predict the suitability of Russet potato clones for advancement in breeding trials. Leveraging data from manually collected trials in the state of Oregon, we investigate the potential of a wide variety of state-of-the-art binary classification models. We conduct a comprehensive analysis of the dataset that includes preprocessing, feature… ▽ More We explore the application of machine learning algorithms to predict the suitability of Russet potato clones for advancement in breeding trials. Leveraging data from manually collected trials in the state of Oregon, we investigate the potential of a wide variety of state-of-the-art binary classification models. We conduct a comprehensive analysis of the dataset that includes preprocessing, feature engineering, and imputation to address missing values. We focus on several key metrics such as accuracy, F1-score, and Matthews correlation coefficient (MCC) for model evaluation. The top-performing models, namely the multi-layer perceptron classifier (MLPC), histogram-based gradient boosting classifier (HGBC), and a support vector machine classifier (SVC), demonstrate consistent and significant results. Variable selection further enhances model performance and identifies influential features in predicting trial outcomes. The findings emphasize the potential of machine learning in streamlining the selection process for potato varieties, offering benefits such as increased efficiency, substantial cost savings, and judicious resource utilization. Our study contributes insights into precision agriculture and showcases the relevance of advanced technologies for informed decision-making in breeding programs. △ Less

Submitted 18 April, 2024; v1 submitted 3 April, 2024; originally announced April 2024.

Comments: 19 pages, 3 figures, submitted to Artificial Intelligence in Agriculture

arXiv:2403.12334 [pdf, other]

Structural Validation Of Synthetic Power Distribution Networks Using The Multiscale Flat Norm

Authors: Kostiantyn Lyman, Rounak Meyur, Bala Krishnamoorthy, Mahantesh Halappanavar

Abstract: We study the problem of comparing a pair of geometric networks that may not be similarly defined, i.e., when they do not have one-to-one correspondences between their nodes and edges. Our motivating application is to compare power distribution networks of a region. Due to the lack of openly available power network datasets, researchers synthesize realistic networks resembling their actual counterp… ▽ More We study the problem of comparing a pair of geometric networks that may not be similarly defined, i.e., when they do not have one-to-one correspondences between their nodes and edges. Our motivating application is to compare power distribution networks of a region. Due to the lack of openly available power network datasets, researchers synthesize realistic networks resembling their actual counterparts. But the synthetic digital twins may vary significantly from one another and from actual networks due to varying underlying assumptions and approaches. Hence the user wants to evaluate the quality of networks in terms of their structural similarity to actual power networks. But the lack of correspondence between the networks renders most standard approaches, e.g., subgraph isomorphism and edit distance, unsuitable. We propose an approach based on the multiscale flat norm, a notion of distance between objects defined in the field of geometric measure theory, to compute the distance between a pair of planar geometric networks. Using a triangulation of the domain containing the input networks, the flat norm distance between two networks at a given scale can be computed by solving a linear program. In addition, this computation automatically identifies the 2D regions (patches) that capture where the two networks are different. We demonstrate through 2D examples that the flat norm distance can capture the variations of inputs more accurately than the commonly used Hausdorff distance. As a notion of stability, we also derive upper bounds on the flat norm distance between a simple 1D curve and its perturbed version as a function of the radius of perturbation for a restricted class of perturbations. We demonstrate our approach on a set of actual power networks from a county in the USA. △ Less

Submitted 18 March, 2024; originally announced March 2024.

Comments: A shorter version (with subset of results) appeared in ICCS 2023

arXiv:2403.05781 [pdf, other]

Approximate Bipartite $b$-Matching using Multiplicative Auction

Authors: Bhargav Samineni, S M Ferdous, Mahantesh Halappanavar, Bala Krishnamoorthy

Abstract: Given a bipartite graph $G(V= (A \cup B),E)$ with $n$ vertices and $m$ edges and a function $b \colon V \to \mathbb{Z}_+$, a $b$-matching is a subset of edges such that every vertex $v \in V$ is incident to at most $b(v)$ edges in the subset. When we are also given edge weights, the Max Weight $b$-Matching problem is to find a $b$-matching of maximum weight, which is a fundamental combinatorial op… ▽ More Given a bipartite graph $G(V= (A \cup B),E)$ with $n$ vertices and $m$ edges and a function $b \colon V \to \mathbb{Z}_+$, a $b$-matching is a subset of edges such that every vertex $v \in V$ is incident to at most $b(v)$ edges in the subset. When we are also given edge weights, the Max Weight $b$-Matching problem is to find a $b$-matching of maximum weight, which is a fundamental combinatorial optimization problem with many applications. Extending on the recent work of Zheng and Henzinger (IPCO, 2023) on standard bipartite matching problems, we develop a simple auction algorithm to approximately solve Max Weight $b$-Matching. Specifically, we present a multiplicative auction algorithm that gives a $(1 - \varepsilon)$-approximation in $O(m \varepsilon^{-1} \log \varepsilon^{-1} \log β)$ worst case time, where $β$ the maximum $b$-value. Although this is a $\log β$ factor greater than the current best approximation algorithm by Huang and Pettie (Algorithmica, 2022), it is considerably simpler to present, analyze, and implement. △ Less

Submitted 8 March, 2024; originally announced March 2024.

Comments: 14 pages; Accepted as a refereed paper in the 2024 INFORMS Optimization Society conference

arXiv:2401.06713 [pdf, other]

Picasso: Memory-Efficient Graph Coloring Using Palettes With Applications in Quantum Computing

Authors: S M Ferdous, Reece Neff, Bo Peng, Salman Shuvo, Marco Minutoli, Sayak Mukherjee, Karol Kowalski, Michela Becchi, Mahantesh Halappanavar

Abstract: A coloring of a graph is an assignment of colors to vertices such that no two neighboring vertices have the same color. The need for memory-efficient coloring algorithms is motivated by their application in computing clique partitions of graphs arising in quantum computations where the objective is to map a large set of Pauli strings into a compact set of unitaries. We present Picasso, a randomize… ▽ More A coloring of a graph is an assignment of colors to vertices such that no two neighboring vertices have the same color. The need for memory-efficient coloring algorithms is motivated by their application in computing clique partitions of graphs arising in quantum computations where the objective is to map a large set of Pauli strings into a compact set of unitaries. We present Picasso, a randomized memory-efficient iterative parallel graph coloring algorithm with theoretical sublinear space guarantees under practical assumptions. The parameters of our algorithm provide a trade-off between coloring quality and resource consumption. To assist the user, we also propose a machine learning model to predict the coloring algorithm's parameters considering these trade-offs. We provide a sequential and a parallel implementation of the proposed algorithm. We perform an experimental evaluation on a 64-core AMD CPU equipped with 512 GB of memory and an Nvidia A100 GPU with 40GB of memory. For a small dataset where existing coloring algorithms can be executed within the 512 GB memory budget, we show up to 68x memory savings. On massive datasets we demonstrate that GPU-accelerated Picasso can process inputs with 49.5x more Pauli strings (vertex set in our graph) and 2,478x more edges than state-of-the-art parallel approaches. △ Less

Submitted 12 February, 2024; v1 submitted 12 January, 2024; originally announced January 2024.

Comments: Accepted by IPDPS 2024

arXiv:2311.10201 [pdf, other]

doi 10.1145/3650200.3656621

Fused Breadth-First Probabilistic Traversals on Distributed GPU Systems

Authors: Reece Neff, Mostafa Eghbali Zarch, Marco Minutoli, Mahantesh Halappanavar, Antonino Tumeo, Ananth Kalyanaraman, Michela Becchi

Abstract: Probabilistic breadth-first traversals (BPTs) are used in many network science and graph machine learning applications. In this paper, we are motivated by the application of BPTs in stochastic diffusion-based graph problems such as influence maximization. These applications heavily rely on BPTs to implement a Monte-Carlo sampling step for their approximations. Given the large sampling complexity,… ▽ More Probabilistic breadth-first traversals (BPTs) are used in many network science and graph machine learning applications. In this paper, we are motivated by the application of BPTs in stochastic diffusion-based graph problems such as influence maximization. These applications heavily rely on BPTs to implement a Monte-Carlo sampling step for their approximations. Given the large sampling complexity, stochasticity of the diffusion process, and the inherent irregularity in real-world graph topologies, efficiently parallelizing these BPTs remains significantly challenging. In this paper, we present a new algorithm to fuse massive number of concurrently executing BPTs with random starts on the input graph. Our algorithm is designed to fuse BPTs by combining separate traversals into a unified frontier on distributed multi-GPU systems. To show the general applicability of the fused BPT technique, we have incorporated it into two state-of-the-art influence maximization parallel implementations (gIM and Ripples). Our experiments on up to 4K nodes of the OLCF Frontier supercomputer ($32,768$ GPUs and $196$K CPU cores) show strong scaling behavior, and that fused BPTs can improve the performance of these implementations up to 34$\times$ (for gIM) and ~360$\times$ (for Ripples). △ Less

Submitted 16 November, 2023; originally announced November 2023.

Comments: 12 pages, 11 figures

arXiv:2311.08496 [pdf, other]

A Robust, Efficient Predictive Safety Filter

Authors: Wenceslao Shaw Cortez, Jan Drgona, Draguna Vrabie, Mahantesh Halappanavar

Abstract: In this paper, we propose a novel predictive safety filter that is robust to bounded perturbations and is implemented in an even-triggered fashion to reduce online computation. The proposed safety filter extends upon existing work to reject disturbances for discrete-time, time-varying nonlinear systems with time-varying constraints. The safety filter is based on novel concepts of robust, discrete-… ▽ More In this paper, we propose a novel predictive safety filter that is robust to bounded perturbations and is implemented in an even-triggered fashion to reduce online computation. The proposed safety filter extends upon existing work to reject disturbances for discrete-time, time-varying nonlinear systems with time-varying constraints. The safety filter is based on novel concepts of robust, discrete-time barrier functions and can be used to filter any control law. Here, we use the safety filter in conjunction with Differentiable Predictive Control (DPC) as a promising offline learning-based policy optimization method. The approach is demonstrated on a two-tank system, building, and single-integrator example. △ Less

Submitted 26 April, 2024; v1 submitted 14 November, 2023; originally announced November 2023.

arXiv:2311.02073 [pdf, other]

Semi-Streaming Algorithms for Weighted $k$-Disjoint Matchings

Authors: S M Ferdous, Bhargav Samineni, Alex Pothen, Mahantesh Halappanavar, Bala Krishnamoorthy

Abstract: We design and implement two single-pass semi-streaming algorithms for the maximum weight $k$-disjoint matching ($k$-DM) problem. Given an integer $k$, the $k$-DM problem is to find $k$ pairwise edge-disjoint matchings such that the sum of the weights of the matchings is maximized. For $k \geq 2$, this problem is NP-hard. Our first algorithm is based on the primal-dual framework of a linear program… ▽ More We design and implement two single-pass semi-streaming algorithms for the maximum weight $k$-disjoint matching ($k$-DM) problem. Given an integer $k$, the $k$-DM problem is to find $k$ pairwise edge-disjoint matchings such that the sum of the weights of the matchings is maximized. For $k \geq 2$, this problem is NP-hard. Our first algorithm is based on the primal-dual framework of a linear programming relaxation of the problem and is $\frac{1}{3+\varepsilon}$-approximate. We also develop an approximation preserving reduction from $k$-DM to the maximum weight $b$-matching problem. Leveraging this reduction and an existing semi-streaming $b$-matching algorithm, we design a $(\frac{1}{2+\varepsilon})(1 - \frac{1}{k+1})$-approximate semi-streaming algorithm for $k$-DM. For any constant $\varepsilon > 0$, both of these algorithms require $O(nk \log_{1+\varepsilon}^2 n)$ bits of space. To the best of our knowledge, this is the first study of semi-streaming algorithms for the $k$-DM problem. We compare our two algorithms to state-of-the-art offline algorithms on 95 real-world and synthetic test problems, including thirteen graphs generated from data center network traces. On these instances, our streaming algorithms used significantly less memory (ranging from 6$\times$ to 512$\times$ less) and were faster in runtime than the offline algorithms. Our solutions were often within 5% of the best weights from the offline algorithms. We highlight that the existing offline algorithms run out of 1 TB memory for most of the large instances ($>1$ billion edges), whereas our streaming algorithms can solve these problems using only 100 GB memory for $k=8$. △ Less

Submitted 5 July, 2024; v1 submitted 3 November, 2023; originally announced November 2023.

Comments: 24 pages, To appear in ESA 2024

arXiv:2310.13110 [pdf, other]

Semi-Supervised Learning of Dynamical Systems with Neural Ordinary Differential Equations: A Teacher-Student Model Approach

Authors: Yu Wang, Yuxuan Yin, Karthik Somayaji Nanjangud Suryanarayana, Jan Drgona, Malachi Schram, Mahantesh Halappanavar, Frank Liu, Peng Li

Abstract: Modeling dynamical systems is crucial for a wide range of tasks, but it remains challenging due to complex nonlinear dynamics, limited observations, or lack of prior knowledge. Recently, data-driven approaches such as Neural Ordinary Differential Equations (NODE) have shown promising results by leveraging the expressive power of neural networks to model unknown dynamics. However, these approaches… ▽ More Modeling dynamical systems is crucial for a wide range of tasks, but it remains challenging due to complex nonlinear dynamics, limited observations, or lack of prior knowledge. Recently, data-driven approaches such as Neural Ordinary Differential Equations (NODE) have shown promising results by leveraging the expressive power of neural networks to model unknown dynamics. However, these approaches often suffer from limited labeled training data, leading to poor generalization and suboptimal predictions. On the other hand, semi-supervised algorithms can utilize abundant unlabeled data and have demonstrated good performance in classification and regression tasks. We propose TS-NODE, the first semi-supervised approach to modeling dynamical systems with NODE. TS-NODE explores cheaply generated synthetic pseudo rollouts to broaden exploration in the state space and to tackle the challenges brought by lack of ground-truth system data under a teacher-student model. TS-NODE employs an unified optimization framework that corrects the teacher model based on the student's feedback while mitigating the potential false system dynamics present in pseudo rollouts. TS-NODE demonstrates significant performance improvements over a baseline Neural ODE model on multiple dynamical system modeling tasks. △ Less

Submitted 19 October, 2023; originally announced October 2023.

arXiv:2308.13011 [pdf, other]

Extreme Risk Mitigation in Reinforcement Learning using Extreme Value Theory

Authors: Karthik Somayaji NS, Yu Wang, Malachi Schram, Jan Drgona, Mahantesh Halappanavar, Frank Liu, Peng Li

Abstract: Risk-sensitive reinforcement learning (RL) has garnered significant attention in recent years due to the growing interest in deploying RL agents in real-world scenarios. A critical aspect of risk awareness involves modeling highly rare risk events (rewards) that could potentially lead to catastrophic outcomes. These infrequent occurrences present a formidable challenge for data-driven methods aimi… ▽ More Risk-sensitive reinforcement learning (RL) has garnered significant attention in recent years due to the growing interest in deploying RL agents in real-world scenarios. A critical aspect of risk awareness involves modeling highly rare risk events (rewards) that could potentially lead to catastrophic outcomes. These infrequent occurrences present a formidable challenge for data-driven methods aiming to capture such risky events accurately. While risk-aware RL techniques do exist, their level of risk aversion heavily relies on the precision of the state-action value function estimation when modeling these rare occurrences. Our work proposes to enhance the resilience of RL agents when faced with very rare and risky events by focusing on refining the predictions of the extreme values predicted by the state-action value function distribution. To achieve this, we formulate the extreme values of the state-action value function distribution as parameterized distributions, drawing inspiration from the principles of extreme value theory (EVT). This approach effectively addresses the issue of infrequent occurrence by leveraging EVT-based parameterization. Importantly, we theoretically demonstrate the advantages of employing these parameterized distributions in contrast to other risk-averse algorithms. Our evaluations show that the proposed method outperforms other risk averse RL algorithms on a diverse range of benchmark tasks, each encompassing distinct risk scenarios. △ Less

Submitted 24 August, 2023; originally announced August 2023.

arXiv:2305.19871 [pdf, other]

There is more to graphs than meets the eye: Learning universal features with self-supervision

Authors: Laya Das, Sai Munikoti, Mahantesh Halappanavar

Abstract: We study the problem of learning universal features across multiple graphs through self-supervision. Graph self supervised learning has been shown to facilitate representation learning, and produce competitive models compared to supervised baselines. However, existing methods of self-supervision learn features from one graph, and thus, produce models that are specialized to a particular graph. We… ▽ More We study the problem of learning universal features across multiple graphs through self-supervision. Graph self supervised learning has been shown to facilitate representation learning, and produce competitive models compared to supervised baselines. However, existing methods of self-supervision learn features from one graph, and thus, produce models that are specialized to a particular graph. We hypothesize that leveraging multiple graphs of the same type/class can improve the quality of learnt representations in the model by extracting features that are universal to the class of graphs. We adopt a transformer backbone that acts as a universal representation learning module for multiple graphs. We leverage neighborhood aggregation coupled with graph-specific embedding generator to transform disparate node embeddings from multiple graphs to a common space for the universal backbone. We learn both universal and graph-specific parameters in an end-to-end manner. Our experiments reveal that leveraging multiple graphs of the same type -- citation networks -- improves the quality of representations and results in better performance on downstream node classification task compared to self-supervision with one graph. The results of our study improve the state-of-the-art in graph self-supervised learning, and bridge the gap between self-supervised and supervised performance. △ Less

Submitted 31 May, 2023; originally announced May 2023.

Comments: arXiv admin note: text overlap with arXiv:2302.11939, arXiv:2301.13287, arXiv:2305.12686, arXiv:2305.02299

arXiv:2304.03748 [pdf, other]

Perspectives on AI Architectures and Co-design for Earth System Predictability

Authors: Maruti K. Mudunuru, James A. Ang, Mahantesh Halappanavar, Simon D. Hammond, Maya B. Gokhale, James C. Hoe, Tushar Krishna, Sarat S. Sreepathi, Matthew R. Norman, Ivy B. Peng, Philip W. Jones

Abstract: Recently, the U.S. Department of Energy (DOE), Office of Science, Biological and Environmental Research (BER), and Advanced Scientific Computing Research (ASCR) programs organized and held the Artificial Intelligence for Earth System Predictability (AI4ESP) workshop series. From this workshop, a critical conclusion that the DOE BER and ASCR community came to is the requirement to develop a new par… ▽ More Recently, the U.S. Department of Energy (DOE), Office of Science, Biological and Environmental Research (BER), and Advanced Scientific Computing Research (ASCR) programs organized and held the Artificial Intelligence for Earth System Predictability (AI4ESP) workshop series. From this workshop, a critical conclusion that the DOE BER and ASCR community came to is the requirement to develop a new paradigm for Earth system predictability focused on enabling artificial intelligence (AI) across the field, lab, modeling, and analysis activities, called ModEx. The BER's `Model-Experimentation', ModEx, is an iterative approach that enables process models to generate hypotheses. The developed hypotheses inform field and laboratory efforts to collect measurement and observation data, which are subsequently used to parameterize, drive, and test model (e.g., process-based) predictions. A total of 17 technical sessions were held in this AI4ESP workshop series. This paper discusses the topic of the `AI Architectures and Co-design' session and associated outcomes. The AI Architectures and Co-design session included two invited talks, two plenary discussion panels, and three breakout rooms that covered specific topics, including: (1) DOE HPC Systems, (2) Cloud HPC Systems, and (3) Edge computing and Internet of Things (IoT). We also provide forward-looking ideas and perspectives on potential research in this co-design area that can be achieved by synergies with the other 16 session topics. These ideas include topics such as: (1) reimagining co-design, (2) data acquisition to distribution, (3) heterogeneous HPC solutions for integration of AI/ML and other data analytics like uncertainty quantification with earth system modeling and simulation, and (4) AI-enabled sensor integration into earth system measurements and observations. Such perspectives are a distinguishing aspect of this paper. △ Less

Submitted 7 April, 2023; originally announced April 2023.

Comments: 23 pages, 1 figure

arXiv:2302.01595 [pdf, other]

Deep Reinforcement Learning for Cyber System Defense under Dynamic Adversarial Uncertainties

Authors: Ashutosh Dutta, Samrat Chatterjee, Arnab Bhattacharya, Mahantesh Halappanavar

Abstract: Development of autonomous cyber system defense strategies and action recommendations in the real-world is challenging, and includes characterizing system state uncertainties and attack-defense dynamics. We propose a data-driven deep reinforcement learning (DRL) framework to learn proactive, context-aware, defense countermeasures that dynamically adapt to evolving adversarial behaviors while minimi… ▽ More Development of autonomous cyber system defense strategies and action recommendations in the real-world is challenging, and includes characterizing system state uncertainties and attack-defense dynamics. We propose a data-driven deep reinforcement learning (DRL) framework to learn proactive, context-aware, defense countermeasures that dynamically adapt to evolving adversarial behaviors while minimizing loss of cyber system operations. A dynamic defense optimization problem is formulated with multiple protective postures against different types of adversaries with varying levels of skill and persistence. A custom simulation environment was developed and experiments were devised to systematically evaluate the performance of four model-free DRL algorithms against realistic, multi-stage attack sequences. Our results suggest the efficacy of DRL algorithms for proactive cyber defense under multi-stage attack profiles and system uncertainties. △ Less

Submitted 3 February, 2023; originally announced February 2023.

arXiv:2210.02271 [pdf, other]

Extending Conformal Prediction to Hidden Markov Models with Exact Validity via de Finetti's Theorem for Markov Chains

Authors: Buddhika Nettasinghe, Samrat Chatterjee, Ramakrishna Tipireddy, Mahantesh Halappanavar

Abstract: Conformal prediction is a widely used method to quantify the uncertainty of a classifier under the assumption of exchangeability (e.g., IID data). We generalize conformal prediction to the Hidden Markov Model (HMM) framework where the assumption of exchangeability is not valid. The key idea of the proposed method is to partition the non-exchangeable Markovian data from the HMM into exchangeable bl… ▽ More Conformal prediction is a widely used method to quantify the uncertainty of a classifier under the assumption of exchangeability (e.g., IID data). We generalize conformal prediction to the Hidden Markov Model (HMM) framework where the assumption of exchangeability is not valid. The key idea of the proposed method is to partition the non-exchangeable Markovian data from the HMM into exchangeable blocks by exploiting the de Finetti's Theorem for Markov Chains discovered by Diaconis and Freedman (1980). The permutations of the exchangeable blocks are viewed as randomizations of the observed Markovian data from the HMM. The proposed method provably retains all desirable theoretical guarantees offered by the classical conformal prediction framework in both exchangeable and Markovian settings. In particular, while the lack of exchangeability introduced by Markovian samples constitutes a violation of a crucial assumption for classical conformal prediction, the proposed method views it as an advantage that can be exploited to improve the performance further. Detailed numerical and empirical results that complement the theoretical conclusions are provided to illustrate the practical feasibility of the proposed method. △ Less

Submitted 22 May, 2023; v1 submitted 5 October, 2022; originally announced October 2022.

Comments: Accepted to the International Conference on Machine Learning (ICML), 2023

arXiv:2208.02319 [pdf, other]

Differentiable Predictive Control with Safety Guarantees: A Control Barrier Function Approach

Authors: Wenceslao Shaw Cortez, Jan Drgona, Aaron Tuor, Mahantesh Halappanavar, Draguna Vrabie

Abstract: We develop a novel form of differentiable predictive control (DPC) with safety and robustness guarantees based on control barrier functions. DPC is an unsupervised learning-based method for obtaining approximate solutions to explicit model predictive control (MPC) problems. In DPC, the predictive control policy parametrized by a neural network is optimized offline via direct policy gradients obtai… ▽ More We develop a novel form of differentiable predictive control (DPC) with safety and robustness guarantees based on control barrier functions. DPC is an unsupervised learning-based method for obtaining approximate solutions to explicit model predictive control (MPC) problems. In DPC, the predictive control policy parametrized by a neural network is optimized offline via direct policy gradients obtained by automatic differentiation of the MPC problem. The proposed approach exploits a new form of sampled-data barrier function to enforce offline and online safety requirements in DPC settings while only interrupting the neural network-based controller near the boundary of the safe set. The effectiveness of the proposed approach is demonstrated in simulation. △ Less

Submitted 3 August, 2022; originally announced August 2022.

Comments: Accepted to IEEE Conference on Decision and Control Conference 2022

arXiv:2208.00613 [pdf, other]

HBMax: Optimizing Memory Efficiency for Parallel Influence Maximization on Multicore Architectures

Authors: Xinyu Chen, Marco Minutoli, Jiannan Tian, Mahantesh Halappanavar, Ananth Kalyanaraman, Dingwen Tao

Abstract: Influence maximization aims to select k most-influential vertices or seeds in a network, where influence is defined by a given diffusion process. Although computing optimal seed set is NP-Hard, efficient approximation algorithms exist. However, even state-of-the-art parallel implementations are limited by a sampling step that incurs large memory footprints. This in turn limits the problem size rea… ▽ More Influence maximization aims to select k most-influential vertices or seeds in a network, where influence is defined by a given diffusion process. Although computing optimal seed set is NP-Hard, efficient approximation algorithms exist. However, even state-of-the-art parallel implementations are limited by a sampling step that incurs large memory footprints. This in turn limits the problem size reach and approximation quality. In this work, we study the memory footprint of the sampling process collecting reverse reachability information in the IMM (Influence Maximization via Martingales) algorithm over large real-world social networks. We present a memory-efficient optimization approach (called HBMax) based on Ripples, a state-of-the-art multi-threaded parallel influence maximization solution. Our approach, HBMax, uses a portion of the reverse reachable (RR) sets collected by the algorithm to learn the characteristics of the graph. Then, it compresses the intermediate reverse reachability information with Huffman coding or bitmap coding, and queries on the partially decoded data, or directly on the compressed data to preserve the memory savings obtained through compression. Considering a NUMA architecture, we scale up our solution on 64 CPU cores and reduce the memory footprint by up to 82.1% with average 6.3% speedup (encoding overhead is offset by performance gain from memory reduction) without loss of accuracy. For the largest tested graph Twitter7 (with 1.4 billion edges), HBMax achieves 5.9X compression ratio and 2.2X speedup. △ Less

Submitted 4 August, 2022; v1 submitted 1 August, 2022; originally announced August 2022.

Comments: 13 pages, 6 figures, 8 tables, accepted by PACT' 22

arXiv:2206.07922 [pdf, other]

Challenges and Opportunities in Deep Reinforcement Learning with Graph Neural Networks: A Comprehensive review of Algorithms and Applications

Authors: Sai Munikoti, Deepesh Agarwal, Laya Das, Mahantesh Halappanavar, Balasubramaniam Natarajan

Abstract: Deep reinforcement learning (DRL) has empowered a variety of artificial intelligence fields, including pattern recognition, robotics, recommendation-systems, and gaming. Similarly, graph neural networks (GNN) have also demonstrated their superior performance in supervised learning for graph-structured data. In recent times, the fusion of GNN with DRL for graph-structured environments has attracted… ▽ More Deep reinforcement learning (DRL) has empowered a variety of artificial intelligence fields, including pattern recognition, robotics, recommendation-systems, and gaming. Similarly, graph neural networks (GNN) have also demonstrated their superior performance in supervised learning for graph-structured data. In recent times, the fusion of GNN with DRL for graph-structured environments has attracted a lot of attention. This paper provides a comprehensive review of these hybrid works. These works can be classified into two categories: (1) algorithmic enhancement, where DRL and GNN complement each other for better utility; (2) application-specific enhancement, where DRL and GNN support each other. This fusion effectively addresses various complex problems in engineering and life sciences. Based on the review, we further analyze the applicability and benefits of fusing these two domains, especially in terms of increasing generalizability and reducing computational complexity. Finally, the key challenges in integrating DRL and GNN, and potential future research directions are highlighted, which will be of interest to the broader machine learning community. △ Less

Submitted 7 November, 2022; v1 submitted 16 June, 2022; originally announced June 2022.

Comments: 20 pages, 3 figures, 2 tables

arXiv:2205.14834 [pdf, other]

GraMeR: Graph Meta Reinforcement Learning for Multi-Objective Influence Maximization

Authors: Sai Munikoti, Balasubramaniam Natarajan, Mahantesh Halappanavar

Abstract: Influence maximization (IM) is a combinatorial problem of identifying a subset of nodes called the seed nodes in a network (graph), which when activated, provide a maximal spread of influence in the network for a given diffusion model and a budget for seed set size. IM has numerous applications such as viral marketing, epidemic control, sensor placement and other network-related tasks. However, th… ▽ More Influence maximization (IM) is a combinatorial problem of identifying a subset of nodes called the seed nodes in a network (graph), which when activated, provide a maximal spread of influence in the network for a given diffusion model and a budget for seed set size. IM has numerous applications such as viral marketing, epidemic control, sensor placement and other network-related tasks. However, the uses are limited due to the computational complexity of current algorithms. Recently, learning heuristics for IM have been explored to ease the computational burden. However, there are serious limitations in current approaches such as: (1) IM formulations only consider influence via spread and ignore self activation; (2) scalability to large graphs; (3) generalizability across graph families; (4) low computational efficiency with a large running time to identify seed sets for every test network. In this work, we address each of these limitations through a unique approach that involves (1) formulating a generic IM problem as a Markov decision process that handles both intrinsic and influence activations; (2) employing double Q learning to estimate seed nodes; (3) ensuring scalability via sub-graph based representations; and (4) incorporating generalizability via meta-learning across graph families. Extensive experiments are carried out in various standard networks to validate performance of the proposed Graph Meta Reinforcement learning (GraMeR) framework. The results indicate that GraMeR is multiple orders faster and generic than conventional approaches. △ Less

Submitted 29 May, 2022; originally announced May 2022.

Comments: 11 pages, 6 figures

arXiv:2205.10728 [pdf, other]

Neural Lyapunov Differentiable Predictive Control

Authors: Sayak Mukherjee, Ján Drgoňa, Aaron Tuor, Mahantesh Halappanavar, Draguna Vrabie

Abstract: We present a learning-based predictive control methodology using the differentiable programming framework with probabilistic Lyapunov-based stability guarantees. The neural Lyapunov differentiable predictive control (NLDPC) learns the policy by constructing a computational graph encompassing the system dynamics, state and input constraints, and the necessary Lyapunov certification constraints, and… ▽ More We present a learning-based predictive control methodology using the differentiable programming framework with probabilistic Lyapunov-based stability guarantees. The neural Lyapunov differentiable predictive control (NLDPC) learns the policy by constructing a computational graph encompassing the system dynamics, state and input constraints, and the necessary Lyapunov certification constraints, and thereafter using the automatic differentiation to update the neural policy parameters. In conjunction, our approach jointly learns a Lyapunov function that certifies the regions of state-space with stable dynamics. We also provide a sampling-based statistical guarantee for the training of NLDPC from the distribution of initial conditions. Our offline training approach provides a computationally efficient and scalable alternative to classical explicit model predictive control solutions. We substantiate the advantages of the proposed approach with simulations to stabilize the double integrator model and on an example of controlling an aircraft model. △ Less

Submitted 21 May, 2022; originally announced May 2022.

Comments: 8 pages; 9 figures

arXiv:2203.01447 [pdf, other]

Learning Stochastic Parametric Differentiable Predictive Control Policies

Authors: Ján Drgoňa, Sayak Mukherjee, Aaron Tuor, Mahantesh Halappanavar, Draguna Vrabie

Abstract: The problem of synthesizing stochastic explicit model predictive control policies is known to be quickly intractable even for systems of modest complexity when using classical control-theoretic methods. To address this challenge, we present a scalable alternative called stochastic parametric differentiable predictive control (SP-DPC) for unsupervised learning of neural control policies governing s… ▽ More The problem of synthesizing stochastic explicit model predictive control policies is known to be quickly intractable even for systems of modest complexity when using classical control-theoretic methods. To address this challenge, we present a scalable alternative called stochastic parametric differentiable predictive control (SP-DPC) for unsupervised learning of neural control policies governing stochastic linear systems subject to nonlinear chance constraints. SP-DPC is formulated as a deterministic approximation to the stochastic parametric constrained optimal control problem. This formulation allows us to directly compute the policy gradients via automatic differentiation of the problem's value function, evaluated over sampled parameters and uncertainties. In particular, the computed expectation of the SP-DPC problem's value function is backpropagated through the closed-loop system rollouts parametrized by a known nominal system dynamics model and neural control policy which allows for direct model-based policy optimization. We provide theoretical probabilistic guarantees for policies learned via the SP-DPC method on closed-loop stability and chance constraints satisfaction. Furthermore, we demonstrate the computational efficiency and scalability of the proposed policy optimization algorithm in three numerical examples, including systems with a large number of states or subject to nonlinear constraints. △ Less

Submitted 21 May, 2022; v1 submitted 2 March, 2022; originally announced March 2022.

Comments: Full version for the paper accepted at the 10th IFAC Symposium on Robust Control Design (ROCOND) 2022

arXiv:2112.09815 [pdf, other]

Gradient-based Novelty Detection Boosted by Self-supervised Binary Classification

Authors: **gbo Sun, Li Yang, Jiaxin Zhang, Frank Liu, Mahantesh Halappanavar, Deliang Fan, Yu Cao

Abstract: Novelty detection aims to automatically identify out-of-distribution (OOD) data, without any prior knowledge of them. It is a critical step in data monitoring, behavior analysis and other applications, hel** enable continual learning in the field. Conventional methods of OOD detection perform multi-variate analysis on an ensemble of data or features, and usually resort to the supervision with OO… ▽ More Novelty detection aims to automatically identify out-of-distribution (OOD) data, without any prior knowledge of them. It is a critical step in data monitoring, behavior analysis and other applications, hel** enable continual learning in the field. Conventional methods of OOD detection perform multi-variate analysis on an ensemble of data or features, and usually resort to the supervision with OOD data to improve the accuracy. In reality, such supervision is impractical as one cannot anticipate the anomalous data. In this paper, we propose a novel, self-supervised approach that does not rely on any pre-defined OOD data: (1) The new method evaluates the Mahalanobis distance of the gradients between the in-distribution and OOD data. (2) It is assisted by a self-supervised binary classifier to guide the label selection to generate the gradients, and maximize the Mahalanobis distance. In the evaluation with multiple datasets, such as CIFAR-10, CIFAR-100, SVHN and TinyImageNet, the proposed approach consistently outperforms state-of-the-art supervised and unsupervised methods in the area under the receiver operating characteristic (AUROC) and area under the precision-recall curve (AUPR) metrics. We further demonstrate that this detector is able to accurately learn one OOD class in continual learning. △ Less

Submitted 17 December, 2021; originally announced December 2021.

arXiv:2111.04601 [pdf, other]

On the Stochastic Stability of Deep Markov Models

Authors: Ján Drgoňa, Sayak Mukherjee, Jiaxin Zhang, Frank Liu, Mahantesh Halappanavar

Abstract: Deep Markov models (DMM) are generative models that are scalable and expressive generalization of Markov models for representation, learning, and inference problems. However, the fundamental stochastic stability guarantees of such models have not been thoroughly investigated. In this paper, we provide sufficient conditions of DMM's stochastic stability as defined in the context of dynamical system… ▽ More Deep Markov models (DMM) are generative models that are scalable and expressive generalization of Markov models for representation, learning, and inference problems. However, the fundamental stochastic stability guarantees of such models have not been thoroughly investigated. In this paper, we provide sufficient conditions of DMM's stochastic stability as defined in the context of dynamical systems and propose a stability analysis method based on the contraction of probabilistic maps modeled by deep neural networks. We make connections between the spectral properties of neural network's weights and different types of used activation functions on the stability and overall dynamic behavior of DMMs with Gaussian distributions. Based on the theory, we propose a few practical methods for designing constrained DMMs with guaranteed stability. We empirically substantiate our theoretical results via intuitive numerical experiments using the proposed stability constraints. △ Less

Submitted 8 November, 2021; originally announced November 2021.

Comments: 35th Conference on Neural Information Processing Systems (NeurIPS 2021), Sydney, Australia

arXiv:2107.05793 [pdf, other]

A Parallel Approximation Algorithm for Maximizing Submodular $b$-Matching

Authors: S M Ferdous, Alex Pothen, Arif Khan, Ajay Panyala, Mahantesh Halappanavar

Abstract: We design new serial and parallel approximation algorithms for computing a maximum weight $b$-matching in an edge-weighted graph with a submodular objective function. This problem is NP-hard; the new algorithms have approximation ratio $1/3$, and are relaxations of the Greedy algorithm that rely only on local information in the graph, making them parallelizable. We have designed and implemented Lo… ▽ More We design new serial and parallel approximation algorithms for computing a maximum weight $b$-matching in an edge-weighted graph with a submodular objective function. This problem is NP-hard; the new algorithms have approximation ratio $1/3$, and are relaxations of the Greedy algorithm that rely only on local information in the graph, making them parallelizable. We have designed and implemented Local Lazy Greedy algorithms for both serial and parallel computers. We have applied the approximate submodular $b$-matching algorithm to assign tasks to processors in the computation of Fock matrices in quantum chemistry on parallel computers. The assignment seeks to reduce the run time by balancing the computational load on the processors and bounding the number of messages that each processor sends. We show that the new assignment of tasks to processors provides a four fold speedup over the currently used assignment in the NWChemEx software on $8000$ processors on the Summit supercomputer at Oak Ridge National Lab. △ Less

Submitted 12 July, 2021; originally announced July 2021.

Comments: 10 pages, accepted for SIAM ACDA 21

arXiv:2102.11498 [pdf, other]

V2W-BERT: A Framework for Effective Hierarchical Multiclass Classification of Software Vulnerabilities

Authors: Siddhartha Shankar Das, Edoardo Serra, Mahantesh Halappanavar, Alex Pothen, Ehab Al-Shaer

Abstract: Weaknesses in computer systems such as faults, bugs and errors in the architecture, design or implementation of software provide vulnerabilities that can be exploited by attackers to compromise the security of a system. Common Weakness Enumerations (CWE) are a hierarchically designed dictionary of software weaknesses that provide a means to understand software flaws, potential impact of their expl… ▽ More Weaknesses in computer systems such as faults, bugs and errors in the architecture, design or implementation of software provide vulnerabilities that can be exploited by attackers to compromise the security of a system. Common Weakness Enumerations (CWE) are a hierarchically designed dictionary of software weaknesses that provide a means to understand software flaws, potential impact of their exploitation, and means to mitigate these flaws. Common Vulnerabilities and Exposures (CVE) are brief low-level descriptions that uniquely identify vulnerabilities in a specific product or protocol. Classifying or map** of CVEs to CWEs provides a means to understand the impact and mitigate the vulnerabilities. Since manual map** of CVEs is not a viable option, automated approaches are desirable but challenging. We present a novel Transformer-based learning framework (V2W-BERT) in this paper. By using ideas from natural language processing, link prediction and transfer learning, our method outperforms previous approaches not only for CWE instances with abundant data to train, but also rare CWE classes with little or no data to train. Our approach also shows significant improvements in using historical data to predict links for future instances of CVEs, and therefore, provides a viable approach for practical applications. Using data from MITRE and National Vulnerability Database, we achieve up to 97% prediction accuracy for randomly partitioned data and up to 94% prediction accuracy in temporally partitioned data. We believe that our work will influence the design of better methods and training models, as well as applications to solve increasingly harder problems in cybersecurity. △ Less

Submitted 23 February, 2021; originally announced February 2021.

Comments: Under submission to KDD 2021 Applied Data Science Track

arXiv:1804.08016 [pdf, other]

A 2/3-Approximation Algorithm for Vertex-weighted Matching in Bipartite Graphs

Authors: Florin Dobrian, Mahantesh Halappanavar, Alex Pothen, Ahmed Al-Herz

Abstract: We consider the maximum vertex-weighted matching problem (MVM), in which non-negative weights are assigned to the vertices of a graph, the weight of a matching is the sum of the weights of the matched vertices, and we are required to compute a matching of maximum weight. We describe an exact algorithm for MVM with $O(|V|\, |E|)$ time complexity, and then we design a $2/3$-approximation algorithm f… ▽ More We consider the maximum vertex-weighted matching problem (MVM), in which non-negative weights are assigned to the vertices of a graph, the weight of a matching is the sum of the weights of the matched vertices, and we are required to compute a matching of maximum weight. We describe an exact algorithm for MVM with $O(|V|\, |E|)$ time complexity, and then we design a $2/3$-approximation algorithm for MVM on bipartite graphs by restricting the length of augmenting paths to at most three. The latter algorithm has time complexity $O(|E| + |V| \log |V|)$. The approximation algorithm solves two MVM problems on bipartite graphs, each with weights only on one vertex part, and then finds a matching from these two matchings using the Mendelsohn-Dulmage Theorem. The approximation ratio of the algorithm is obtained by considering failed vertices, i.e., vertices that the approximation algorithm fails to match but the exact algorithm does. We show that at every step of the algorithm there are two distinct heavier vertices that we can charge each failed vertex to. We have implemented the $2/3$-approximation algorithm for MVM and compare it with four other algorithms: an exact MEM algorithm, the exact MVM algorithm, a $1/2$-approximation algorithm for MVM, and a scaling-based $(1-ε)$-approximation algorithm for MEM. We also show that MVM problems should not be first transformed to MEM problems and solved using exact algorithms for the latter, since this transformation can increase runtimes by several orders of magnitude. △ Less

Submitted 11 October, 2018; v1 submitted 21 April, 2018; originally announced April 2018.

arXiv:1711.11098 [pdf, ps, other]

doi 10.1093/comnet/cny016

A generative graph model for electrical infrastructure networks

Authors: Sinan G. Aksoy, Emilie Purvine, Eduardo Cotilla-Sanchez, Mahantesh Halappanavar

Abstract: We propose a generative graph model for electrical infrastructure networks that accounts for heterogeneity in both node and edge type. To inform the model design, we analyze the properties of power grid graphs derived from the U.S. Eastern Interconnection, Texas Interconnection, and Poland transmission system power grids. Across these datasets, we find subgraphs induced by nodes of the same voltag… ▽ More We propose a generative graph model for electrical infrastructure networks that accounts for heterogeneity in both node and edge type. To inform the model design, we analyze the properties of power grid graphs derived from the U.S. Eastern Interconnection, Texas Interconnection, and Poland transmission system power grids. Across these datasets, we find subgraphs induced by nodes of the same voltage level exhibit shared structural properties atypical to small-world networks, including low local clustering, large diameter, and large average distance. On the other hand, we find subgraphs induced by transformer edges linking nodes of different voltage types contain a more limited structure, consisting mainly of small, disjoint star graphs. The goal of our proposed model is to match both these inter and intra-network properties by proceeding in two phases: the first phase adapts the Chung-Lu random graph model, taking desired vertex degrees and desired diameter as inputs, while the second phase of the model is based on a simpler random star graph generation process. We test the model's performance by comparing its output across many runs to the aforementioned real data. In nearly all categories tested, we find our model is more accurate in reproducing the unusual mixture of properties apparent in the data than the Chung-Lu model. We also include graph visualization comparisons, a brief analysis of edge-deletion resiliency, and guidelines for artificially generating the model inputs in the absence of real data. △ Less

Submitted 19 July, 2018; v1 submitted 29 November, 2017; originally announced November 2017.

arXiv:1707.05287 [pdf, other]

Exploring the Role of Intrinsic Nodal Activation on the Spread of Influence in Complex Networks

Authors: Arun Sathanur, Mahantesh Halappanavar, Yi Shi, Walin Sagduyu

Abstract: In many complex networked systems, such as online social networks, activity originates at certain nodes and subsequently spreads on the network through influence. In this work, we consider the problem of modeling the spread of influence and the identification of influential entities in a complex network when nodal activation can happen via two different mechanisms. The first mechanism of activatio… ▽ More In many complex networked systems, such as online social networks, activity originates at certain nodes and subsequently spreads on the network through influence. In this work, we consider the problem of modeling the spread of influence and the identification of influential entities in a complex network when nodal activation can happen via two different mechanisms. The first mechanism of activation stems from factors that are intrinsic to the node. The second mechanism comes from the influence of connected neighbors. After introducing the model, we provide an algorithm to mine for the influential nodes in such a scenario by modifying the well-known influence maximization algorithm to work with our model that incorporates both forms of activation. Our model can be considered as a variation of the independent cascade diffusion model. We provide small motivating examples to facilitate an intuitive understanding of the effect of including the intrinsic activation mechanism. We sketch a proof of the submodularity of the influence function under the new formulation and demonstrate the same on larger graphs. Based on the model, we explain how influential content creators can drive engagement on social media platforms. Using additional experiments on a Twitter dataset, we then show how the formulation can be applied to real-world social media datasets. Finally, we derive a centrality metric that takes into account, both the mechanisms of activation and provides for an accurate, computationally efficient, alternate approach to the problem of identifying influencers under intrinsic activation. △ Less

Submitted 17 July, 2017; originally announced July 2017.

arXiv:1706.03147 [pdf, ps, other]

AMPS: An Augmented Matrix Formulation for Principal Submatrix Updates with Application to Power Grids

Authors: Yu-Hong Yeung, Alex Pothen, Mahantesh Halappanavar, Zhenyu Huang

Abstract: We present AMPS, an augmented matrix approach to update the solution to a linear system of equations when the matrix is modified by a few elements within a principal submatrix. This problem arises in the dynamic security analysis of a power grid, where operators need to perform N - k contingency analysis, i.e., determine the state of the system when exactly k links from N fail. Our algorithms augm… ▽ More We present AMPS, an augmented matrix approach to update the solution to a linear system of equations when the matrix is modified by a few elements within a principal submatrix. This problem arises in the dynamic security analysis of a power grid, where operators need to perform N - k contingency analysis, i.e., determine the state of the system when exactly k links from N fail. Our algorithms augment the matrix to account for the changes in it, and then compute the solution to the augmented system without refactoring the modified matrix. We provide two algorithms, a direct method, and a hybrid direct-iterative method for solving the augmented system. We also exploit the sparsity of the matrices and vectors to accelerate the overall computation. We analyze the time complexity of both algorithms, and show that it is bounded by the number of nonzeros in a subset of the columns of the Cholesky factor that are selected by the nonzeros in the sparse right-hand-side vector. Our algorithms are compared on three power grids with PARDISO, a parallel direct solver, and CHOLMOD, a direct solver with the ability to modify the Cholesky factors of the matrix. We show that our augmented algorithms outperform PARDISO (by two orders of magnitude), and CHOLMOD (by a factor of up to 5). Further, our algorithms scale better than CHOLMOD as the number of elements updated increases. The solutions are computed with high accuracy. Our algorithms are capable of computing N - k contingency analysis on a 778 thousand bus grid, updating a solution with k = 20 elements in 16 milliseconds on an Intel Xeon processor. △ Less

Submitted 9 June, 2017; originally announced June 2017.

Comments: 19 pages, 4 figures, 2 tables, SIAM Journal on Scientific Computing

MSC Class: 65F50; 65F10; 65F05; 65Y20 ACM Class: G.1.3, G.1.10

arXiv:1512.01436 [pdf, other]

A Network-of-Networks Model for Electrical Infrastructure Networks

Authors: Mahantesh Halappanavar, Eduardo Cotilla-Sanchez, Emilie Hogan, Daniel Duncan, Zhenyu, Huang, Paul D. H. Hines

Abstract: Modeling power transmission networks is an important area of research with applications such as vulnerability analysis, study of cascading failures, and location of measurement devices. Graph-theoretic approaches have been widely used to solve these problems, but are subject to several limitations. One of the limitations is the ability to model a heterogeneous system in a consistent manner using t… ▽ More Modeling power transmission networks is an important area of research with applications such as vulnerability analysis, study of cascading failures, and location of measurement devices. Graph-theoretic approaches have been widely used to solve these problems, but are subject to several limitations. One of the limitations is the ability to model a heterogeneous system in a consistent manner using the standard graph-theoretic formulation. In this paper, we propose a {\em network-of-networks} approach for modeling power transmission networks in order to explicitly incorporate heterogeneity in the model. This model distinguishes between different components of the network that operate at different voltage ratings, and also captures the intra and inter-network connectivity patterns. By building the graph in this fashion we present a novel, and fundamentally different, perspective of power transmission networks. Consequently, this novel approach will have a significant impact on the graph-theoretic modeling of power grids that we believe will lead to a better understanding of transmission networks. △ Less

Submitted 26 November, 2015; originally announced December 2015.

arXiv:1501.00943 [pdf, other]

Comparative Study of Clustering Techniques for Real-Time Dynamic Model Reduction

Authors: Emilie Purvine, Eduardo Cotilla-Sanchez, Mahantesh Halappanavar, Zhenyu Huang, Guang Lin, Shuai Lu, Shaobu Wang

Abstract: Dynamic model reduction in power systems is necessary for improving computational efficiency. Traditional model reduction using linearized models or online analysis is not adequate to capture dynamic behaviors of the power system, especially with the new mix of intermittent generation and intelligent consumption making the power system more dynamic and non-linear. Real-time dynamic model reduction… ▽ More Dynamic model reduction in power systems is necessary for improving computational efficiency. Traditional model reduction using linearized models or online analysis is not adequate to capture dynamic behaviors of the power system, especially with the new mix of intermittent generation and intelligent consumption making the power system more dynamic and non-linear. Real-time dynamic model reduction has emerged to fill this important need. This paper explores using clustering techniques to analyze real-time phasor measurements to identify groups of generators with similar behavior, as well as a representative generator from each group for dynamic model reduction. Two clustering techniques -- graph clustering and k-means -- are considered. These techniques are compared with a previously developed dynamic model reduction approach using Singular Value Decomposition. Two sample power grid data sets are used to test these different model reduction techniques. Based on the algorithms' relative performance, recommendations are provided for practical use. △ Less

Submitted 18 July, 2017; v1 submitted 5 January, 2015; originally announced January 2015.

Comments: Statistical Analysis and Data Mining, in press, 2017

arXiv:1410.1237 [pdf, other]

Parallel Heuristics for Scalable Community Detection

Authors: Hao Lu, Mahantesh Halappanavar, Ananth Kalyanaraman

Abstract: Community detection has become a fundamental operation in numerous graph-theoretic applications. It is used to reveal natural divisions that exist within real world networks without imposing prior size or cardinality constraints on the set of communities. Despite its potential for application, there is only limited support for community detection on large-scale parallel computers, largely owing to… ▽ More Community detection has become a fundamental operation in numerous graph-theoretic applications. It is used to reveal natural divisions that exist within real world networks without imposing prior size or cardinality constraints on the set of communities. Despite its potential for application, there is only limited support for community detection on large-scale parallel computers, largely owing to the irregular and inherently sequential nature of the underlying heuristics. In this paper, we present parallelization heuristics for fast community detection using the Louvain method as the serial template. The Louvain method is an iterative heuristic for modularity optimization. Originally developed by Blondel et al. in 2008, the method has become increasingly popular owing to its ability to detect high modularity community partitions in a fast and memory-efficient manner. However, the method is also inherently sequential, thereby limiting its scalability. Here, we observe certain key properties of this method that present challenges for its parallelization, and consequently propose heuristics that are designed to break the sequential barrier. For evaluation purposes, we implemented our heuristics using OpenMP multithreading, and tested them over real world graphs derived from multiple application domains (e.g., internet, citation, biological). Compared to the serial Louvain implementation, our parallel implementation is able to produce community outputs with a higher modularity for most of the inputs tested, in comparable number or fewer iterations, while providing absolute speedups of up to 16x using 32 threads. △ Less

Submitted 6 October, 2014; v1 submitted 5 October, 2014; originally announced October 2014.

Comments: Submitted to a journal

arXiv:1304.6761 [pdf, other]

doi 10.1109/ISI.2013.6578796

Towards a Networks-of-Networks Framework for Cyber Security

Authors: Mahantesh Halappanavar, Sutanay Choudhury, Emilie Hogan, Peter Hui, John R. Johnson, Indrajit Ray, Lawrence Holder

Abstract: Networks-of-networks (NoN) is a graph-theoretic model of interdependent networks that have distinct dynamics at each network (layer). By adding special edges to represent relationships between nodes in different layers, NoN provides a unified mechanism to study interdependent systems intertwined in a complex relationship. While NoN based models have been proposed for cyber-physical systems, in thi… ▽ More Networks-of-networks (NoN) is a graph-theoretic model of interdependent networks that have distinct dynamics at each network (layer). By adding special edges to represent relationships between nodes in different layers, NoN provides a unified mechanism to study interdependent systems intertwined in a complex relationship. While NoN based models have been proposed for cyber-physical systems, in this position paper we build towards a three-layered NoN model for an enterprise cyber system. Each layer captures a different facet of a cyber system. We present in-depth discussion for four major graph- theoretic applications to demonstrate how the three-layered NoN model can be leveraged for continuous system monitoring and mission assurance. △ Less

Submitted 24 April, 2013; originally announced April 2013.

Comments: A shorter (3-page) version of this paper will appear in the Proceedings of the IEEE Intelligence and Security Informatics 2013, Seattle Washington, USA, June 4-7, 2013

arXiv:1212.4174 [pdf, other]

Feature Clustering for Accelerating Parallel Coordinate Descent

Authors: Chad Scherrer, Ambuj Tewari, Mahantesh Halappanavar, David Haglin

Abstract: Large-scale L1-regularized loss minimization problems arise in high-dimensional applications such as compressed sensing and high-dimensional supervised learning, including classification and regression problems. High-performance algorithms and implementations are critical to efficiently solving these problems. Building upon previous work on coordinate descent algorithms for L1-regularized problems… ▽ More Large-scale L1-regularized loss minimization problems arise in high-dimensional applications such as compressed sensing and high-dimensional supervised learning, including classification and regression problems. High-performance algorithms and implementations are critical to efficiently solving these problems. Building upon previous work on coordinate descent algorithms for L1-regularized problems, we introduce a novel family of algorithms called block-greedy coordinate descent that includes, as special cases, several existing algorithms such as SCD, Greedy CD, Shotgun, and Thread-Greedy. We give a unified convergence analysis for the family of block-greedy algorithms. The analysis suggests that block-greedy coordinate descent can better exploit parallelism if features are clustered so that the maximum inner product between features in different blocks is small. Our theoretical convergence analysis is supported with experimental re- sults using data from diverse real-world applications. We hope that algorithmic approaches and convergence analysis we provide will not only advance the field, but will also encourage researchers to systematically explore the design space of algorithms for solving large-scale L1-regularization problems. △ Less

Submitted 17 December, 2012; originally announced December 2012.

Comments: Accepted for publication in the proceedings of NIPS (Neural Information Processing Systems Foundations) 2012, Lake Tahoe, Nevada

arXiv:1206.6409 [pdf]

Scaling Up Coordinate Descent Algorithms for Large $\ell_1$ Regularization Problems

Authors: Chad Scherrer, Mahantesh Halappanavar, Ambuj Tewari, David Haglin

Abstract: We present a generic framework for parallel coordinate descent (CD) algorithms that includes, as special cases, the original sequential algorithms Cyclic CD and Stochastic CD, as well as the recent parallel Shotgun algorithm. We introduce two novel parallel algorithms that are also special cases---Thread-Greedy CD and Coloring-Based CD---and give performance measurements for an OpenMP implementati… ▽ More We present a generic framework for parallel coordinate descent (CD) algorithms that includes, as special cases, the original sequential algorithms Cyclic CD and Stochastic CD, as well as the recent parallel Shotgun algorithm. We introduce two novel parallel algorithms that are also special cases---Thread-Greedy CD and Coloring-Based CD---and give performance measurements for an OpenMP implementation of these. △ Less

Submitted 27 June, 2012; originally announced June 2012.

Comments: Appears in Proceedings of the 29th International Conference on Machine Learning (ICML 2012)

arXiv:1205.3809 [pdf, other]

Graph Coloring Algorithms for Muti-core and Massively Multithreaded Architectures

Authors: Umit Catalyurek, John Feo, Assefaw Gebremedhin, Mahantesh Halappanavar, Alex Pothen

Abstract: We explore the interplay between architectures and algorithm design in the context of shared-memory platforms and a specific graph problem of central importance in scientific and high-performance computing, distance-1 graph coloring. We introduce two different kinds of multithreaded heuristic algorithms for the stated, NP-hard, problem. The first algorithm relies on speculation and iteration, and… ▽ More We explore the interplay between architectures and algorithm design in the context of shared-memory platforms and a specific graph problem of central importance in scientific and high-performance computing, distance-1 graph coloring. We introduce two different kinds of multithreaded heuristic algorithms for the stated, NP-hard, problem. The first algorithm relies on speculation and iteration, and is suitable for any shared-memory system. The second algorithm uses dataflow principles, and is targeted at the non-conventional, massively multithreaded Cray XMT system. We study the performance of the algorithms on the Cray XMT and two multi-core systems, Sun Niagara 2 and Intel Nehalem. Together, the three systems represent a spectrum of multithreading capabilities and memory structure. As testbed, we use synthetically generated large-scale graphs carefully chosen to cover a wide range of input types. The results show that the algorithms have scalable runtime performance and use nearly the same number of colors as the underlying serial algorithm, which in turn is effective in practice. The study provides insight into the design of high performance algorithms for irregular problems on many-core architectures. △ Less

Submitted 16 May, 2012; originally announced May 2012.

Comments: 25 pages, 11 figures, 4 tables

Showing 1–35 of 35 results for author: Halappanavar, M