Search | arXiv e-print repository

arXiv:2406.19596 [pdf, other]

Optimizing Cyber Defense in Dynamic Active Directories through Reinforcement Learning

Authors: Diksha Goel, Kristen Moore, Mingyu Guo, Derui Wang, Minjune Kim, Seyit Camtepe

Abstract: This paper addresses a significant gap in Autonomous Cyber Operations (ACO) literature: the absence of effective edge-blocking ACO strategies in dynamic, real-world networks. It specifically targets the cybersecurity vulnerabilities of organizational Active Directory (AD) systems. Unlike the existing literature on edge-blocking defenses which considers AD systems as static entities, our study coun… ▽ More This paper addresses a significant gap in Autonomous Cyber Operations (ACO) literature: the absence of effective edge-blocking ACO strategies in dynamic, real-world networks. It specifically targets the cybersecurity vulnerabilities of organizational Active Directory (AD) systems. Unlike the existing literature on edge-blocking defenses which considers AD systems as static entities, our study counters this by recognizing their dynamic nature and develo** advanced edge-blocking defenses through a Stackelberg game model between attacker and defender. We devise a Reinforcement Learning (RL)-based attack strategy and an RL-assisted Evolutionary Diversity Optimization-based defense strategy, where the attacker and defender improve each other strategy via parallel gameplay. To address the computational challenges of training attacker-defender strategies on numerous dynamic AD graphs, we propose an RL Training Facilitator that prunes environments and neural networks to eliminate irrelevant elements, enabling efficient and scalable training for large graphs. We extensively train the attacker strategy, as a sophisticated attacker model is essential for a robust defense. Our empirical results successfully demonstrate that our proposed approach enhances defender's proficiency in hardening dynamic AD graphs while ensuring scalability for large-scale AD. △ Less

Submitted 27 June, 2024; originally announced June 2024.

Comments: The manuscript has been accepted as full paper at European Symposium on Research in Computer Security (ESORICS) 2024

arXiv:2404.03662 [pdf, other]

X-lifecycle Learning for Cloud Incident Management using LLMs

Authors: Drishti Goel, Fiza Husain, Aditya Singh, Supriyo Ghosh, Anjaly Parayil, Chetan Bansal, Xuchao Zhang, Saravan Rajmohan

Abstract: Incident management for large cloud services is a complex and tedious process and requires significant amount of manual efforts from on-call engineers (OCEs). OCEs typically leverage data from different stages of the software development lifecycle [SDLC] (e.g., codes, configuration, monitor data, service properties, service dependencies, trouble-shooting documents, etc.) to generate insights for d… ▽ More Incident management for large cloud services is a complex and tedious process and requires significant amount of manual efforts from on-call engineers (OCEs). OCEs typically leverage data from different stages of the software development lifecycle [SDLC] (e.g., codes, configuration, monitor data, service properties, service dependencies, trouble-shooting documents, etc.) to generate insights for detection, root causing and mitigating of incidents. Recent advancements in large language models [LLMs] (e.g., ChatGPT, GPT-4, Gemini) created opportunities to automatically generate contextual recommendations to the OCEs assisting them to quickly identify and mitigate critical issues. However, existing research typically takes a silo-ed view for solving a certain task in incident management by leveraging data from a single stage of SDLC. In this paper, we demonstrate that augmenting additional contextual data from different stages of SDLC improves the performance of two critically important and practically challenging tasks: (1) automatically generating root cause recommendations for dependency failure related incidents, and (2) identifying ontology of service monitors used for automatically detecting incidents. By leveraging 353 incident and 260 monitor dataset from Microsoft, we demonstrate that augmenting contextual information from different stages of the SDLC improves the performance over State-of-The-Art methods. △ Less

Submitted 15 February, 2024; originally announced April 2024.

arXiv:2403.15169 [pdf, other]

Towards Deep Learning Enabled Cybersecurity Risk Assessment for Microservice Architectures

Authors: Majid Abdulsatar, Hussain Ahmad, Diksha Goel, Faheem Ullah

Abstract: The widespread adoption of microservice architectures has given rise to a new set of software security challenges. These challenges stem from the unique features inherent in microservices. It is important to systematically assess and address software security challenges such as software security risk assessment. However, existing approaches prove inefficient in accurately evaluating the security r… ▽ More The widespread adoption of microservice architectures has given rise to a new set of software security challenges. These challenges stem from the unique features inherent in microservices. It is important to systematically assess and address software security challenges such as software security risk assessment. However, existing approaches prove inefficient in accurately evaluating the security risks associated with microservice architectures. To address this issue, we propose CyberWise Predictor, a framework designed for predicting and assessing security risks associated with microservice architectures. Our framework employs deep learning-based natural language processing models to analyze vulnerability descriptions for predicting vulnerability metrics to assess security risks. Our experimental evaluation shows the effectiveness of CyberWise Predictor, achieving an average accuracy of 92% in automatically predicting vulnerability metrics for new vulnerabilities. Our framework and findings serve as a guide for software developers to identify and mitigate security risks in microservice architectures. △ Less

Submitted 22 March, 2024; originally announced March 2024.

arXiv:2310.10667 [pdf, other]

Enhancing Network Resilience through Machine Learning-powered Graph Combinatorial Optimization: Applications in Cyber Defense and Information Diffusion

Authors: Diksha Goel

Abstract: With the burgeoning advancements of computing and network communication technologies, network infrastructures and their application environments have become increasingly complex. Due to the increased complexity, networks are more prone to hardware faults and highly susceptible to cyber-attacks. Therefore, for rapidly growing network-centric applications, network resilience is essential to minimize… ▽ More With the burgeoning advancements of computing and network communication technologies, network infrastructures and their application environments have become increasingly complex. Due to the increased complexity, networks are more prone to hardware faults and highly susceptible to cyber-attacks. Therefore, for rapidly growing network-centric applications, network resilience is essential to minimize the impact of attacks and to ensure that the network provides an acceptable level of services during attacks, faults or disruptions. In this regard, this thesis focuses on develo** effective approaches for enhancing network resilience. Existing approaches for enhancing network resilience emphasize on determining bottleneck nodes and edges in the network and designing proactive responses to safeguard the network against attacks. However, existing solutions generally consider broader application domains and possess limited applicability when applied to specific application areas such as cyber defense and information diffusion, which are highly popular application domains among cyber attackers. This thesis aims to design effective, efficient and scalable techniques for discovering bottleneck nodes and edges in the network to enhance network resilience in cyber defense and information diffusion application domains. We first investigate a cyber defense graph optimization problem, i.e., hardening active directory systems by discovering bottleneck edges in the network. We then study the problem of identifying bottleneck structural hole spanner nodes, which are crucial for information diffusion in the network. We transform both problems into graph-combinatorial optimization problems and design machine learning based approaches for discovering bottleneck points vital for enhancing network resilience. △ Less

Submitted 21 September, 2023; originally announced October 2023.

Comments: PhD Thesis

arXiv:2305.11657 [pdf, other]

Cost Sharing Public Project with Minimum Release Delay

Authors: Mingyu Guo, Diksha Goel, Guanhua Wang, Yong Yang, Muhammad Ali Babar

Abstract: We study the excludable public project model where the decision is binary (build or not build). In a classic excludable and binary public project model, an agent either consumes the project in its whole or is completely excluded. We study a setting where the mechanism can set different project release time for different agents, in the sense that high-paying agents can consume the project earlier t… ▽ More We study the excludable public project model where the decision is binary (build or not build). In a classic excludable and binary public project model, an agent either consumes the project in its whole or is completely excluded. We study a setting where the mechanism can set different project release time for different agents, in the sense that high-paying agents can consume the project earlier than low-paying agents. The release delay, while hurting the social welfare, is implemented to incentivize payments to cover the project cost. The mechanism design objective is to minimize the maximum release delay and the total release delay among all agents. We first consider the setting where we know the prior distribution of the agents' types. Our objectives are minimizing the expected maximum release delay and the expected total release delay. We propose the single deadline mechanisms. We show that the optimal single deadline mechanism is asymptotically optimal for both objectives, regardless of the prior distribution. For small number of agents, we propose the sequential unanimous mechanisms by extending the largest unanimous mechanisms from [Ohseto 2000]. We propose an automated mechanism design approach via evolutionary computation to optimize within the sequential unanimous mechanisms. We next study prior-free mechanism design. We propose the group-based optimal deadline mechanism and show that it is competitive against an undominated mechanism under minor technical assumptions. △ Less

Submitted 19 May, 2023; originally announced May 2023.

Comments: arXiv admin note: substantial text overlap with arXiv:2204.07315

arXiv:2304.03998 [pdf, other]

Evolving Reinforcement Learning Environment to Minimize Learner's Achievable Reward: An Application on Hardening Active Directory Systems

Authors: Diksha Goel, Aneta Neumann, Frank Neumann, Hung Nguyen, Mingyu Guo

Abstract: We study a Stackelberg game between one attacker and one defender in a configurable environment. The defender picks a specific environment configuration. The attacker observes the configuration and attacks via Reinforcement Learning (RL trained against the observed environment). The defender's goal is to find the environment with minimum achievable reward for the attacker. We apply Evolutionary Di… ▽ More We study a Stackelberg game between one attacker and one defender in a configurable environment. The defender picks a specific environment configuration. The attacker observes the configuration and attacks via Reinforcement Learning (RL trained against the observed environment). The defender's goal is to find the environment with minimum achievable reward for the attacker. We apply Evolutionary Diversity Optimization (EDO) to generate diverse population of environments for training. Environments with clearly high rewards are killed off and replaced by new offsprings to avoid wasting training time. Diversity not only improves training quality but also fits well with our RL scenario: RL agents tend to improve gradually, so a slightly worse environment earlier on may become better later. We demonstrate the effectiveness of our approach by focusing on a specific application, Active Directory (AD). AD is the default security management system for Windows domain networks. AD environment describes an attack graph, where nodes represent computers/accounts/etc., and edges represent accesses. The attacker aims to find the best attack path to reach the highest-privilege node. The defender can change the graph by removing a limited number of edges (revoke accesses). Our approach generates better defensive plans than the existing approach and scales better. △ Less

Submitted 8 April, 2023; originally announced April 2023.

arXiv:2302.13292 [pdf, other]

Discovering Top-k Structural Hole Spanners in Dynamic Networks

Authors: Diksha Goel, Hong Shen, Hui Tian, Mingyu Guo

Abstract: Structural Hole (SH) theory states that the node which acts as a connecting link among otherwise disconnected communities gets positional advantages in the network. These nodes are called Structural Hole Spanners (SHS). Numerous solutions are proposed to discover SHSs; however, most of the solutions are only applicable to static networks. Since real-world networks are dynamic networks; consequentl… ▽ More Structural Hole (SH) theory states that the node which acts as a connecting link among otherwise disconnected communities gets positional advantages in the network. These nodes are called Structural Hole Spanners (SHS). Numerous solutions are proposed to discover SHSs; however, most of the solutions are only applicable to static networks. Since real-world networks are dynamic networks; consequently, in this study, we aim to discover SHSs in dynamic networks. Discovering SHSs is an NP-hard problem, due to which, instead of discovering exact k SHSs, we adopt a greedy approach to discover Top-k SHSs. We first propose an efficient Tracking-SHS algorithm for updating SHSs in dynamic networks. Our algorithm reuses the information obtained during the initial runs of the static algorithm and avoids the recomputations for the nodes unaffected by the updates. Besides, motivated from the success of Graph Neural Networks (GNNs) on various graph mining problems, we also design a Graph Neural Network-based model, GNN-SHS, to discover SHSs in dynamic networks, aiming to reduce the computational cost while achieving high accuracy. We provide a theoretical analysis of the Tracking-SHS algorithm, and our theoretical results prove that for a particular type of graphs, such as Preferential Attachment graphs [1], Tracking-SHS algorithm achieves 1.6 times of speedup compared with the static algorithm. We perform extensive experiments, and our results demonstrate that the Tracking-SHS algorithm attains a minimum of 3.24 times speedup over the static algorithm. Also, the proposed second model GNN-SHS is on an average 671.6 times faster than the Tracking-SHS algorithm. △ Less

Submitted 26 February, 2023; originally announced February 2023.

Comments: arXiv admin note: substantial text overlap with arXiv:2212.08239

arXiv:2302.12442 [pdf, other]

Effective Graph-Neural-Network based Models for Discovering Structural Hole Spanners in Large-Scale and Diverse Networks

Authors: Diksha Goel, Hong Shen, Hui Tian, Mingyu Guo

Abstract: A Structural Hole Spanner (SHS) is a set of nodes in a network that act as a bridge among different otherwise disconnected communities. Numerous solutions have been proposed to discover SHSs that generally require high run time on large-scale networks. Another challenge is discovering SHSs across different types of networks for which the traditional one-model-fit-all approach fails to capture the… ▽ More A Structural Hole Spanner (SHS) is a set of nodes in a network that act as a bridge among different otherwise disconnected communities. Numerous solutions have been proposed to discover SHSs that generally require high run time on large-scale networks. Another challenge is discovering SHSs across different types of networks for which the traditional one-model-fit-all approach fails to capture the inter-graph difference, particularly in the case of diverse networks. Therefore, there is an urgent need of develo** effective solutions for discovering SHSs in large-scale and diverse networks. Inspired by the recent advancement of graph neural network approaches on various graph problems, we propose graph neural network-based models to discover SHS nodes in large scale networks and diverse networks. We transform the problem into a learning problem and propose an efficient model GraphSHS, that exploits both the network structure and node features to discover SHS nodes in large scale networks, endeavouring to lessen the computational cost while maintaining high accuracy. To effectively discover SHSs across diverse networks, we propose another model Meta-GraphSHS based on meta-learning that learns generalizable knowledge from diverse training graphs (instead of directly learning the model) and utilizes the learned knowledge to create a customized model to identify SHSs in each new graph. We theoretically show that the depth of the proposed graph neural network model should be at least $Ω(\sqrt{n}/\log n)$ to accurately calculate the SHSs discovery problem. We evaluate the performance of the proposed models through extensive experiments on synthetic and real-world datasets. Our experimental results show that GraphSHS discovers SHSs with high accuracy and is at least 167.1 times faster than the comparative methods on large-scale real-world datasets. △ Less

Submitted 23 February, 2023; originally announced February 2023.

arXiv:2212.08239 [pdf, other]

Discovering Structural Hole Spanners in Dynamic Networks via Graph Neural Networks

Authors: Diksha Goel, Hong Shen, Hui Tian, Mingyu Guo

Abstract: Structural Hole (SH) theory states that the node which acts as a connecting link among otherwise disconnected communities gets positional advantages in the network. These nodes are called Structural Hole Spanners (SHS). SHSs have many applications, including viral marketing, information dissemination, community detection, etc. Numerous solutions are proposed to discover SHSs; however, most of the… ▽ More Structural Hole (SH) theory states that the node which acts as a connecting link among otherwise disconnected communities gets positional advantages in the network. These nodes are called Structural Hole Spanners (SHS). SHSs have many applications, including viral marketing, information dissemination, community detection, etc. Numerous solutions are proposed to discover SHSs; however, most of the solutions are only applicable to static networks. Since real-world networks are dynamic networks; consequently, in this study, we aim to discover SHSs in dynamic networks. Discovering SHSs is an NP-hard problem, due to which, instead of discovering exact k SHSs, we adopt a greedy approach to discover top-k SHSs. Motivated from the success of Graph Neural Networks (GNNs) on various graph mining problems, we design a Graph Neural Network-based model, GNN-SHS, to discover SHSs in dynamic networks, aiming to reduce the computational cost while achieving high accuracy. We analyze the efficiency of the proposed model through exhaustive experiments, and our results show that the proposed GNN-SHS model is at least 31.8 times faster and, on an average 671.6 times faster than the comparative method, providing a considerable efficiency advantage. △ Less

Submitted 15 December, 2022; originally announced December 2022.

arXiv:2210.09962 [pdf, other]

Nighttime Dehaze-Enhancement

Authors: Harshan Baskar, Anirudh S Chakravarthy, Prateek Garg, Divyam Goel, Abhijith S Raj, Kshitij Kumar, Lakshya, Ravichandra Parvatham, V Sushant, Bijay Kumar Rout

Abstract: In this paper, we introduce a new computer vision task called nighttime dehaze-enhancement. This task aims to jointly perform dehazing and lightness enhancement. Our task fundamentally differs from nighttime dehazing -- our goal is to jointly dehaze and enhance scenes, while nighttime dehazing aims to dehaze scenes under a nighttime setting. In order to facilitate further research on this task, we… ▽ More In this paper, we introduce a new computer vision task called nighttime dehaze-enhancement. This task aims to jointly perform dehazing and lightness enhancement. Our task fundamentally differs from nighttime dehazing -- our goal is to jointly dehaze and enhance scenes, while nighttime dehazing aims to dehaze scenes under a nighttime setting. In order to facilitate further research on this task, we release a new benchmark dataset called Reside-$β$ Night dataset, consisting of 4122 nighttime hazed images from 2061 scenes and 2061 ground truth images. Moreover, we also propose a new network called NDENet (Nighttime Dehaze-Enhancement Network), which jointly performs dehazing and low-light enhancement in an end-to-end manner. We evaluate our method on the proposed benchmark and achieve SSIM of 0.8962 and PSNR of 26.25. We also compare our network with other baseline networks on our benchmark to demonstrate the effectiveness of our approach. We believe that nighttime dehaze-enhancement is an essential task particularly for autonomous navigation applications, and hope that our work will open up new frontiers in research. Our dataset and code will be made publicly available upon acceptance of our paper. △ Less

Submitted 18 October, 2022; originally announced October 2022.

arXiv:2209.01346 [pdf, other]

HammingMesh: A Network Topology for Large-Scale Deep Learning

Authors: Torsten Hoefler, Tommaso Bonato, Daniele De Sensi, Salvatore Di Girolamo, Shigang Li, Marco Heddes, Jon Belk, Deepak Goel, Miguel Castro, Steve Scott

Abstract: Numerous microarchitectural optimizations unlocked tremendous processing power for deep neural networks that in turn fueled the AI revolution. With the exhaustion of such optimizations, the growth of modern AI is now gated by the performance of training systems, especially their data movement. Instead of focusing on single accelerators, we investigate data-movement characteristics of large-scale t… ▽ More Numerous microarchitectural optimizations unlocked tremendous processing power for deep neural networks that in turn fueled the AI revolution. With the exhaustion of such optimizations, the growth of modern AI is now gated by the performance of training systems, especially their data movement. Instead of focusing on single accelerators, we investigate data-movement characteristics of large-scale training at full system scale. Based on our workload analysis, we design HammingMesh, a novel network topology that provides high bandwidth at low cost with high job scheduling flexibility. Specifically, HammingMesh can support full bandwidth and isolation to deep learning training jobs with two dimensions of parallelism. Furthermore, it also supports high global bandwidth for generic traffic. Thus, HammingMesh will power future large-scale deep learning systems with extreme bandwidth requirements. △ Less

Submitted 21 October, 2022; v1 submitted 3 September, 2022; originally announced September 2022.

Comments: published at ACM/IEEE Supercomputing (SC22)

arXiv:2205.13164 [pdf, other]

Leveraging Dependency Grammar for Fine-Grained Offensive Language Detection using Graph Convolutional Networks

Authors: Divyam Goel, Raksha Sharma

Abstract: The last few years have witnessed an exponential rise in the propagation of offensive text on social media. Identification of this text with high precision is crucial for the well-being of society. Most of the existing approaches tend to give high toxicity scores to innocuous statements (e.g., "I am a gay man"). These false positives result from over-generalization on the training data where speci… ▽ More The last few years have witnessed an exponential rise in the propagation of offensive text on social media. Identification of this text with high precision is crucial for the well-being of society. Most of the existing approaches tend to give high toxicity scores to innocuous statements (e.g., "I am a gay man"). These false positives result from over-generalization on the training data where specific terms in the statement may have been used in a pejorative sense (e.g., "gay"). Emphasis on such words alone can lead to discrimination against the classes these systems are designed to protect. In this paper, we address the problem of offensive language detection on Twitter, while also detecting the type and the target of the offence. We propose a novel approach called SyLSTM, which integrates syntactic features in the form of the dependency parse tree of a sentence and semantic features in the form of word embeddings into a deep learning architecture using a Graph Convolutional Network. Results show that the proposed approach significantly outperforms the state-of-the-art BERT model with orders of magnitude fewer number of parameters. △ Less

Submitted 26 May, 2022; originally announced May 2022.

Comments: 10 pages, 2 figures, accepted as a long paper at The 10th International Workshop on Natural Language Processing for Social Media (SocialNLP @ NAACL 2022)

arXiv:2204.08653 [pdf, other]

On The Cross-Modal Transfer from Natural Language to Code through Adapter Modules

Authors: Divyam Goel, Ramansh Grover, Fatemeh H. Fard

Abstract: Pre-trained neural Language Models (PTLM), such as CodeBERT, are recently used in software engineering as models pre-trained on large source code corpora. Their knowledge is transferred to downstream tasks (e.g. code clone detection) via fine-tuning. In natural language processing (NLP), other alternatives for transferring the knowledge of PTLMs are explored through using adapters, compact, parame… ▽ More Pre-trained neural Language Models (PTLM), such as CodeBERT, are recently used in software engineering as models pre-trained on large source code corpora. Their knowledge is transferred to downstream tasks (e.g. code clone detection) via fine-tuning. In natural language processing (NLP), other alternatives for transferring the knowledge of PTLMs are explored through using adapters, compact, parameter efficient modules inserted in the layers of the PTLM. Although adapters are known to facilitate adapting to many downstream tasks compared to fine-tuning the model that require retraining all of the models' parameters -- which owes to the adapters' plug and play nature and being parameter efficient -- their usage in software engineering is not explored. Here, we explore the knowledge transfer using adapters and based on the Naturalness Hypothesis proposed by Hindle et. al \cite{hindle2016naturalness}. Thus, studying the bimodality of adapters for two tasks of cloze test and code clone detection, compared to their benchmarks from the CodeXGLUE platform. These adapters are trained using programming languages and are inserted in a PTLM that is pre-trained on English corpora (N-PTLM). Three programming languages, C/C++, Python, and Java, are studied along with extensive experiments on the best setup used for adapters. Improving the results of the N-PTLM confirms the success of the adapters in knowledge transfer to software engineering, which sometimes are in par with or exceed the results of a PTLM trained on source code; while being more efficient in terms of the number of parameters, memory usage, and inference time. Our results can open new directions to build smaller models for more software engineering tasks. We open source all the scripts and the trained adapters. △ Less

Submitted 19 April, 2022; originally announced April 2022.

Comments: 11 pages, 6 figures, ICPC 2022. 30th International Conference on Program Comprehension (ICPC '22), May 16--17, 2022, Virtual Event, USA}

arXiv:2204.03397 [pdf, other]

doi 10.1145/3512290.3528729

Defending Active Directory by Combining Neural Network based Dynamic Program and Evolutionary Diversity Optimisation

Authors: Diksha Goel, Max Ward, Aneta Neumann, Frank Neumann, Hung Nguyen, Mingyu Guo

Abstract: Active Directory (AD) is the default security management system for Windows domain networks. We study a Stackelberg game model between one attacker and one defender on an AD attack graph. The attacker initially has access to a set of entry nodes. The attacker can expand this set by strategically exploring edges. Every edge has a detection rate and a failure rate. The attacker aims to maximize thei… ▽ More Active Directory (AD) is the default security management system for Windows domain networks. We study a Stackelberg game model between one attacker and one defender on an AD attack graph. The attacker initially has access to a set of entry nodes. The attacker can expand this set by strategically exploring edges. Every edge has a detection rate and a failure rate. The attacker aims to maximize their chance of successfully reaching the destination before getting detected. The defender's task is to block a constant number of edges to decrease the attacker's chance of success. We show that the problem is #P-hard and, therefore, intractable to solve exactly. We convert the attacker's problem to an exponential sized Dynamic Program that is approximated by a Neural Network (NN). Once trained, the NN provides an efficient fitness function for the defender's Evolutionary Diversity Optimisation (EDO). The diversity emphasis on the defender's solution provides a diverse set of training samples, which improves the training accuracy of our NN for modelling the attacker. We go back and forth between NN training and EDO. Experimental results show that for R500 graph, our proposed EDO based defense is less than 1% away from the optimal defense. △ Less

Submitted 4 January, 2023; v1 submitted 7 April, 2022; originally announced April 2022.

Comments: Added reference [12] on page 3 and 4. Corrected spelling EVC to VEC on page 10

Journal ref: Proceedings of the Genetic and Evolutionary Computation Conference, 2022, Pages 1191 to 1199

Showing 1–14 of 14 results for author: Goel, D