Search | arXiv e-print repository

Towards Automated Generation of Smart Grid Cyber Range for Cybersecurity Experiments and Training

Authors: Daisuke Mashima, Muhammad M. Roomi, Bennet Ng, Zbigniew Kalbarczyk, S. M. Suhail Hussain, Ee-chien Chang

Abstract: Assurance of cybersecurity is crucial to ensure dependability and resilience of smart power grid systems. In order to evaluate the impact of potential cyber attacks, to assess deployability and effectiveness of cybersecurity measures, and to enable hands-on exercise and training of personals, an interactive, virtual environment that emulates the behaviour of a smart grid system, namely smart grid… ▽ More Assurance of cybersecurity is crucial to ensure dependability and resilience of smart power grid systems. In order to evaluate the impact of potential cyber attacks, to assess deployability and effectiveness of cybersecurity measures, and to enable hands-on exercise and training of personals, an interactive, virtual environment that emulates the behaviour of a smart grid system, namely smart grid cyber range, has been demanded by industry players as well as academia. A smart grid cyber range is typically implemented as a combination of cyber system emulation, which allows interactivity, and physical system (i.e., power grid) simulation that are tightly coupled for consistent cyber and physical behaviours. However, its design and implementation require intensive expertise and efforts in cyber and physical aspects of smart power systems as well as software/system engineering. While many industry players, including power grid operators, device vendors, research and education sectors are interested, availability of the smart grid cyber range is limited to a small number of research labs. To address this challenge, we have developed a framework for modelling a smart grid cyber range using an XML-based language, called SG-ML, and for "compiling" the model into an operational cyber range with minimal engineering efforts. The modelling language includes standardized schema from IEC 61850 and IEC 61131, which allows industry players to utilize their existing configurations. The SG-ML framework aims at making a smart grid cyber range available to broader user bases to facilitate cybersecurity R\&D and hands-on exercises. △ Less

Submitted 31 March, 2024; originally announced April 2024.

Comments: Published at DSN 2023 Industry Track

arXiv:2402.04538 [pdf, other]

Triplet Interaction Improves Graph Transformers: Accurate Molecular Graph Learning with Triplet Graph Transformers

Authors: Md Shamim Hussain, Mohammed J. Zaki, Dharmashankar Subramanian

Abstract: Graph transformers typically lack third-order interactions, limiting their geometric understanding which is crucial for tasks like molecular geometry prediction. We propose the Triplet Graph Transformer (TGT) that enables direct communication between pairs within a 3-tuple of nodes via novel triplet attention and aggregation mechanisms. TGT is applied to molecular property prediction by first pred… ▽ More Graph transformers typically lack third-order interactions, limiting their geometric understanding which is crucial for tasks like molecular geometry prediction. We propose the Triplet Graph Transformer (TGT) that enables direct communication between pairs within a 3-tuple of nodes via novel triplet attention and aggregation mechanisms. TGT is applied to molecular property prediction by first predicting interatomic distances from 2D graphs and then using these distances for downstream tasks. A novel three-stage training procedure and stochastic inference further improve training efficiency and model performance. Our model achieves new state-of-the-art (SOTA) results on open challenge benchmarks PCQM4Mv2 and OC20 IS2RE. We also obtain SOTA results on QM9, MOLPCBA, and LIT-PCBA molecular property prediction benchmarks via transfer learning. We also demonstrate the generality of TGT with SOTA results on the traveling salesman problem (TSP). △ Less

Submitted 9 June, 2024; v1 submitted 6 February, 2024; originally announced February 2024.

Comments: ICML'24 Accepted Version, 25 pages, 10 figures, 18 tables

arXiv:2306.01705 [pdf, other]

doi 10.1145/3580305.3599520

The Information Pathways Hypothesis: Transformers are Dynamic Self-Ensembles

Authors: Md Shamim Hussain, Mohammed J. Zaki, Dharmashankar Subramanian

Abstract: Transformers use the dense self-attention mechanism which gives a lot of flexibility for long-range connectivity. Over multiple layers of a deep transformer, the number of possible connectivity patterns increases exponentially. However, very few of these contribute to the performance of the network, and even fewer are essential. We hypothesize that there are sparsely connected sub-networks within… ▽ More Transformers use the dense self-attention mechanism which gives a lot of flexibility for long-range connectivity. Over multiple layers of a deep transformer, the number of possible connectivity patterns increases exponentially. However, very few of these contribute to the performance of the network, and even fewer are essential. We hypothesize that there are sparsely connected sub-networks within a transformer, called information pathways which can be trained independently. However, the dynamic (i.e., input-dependent) nature of these pathways makes it difficult to prune dense self-attention during training. But the overall distribution of these pathways is often predictable. We take advantage of this fact to propose Stochastically Subsampled self-Attention (SSA) - a general-purpose training strategy for transformers that can reduce both the memory and computational cost of self-attention by 4 to 8 times during training while also serving as a regularization method - improving generalization over dense training. We show that an ensemble of sub-models can be formed from the subsampled pathways within a network, which can achieve better performance than its densely attended counterpart. We perform experiments on a variety of NLP, computer vision and graph learning tasks in both generative and discriminative settings to provide empirical evidence for our claims and show the effectiveness of the proposed method. △ Less

Submitted 2 June, 2023; originally announced June 2023.

Comments: KDD23 preprint, 12 pages, 7 figures, 10 tables

arXiv:2108.03348 [pdf, other]

doi 10.1145/3534678.3539296

Global Self-Attention as a Replacement for Graph Convolution

Authors: Md Shamim Hussain, Mohammed J. Zaki, Dharmashankar Subramanian

Abstract: We propose an extension to the transformer neural network architecture for general-purpose graph learning by adding a dedicated pathway for pairwise structural information, called edge channels. The resultant framework - which we call Edge-augmented Graph Transformer (EGT) - can directly accept, process and output structural information of arbitrary form, which is important for effective learning… ▽ More We propose an extension to the transformer neural network architecture for general-purpose graph learning by adding a dedicated pathway for pairwise structural information, called edge channels. The resultant framework - which we call Edge-augmented Graph Transformer (EGT) - can directly accept, process and output structural information of arbitrary form, which is important for effective learning on graph-structured data. Our model exclusively uses global self-attention as an aggregation mechanism rather than static localized convolutional aggregation. This allows for unconstrained long-range dynamic interactions between nodes. Moreover, the edge channels allow the structural information to evolve from layer to layer, and prediction tasks on edges/links can be performed directly from the output embeddings of these channels. We verify the performance of EGT in a wide range of graph-learning experiments on benchmark datasets, in which it outperforms Convolutional/Message-Passing Graph Neural Networks. EGT sets a new state-of-the-art for the quantum-chemical regression task on the OGB-LSC PCQM4Mv2 dataset containing 3.8 million molecular graphs. Our findings indicate that global self-attention based aggregation can serve as a flexible, adaptive and effective replacement of graph convolution for general-purpose graph learning. Therefore, convolutional local neighborhood aggregation is not an essential inductive bias. △ Less

Submitted 3 June, 2022; v1 submitted 6 August, 2021; originally announced August 2021.

Comments: The accepted version in KDD '22

arXiv:2003.12568 [pdf]

An Improved Physics Based Numerical Model of Tunnel FET Using 2D NEGF Formalism

Authors: Md Shamim Hussain

Abstract: In this work, we have investigated a 2D model of band-to-band tunneling based on 2-band model and implemented it using 2D NEGF formalism. Being 2D in nature, this model better addresses the variation in the directionality of the tunneling process occurring in most practical TFET device structures. It also works as a compromise between semi-classical and multiband quantum simulation of TFETs. In th… ▽ More In this work, we have investigated a 2D model of band-to-band tunneling based on 2-band model and implemented it using 2D NEGF formalism. Being 2D in nature, this model better addresses the variation in the directionality of the tunneling process occurring in most practical TFET device structures. It also works as a compromise between semi-classical and multiband quantum simulation of TFETs. In this work, we have presented a sound step by step mathematical development of the numerical model. We have also discussed how this model can be implemented in simulators and pointed out a few optimizations that can be made to reduce complexity and to save time. Finally, we have performed elaborate simulations for a practical TFET design and compared the results with commercially available TCAD simulations, to point out the limitations of the simplistic models that are frequently used, and how our model overcomes these limitations. △ Less

Submitted 24 March, 2020; originally announced March 2020.

arXiv:1812.00149 [pdf, other]

SwishNet: A Fast Convolutional Neural Network for Speech, Music and Noise Classification and Segmentation

Authors: Md. Shamim Hussain, Mohammad Ariful Haque

Abstract: Speech, Music and Noise classification/segmentation is an important preprocessing step for audio processing/indexing. To this end, we propose a novel 1D Convolutional Neural Network (CNN) - SwishNet. It is a fast and lightweight architecture that operates on MFCC features which is suitable to be added to the front-end of an audio processing pipeline. We showed that the performance of our network c… ▽ More Speech, Music and Noise classification/segmentation is an important preprocessing step for audio processing/indexing. To this end, we propose a novel 1D Convolutional Neural Network (CNN) - SwishNet. It is a fast and lightweight architecture that operates on MFCC features which is suitable to be added to the front-end of an audio processing pipeline. We showed that the performance of our network can be improved by distilling knowledge from a 2D CNN, pretrained on ImageNet. We investigated the performance of our network on the MUSAN corpus - an openly available comprehensive collection of noise, music and speech samples, suitable for deep learning. The proposed network achieved high overall accuracy in clip (length of 0.5-2s) classification (>97% accuracy) and frame-wise segmentation (>93% accuracy) tasks with even higher accuracy (>99%) in speech/non-speech discrimination task. To verify the robustness of our model, we trained it on MUSAN and evaluated it on a different corpus - GTZAN and found good accuracy with very little fine-tuning. We also demonstrated that our model is fast on both CPU and GPU, consumes a low amount of memory and is suitable for implementation in embedded systems. △ Less

Submitted 1 December, 2018; originally announced December 2018.

Comments: 7 pages, 3 figures, 6 tables

Showing 1–6 of 6 results for author: Hussain, M S