-
VTR: An Optimized Vision Transformer for SAR ATR Acceleration on FPGA
Authors:
Sachini Wickramasinghe,
Dhruv Parikh,
Bingyi Zhang,
Rajgopal Kannan,
Viktor Prasanna,
Carl Busart
Abstract:
Synthetic Aperture Radar (SAR) Automatic Target Recognition (ATR) is a key technique used in military applications like remote-sensing image recognition. Vision Transformers (ViTs) are the current state-of-the-art in various computer vision applications, outperforming their CNN counterparts. However, using ViTs for SAR ATR applications is challenging due to (1) standard ViTs require extensive trai…
▽ More
Synthetic Aperture Radar (SAR) Automatic Target Recognition (ATR) is a key technique used in military applications like remote-sensing image recognition. Vision Transformers (ViTs) are the current state-of-the-art in various computer vision applications, outperforming their CNN counterparts. However, using ViTs for SAR ATR applications is challenging due to (1) standard ViTs require extensive training data to generalize well due to their low locality; the standard SAR datasets, however, have a limited number of labeled training data which reduces the learning capability of ViTs; (2) ViTs have a high parameter count and are computation intensive which makes their deployment on resource-constrained SAR platforms difficult. In this work, we develop a lightweight ViT model that can be trained directly on small datasets without any pre-training by utilizing the Shifted Patch Tokenization (SPT) and Locality Self-Attention (LSA) modules. We directly train this model on SAR datasets which have limited training samples to evaluate its effectiveness for SAR ATR applications. We evaluate our proposed model, that we call VTR (ViT for SAR ATR), on three widely used SAR datasets: MSTAR, SynthWakeSAR, and GBSAR. Further, we propose a novel FPGA accelerator for VTR, in order to enable deployment for real-time SAR ATR applications.
△ Less
Submitted 6 April, 2024;
originally announced April 2024.
-
A Single Graph Convolution Is All You Need: Efficient Grayscale Image Classification
Authors:
Jacob Fein-Ashley,
Sachini Wickramasinghe,
Bingyi Zhang,
Rajgopal Kannan,
Viktor Prasanna
Abstract:
Image classifiers for domain-specific tasks like Synthetic Aperture Radar Automatic Target Recognition (SAR ATR) and chest X-ray classification often rely on convolutional neural networks (CNNs). These networks, while powerful, experience high latency due to the number of operations they perform, which can be problematic in real-time applications. Many image classification models are designed to w…
▽ More
Image classifiers for domain-specific tasks like Synthetic Aperture Radar Automatic Target Recognition (SAR ATR) and chest X-ray classification often rely on convolutional neural networks (CNNs). These networks, while powerful, experience high latency due to the number of operations they perform, which can be problematic in real-time applications. Many image classification models are designed to work with both RGB and grayscale datasets, but classifiers that operate solely on grayscale images are less common. Grayscale image classification has critical applications in fields such as medical imaging and SAR ATR. In response, we present a novel grayscale image classification approach using a vectorized view of images. By leveraging the lightweight nature of Multi-Layer Perceptrons (MLPs), we treat images as vectors, simplifying the problem to grayscale image classification. Our approach incorporates a single graph convolutional layer in a batch-wise manner, enhancing accuracy and reducing performance variance. Additionally, we develop a customized accelerator on FPGA for our model, incorporating several optimizations to improve performance. Experimental results on benchmark grayscale image datasets demonstrate the effectiveness of our approach, achieving significantly lower latency (up to $16\times$ less on MSTAR) and competitive or superior performance compared to state-of-the-art models for SAR ATR and medical image classification.
△ Less
Submitted 26 June, 2024; v1 submitted 1 February, 2024;
originally announced February 2024.
-
RX-ADS: Interpretable Anomaly Detection using Adversarial ML for Electric Vehicle CAN data
Authors:
Chathurika S. Wickramasinghe,
Daniel L. Marino,
Harindra S. Mavikumbure,
Victor Cobilean,
Timothy D. Pennington,
Benny J. Varghese,
Craig Rieger,
Milos Manic
Abstract:
Recent year has brought considerable advancements in Electric Vehicles (EVs) and associated infrastructures/communications. Intrusion Detection Systems (IDS) are widely deployed for anomaly detection in such critical infrastructures. This paper presents an Interpretable Anomaly Detection System (RX-ADS) for intrusion detection in CAN protocol communication in EVs. Contributions include: 1) window…
▽ More
Recent year has brought considerable advancements in Electric Vehicles (EVs) and associated infrastructures/communications. Intrusion Detection Systems (IDS) are widely deployed for anomaly detection in such critical infrastructures. This paper presents an Interpretable Anomaly Detection System (RX-ADS) for intrusion detection in CAN protocol communication in EVs. Contributions include: 1) window based feature extraction method; 2) deep Autoencoder based anomaly detection method; and 3) adversarial machine learning based explanation generation methodology. The presented approach was tested on two benchmark CAN datasets: OTIDS and Car Hacking. The anomaly detection performance of RX-ADS was compared against the state-of-the-art approaches on these datasets: HIDS and GIDS. The RX-ADS approach presented performance comparable to the HIDS approach (OTIDS dataset) and has outperformed HIDS and GIDS approaches (Car Hacking dataset). Further, the proposed approach was able to generate explanations for detected abnormal behaviors arising from various intrusions. These explanations were later validated by information used by domain experts to detect anomalies. Other advantages of RX-ADS include: 1) the method can be trained on unlabeled data; 2) explanations help experts in understanding anomalies and root course analysis, and also help with AI model debugging and diagnostics, ultimately improving user trust in AI systems.
△ Less
Submitted 5 September, 2022;
originally announced September 2022.
-
Self-Supervised and Interpretable Anomaly Detection using Network Transformers
Authors:
Daniel L. Marino,
Chathurika S. Wickramasinghe,
Craig Rieger,
Milos Manic
Abstract:
Monitoring traffic in computer networks is one of the core approaches for defending critical infrastructure against cyber attacks. Machine Learning (ML) and Deep Neural Networks (DNNs) have been proposed in the past as a tool to identify anomalies in computer networks. Although detecting these anomalies provides an indication of an attack, just detecting an anomaly is not enough information for a…
▽ More
Monitoring traffic in computer networks is one of the core approaches for defending critical infrastructure against cyber attacks. Machine Learning (ML) and Deep Neural Networks (DNNs) have been proposed in the past as a tool to identify anomalies in computer networks. Although detecting these anomalies provides an indication of an attack, just detecting an anomaly is not enough information for a user to understand the anomaly. The black-box nature of off-the-shelf ML models prevents extracting important information that is fundamental to isolate the source of the fault/attack and take corrective measures. In this paper, we introduce the Network Transformer (NeT), a DNN model for anomaly detection that incorporates the graph structure of the communication network in order to improve interpretability. The presented approach has the following advantages: 1) enhanced interpretability by incorporating the graph structure of computer networks; 2) provides a hierarchical set of features that enables analysis at different levels of granularity; 3) self-supervised training that does not require labeled data. The presented approach was tested by evaluating the successful detection of anomalies in an Industrial Control System (ICS). The presented approach successfully identified anomalies, the devices affected, and the specific connections causing the anomalies, providing a data-driven hierarchical approach to analyze the behavior of a cyber network.
△ Less
Submitted 25 February, 2022;
originally announced February 2022.
-
Ordinal UNLOC: Target Localization with Noisy and Incomplete Distance Measures
Authors:
Mahesh K. Banavar,
Shandeepa Wickramasinghe,
Monalisa Achalla,
Jie Sun
Abstract:
A main challenge in target localization arises from the lack of reliable distance measures. This issue is especially pronounced in indoor settings due to the presence of walls, floors, furniture, and other dynamically changing conditions such as the movement of people and goods, varying temperature, and airflows. Here, we develop a new computational framework to estimate the location of a target w…
▽ More
A main challenge in target localization arises from the lack of reliable distance measures. This issue is especially pronounced in indoor settings due to the presence of walls, floors, furniture, and other dynamically changing conditions such as the movement of people and goods, varying temperature, and airflows. Here, we develop a new computational framework to estimate the location of a target without the need for reliable distance measures. The method, which we term Ordinal UNLOC, uses only ordinal data obtained from comparing the signal strength from anchor pairs at known locations to the target. Our estimation technique utilizes rank aggregation, function learning as well as proximity-based unfolding optimization. As a result, it yields accurate target localization for common transmission models with unknown parameters and noisy observations that are reminiscent of practical settings. Our results are validated by both numerical simulations and hardware experiments.
△ Less
Submitted 6 May, 2021;
originally announced May 2021.
-
An Adversarial Approach for Explainable AI in Intrusion Detection Systems
Authors:
Daniel L. Marino,
Chathurika S. Wickramasinghe,
Milos Manic
Abstract:
Despite the growing popularity of modern machine learning techniques (e.g. Deep Neural Networks) in cyber-security applications, most of these models are perceived as a black-box for the user. Adversarial machine learning offers an approach to increase our understanding of these models. In this paper we present an approach to generate explanations for incorrect classifications made by data-driven…
▽ More
Despite the growing popularity of modern machine learning techniques (e.g. Deep Neural Networks) in cyber-security applications, most of these models are perceived as a black-box for the user. Adversarial machine learning offers an approach to increase our understanding of these models. In this paper we present an approach to generate explanations for incorrect classifications made by data-driven Intrusion Detection Systems (IDSs). An adversarial approach is used to find the minimum modifications (of the input features) required to correctly classify a given set of misclassified samples. The magnitude of such modifications is used to visualize the most relevant features that explain the reason for the misclassification. The presented methodology generated satisfactory explanations that describe the reasoning behind the mis-classifications, with descriptions that match expert knowledge. The advantages of the presented methodology are: 1) applicable to any classifier with defined gradients. 2) does not require any modification of the classifier model. 3) can be extended to perform further diagnosis (e.g. vulnerability assessment) and gain further understanding of the system. Experimental evaluation was conducted on the NSL-KDD99 benchmark dataset using Linear and Multilayer perceptron classifiers. The results are shown using intuitive visualizations in order to improve the interpretability of the results.
△ Less
Submitted 28 November, 2018;
originally announced November 2018.
-
Modeling spatial social complex networks for dynamical processes
Authors:
Shandeepa Wickramasinghe,
Onyekachukwu Onyerikwu,
Jie Sun,
Daniel ben-Avraham
Abstract:
The study of social networks --- where people are located, geographically, and how they might be connected to one another --- is a current hot topic of interest, because of its immediate relevance to important applications, from devising efficient immunization techniques for the arrest of epidemics, to the design of better transportation and city planning paradigms, to the understanding of how rum…
▽ More
The study of social networks --- where people are located, geographically, and how they might be connected to one another --- is a current hot topic of interest, because of its immediate relevance to important applications, from devising efficient immunization techniques for the arrest of epidemics, to the design of better transportation and city planning paradigms, to the understanding of how rumors and opinions spread and take shape over time. We develop a spatial social complex network (SSCN) model that captures not only essential connectivity features of real-life social networks, including a heavy-tailed degree distribution and high clustering, but also the spatial location of individuals, reproducing Zipf's law for the distribution of city populations as well as other observed hallmarks. We then simulate Milgram's Small-World experiment on our SSCN model, obtaining good qualitative agreement with the known results and shedding light on the role played by various network attributes and the strategies used by the players in the game. This demonstrates the potential of the SSCN model for the simulation and study of the many social processes mentioned above, where both connectivity and geography play a role in the dynamics.
△ Less
Submitted 19 May, 2017;
originally announced May 2017.
-
Heterogeneous processor pipeline for a product cipher application
Authors:
I. B. Nawinne,
M. S. Wickramasinghe,
R. G. Ragel,
S. Radhakrishnan
Abstract:
Processing data received as a stream is a task commonly performed by modern embedded devices, in a wide range of applications such as multimedia (encoding/decoding/ playing media), networking (switching and routing), digital security, scientific data processing, etc. Such processing normally tends to be calculation intensive and therefore requiring significant processing power. Therefore, hardware…
▽ More
Processing data received as a stream is a task commonly performed by modern embedded devices, in a wide range of applications such as multimedia (encoding/decoding/ playing media), networking (switching and routing), digital security, scientific data processing, etc. Such processing normally tends to be calculation intensive and therefore requiring significant processing power. Therefore, hardware acceleration methods to increase the performance of such applications constitute an important area of study. In this paper, we present an evaluation of one such method to process streaming data, namely multi-processor pipeline architecture. The hardware is based on a Multiple-Processor System on Chip (MPSoC), using a data encryption algorithm as a case study. The algorithm is partitioned on a coarse grained level and mapped on to an MPSoC with five processor cores in a pipeline, using specifically configured Xtensa LX3 cores. The system is then selectively optimized by strengthening and pruning the resources of each processor core. The optimized system is evaluated and compared against an optimal single-processor System on Chip (SoC) for the same application. The multiple-processor pipeline system for data encryption algorithms used was observed to provide significant speed ups, up to 4.45 times that of the single-processor system, which is close to the ideal speed up from a five-stage pipeline.
△ Less
Submitted 28 March, 2014;
originally announced March 2014.
-
A Simple Software Application for Simulating Commercially Available Solar Panels
Authors:
Nalika Ulapane,
Sunil Abeyratne,
Prabath Binduhewa,
Chamari Dhanapala,
Shyama Wickramasinghe,
Nimal Rathnayake
Abstract:
This article addresses the formulation and validation of a simple PC based software application developed for simulating commercially available solar panels. The important feature of this application is its capability to produce speedy results in the form of solar panel output characteristics at given environmental conditions by using minimal input data. Besides, it is able to deliver critical inf…
▽ More
This article addresses the formulation and validation of a simple PC based software application developed for simulating commercially available solar panels. The important feature of this application is its capability to produce speedy results in the form of solar panel output characteristics at given environmental conditions by using minimal input data. Besides, it is able to deliver critical information about the maximum power point of the panel at a given environmental condition in quick succession. The application is based on a standard equation which governs solar panels and works by means of estimating unknown parameters in the equation to fit a given solar panel. The process of parameter estimation is described in detail with the aid of equations and data of a commercial solar panel. A validation of obtained results for commercial solar panels is also presented by comparing the panel manufacturers' results with the results generated by the application. In addition, implications of the obtained results are discussed along with possible improvements to the developed software application.
△ Less
Submitted 20 January, 2014;
originally announced January 2014.