Search | arXiv e-print repository

Embedded Deployment of Semantic Segmentation in Medicine through Low-Resolution Inputs

Authors: Erik Ostrowski, Muhammad Shafique

Abstract: When deploying neural networks in real-life situations, the size and computational effort are often the limiting factors. This is especially true in environments where big, expensive hardware is not affordable, like in embedded medical devices, where budgets are often tight. State-of-the-art works have proposed multiple lightweight solutions for such use cases, mostly by changing the base model architecture, without taking the input and output resolution into consideration. In this paper, we propose an architecture that takes advantage of the fact that in hardware-limited environments, we often refrain from using the highest available input resolutions to guarantee a higher throughput. Although using lower-resolution input leads to a significant reduction in computing and memory requirements, it may also incur reduced prediction quality. Our architecture addresses this problem by exploiting the fact that we can still utilize high-resolution ground truths in training. The proposed model takes lower-resolution images as input and is trained against high-resolution ground truths, which can improve the prediction quality by 5.5% while adding fewer than 200 parameters to the model. We conduct an extensive analysis to illustrate that our architecture enhances existing state-of-the-art frameworks for lightweight semantic segmentation of cancer in MRI images. We also tested the deployment speed of state-of-the-art lightweight networks and our architecture on Nvidia's Jetson Nano to emulate deployment in resource-constrained embedded scenarios.

Submitted 8 March, 2024; originally announced March 2024.

arXiv:2311.12082 [pdf, other]

Tiny-VBF: Resource-Efficient Vision Transformer based Lightweight Beamformer for Ultrasound Single-Angle Plane Wave Imaging

Authors: Abdul Rahoof, Vivek Chaturvedi, Mahesh Raveendranatha Panicker, Muhammad Shafique

Abstract: Accelerating compute-intensive non-real-time beamforming algorithms in ultrasound imaging using deep learning architectures has been gaining momentum in the recent past. Nonetheless, the complexity of the state-of-the-art deep learning techniques poses challenges for deployment on resource-constrained edge devices. In this work, we propose a novel vision transformer based tiny beamformer (Tiny-VBF), which works on the raw radio-frequency channel data acquired through single-angle plane wave insonification. The output of our Tiny-VBF provides fast envelope detection at a very low computational cost, i.e., 0.34 GOPs/frame for a frame size of 368 x 128, in comparison to the state-of-the-art deep learning models. It also exhibited an 8% increase in contrast and gains of 5% and 33% in axial and lateral resolution, respectively, when compared to Tiny-CNN on an in-vitro dataset. Additionally, our model showed a 4.2% increase in contrast and gains of 4% and 20% in axial and lateral resolution, respectively, when compared against the conventional Delay-and-Sum (DAS) beamformer. We further propose an accelerator architecture and implement our Tiny-VBF model on a Zynq UltraScale+ MPSoC ZCU104 FPGA using a hybrid quantization scheme with 50% less resource consumption compared to the floating-point implementation, while preserving the image quality.

Submitted 16 January, 2024; v1 submitted 20 November, 2023; originally announced November 2023.

Comments: 6 pages, DATE 2024

arXiv:2304.12890 [pdf, other]

MRI Recovery with Self-Calibrated Denoisers without Fully-Sampled Data

Authors: Sizhuo Liu, Muhammad Shafique, Philip Schniter, Rizwan Ahmad

Abstract: Objective: Acquiring fully sampled training data is challenging for many MRI applications. We present a self-supervised image reconstruction method, termed ReSiDe, capable of recovering images solely from undersampled data. Materials and Methods: ReSiDe is inspired by plug-and-play (PnP) methods, but unlike traditional PnP approaches that utilize pre-trained denoisers, ReSiDe iteratively trains the denoiser on the image or images that are being reconstructed. We introduce two variations of our method: ReSiDe-S and ReSiDe-M. ReSiDe-S is scan-specific and works with a single set of undersampled measurements, while ReSiDe-M operates on multiple sets of undersampled measurements and provides faster inference. Studies I, II, and III compare ReSiDe-S and ReSiDe-M against other self-supervised or unsupervised methods using data from T1- and T2-weighted brain MRI, MRXCAT digital perfusion phantom, and first-pass cardiac perfusion, respectively. Results: ReSiDe-S and ReSiDe-M outperform other methods in terms of reconstruction signal-to-noise ratio and structural similarity index measure for Studies I and II, and in terms of expert scoring for Study III. Discussion: We present a self-supervised image reconstruction method and validate it in both static and dynamic MRI applications. These developments can benefit MRI applications where the availability of fully sampled training data is limited.

Submitted 30 May, 2024; v1 submitted 25 April, 2023; originally announced April 2023.

arXiv:2303.07852 [pdf, other]

doi 10.1109/ACCESS.2023.3284315

FPUS23: An Ultrasound Fetus Phantom Dataset with Deep Neural Network Evaluations for Fetus Orientations, Fetal Planes, and Anatomical Features

Authors: Bharath Srinivas Prabakaran, Paul Hamelmann, Erik Ostrowski, Muhammad Shafique

Abstract: Ultrasound imaging is one of the most prominent technologies to evaluate the growth, progression, and overall health of a fetus during its gestation. However, the interpretation of the data obtained from such studies is best left to expert physicians and technicians who are trained and well-versed in analyzing such images. To improve the clinical workflow and potentially develop an at-home ultrasound-based fetal monitoring platform, we present a novel fetus phantom ultrasound dataset, FPUS23, which can be used to identify (1) the correct diagnostic planes for estimating fetal biometric values, (2) fetus orientation, (3) their anatomical features, and (4) bounding boxes of the fetus phantom anatomies at 23 weeks gestation. The entire dataset is composed of 15,728 images, which are used to train four different Deep Neural Network models, built upon a ResNet34 backbone, for detecting the aforementioned fetus features and use cases. We have also evaluated the models trained using our FPUS23 dataset to show that the information learned by these models can be used to substantially increase the accuracy on real-world ultrasound fetus datasets. We make the FPUS23 dataset and the pre-trained models publicly accessible at https://github.com/bharathprabakaran/FPUS23, which will further facilitate future research on fetal ultrasound imaging and analysis.

Submitted 7 June, 2023; v1 submitted 14 March, 2023; originally announced March 2023.

Comments: Accepted for Publication at IEEE Access

arXiv:2109.02909 [pdf, other]

doi 10.1109/JIOT.2021.3065815

BioNetExplorer: Architecture-Space Exploration of Bio-Signal Processing Deep Neural Networks for Wearables

Authors: Bharath Srinivas Prabakaran, Asima Akhtar, Semeen Rehman, Osman Hasan, Muhammad Shafique

Abstract: In this work, we propose the BioNetExplorer framework to systematically generate and explore multiple DNN architectures for bio-signal processing in wearables. Our framework adapts key neural architecture parameters to search for an embedded DNN with a low hardware overhead, which can be deployed in wearable edge devices to analyse the bio-signal data and to extract the relevant information, such as arrhythmia and seizure. Our framework also enables hardware-aware DNN architecture search using genetic algorithms by imposing user requirements and hardware constraints (storage, FLOPs, etc.) during the exploration stage, thereby limiting the number of networks explored. Moreover, BioNetExplorer can also be used to search for DNNs based on the user-required output classes; for instance, a user might require a specific output class due to genetic predisposition or a pre-existing heart condition. The use of genetic algorithms reduces the exploration time, on average, by 9x, compared to exhaustive exploration. We are successful in identifying Pareto-optimal designs, which can reduce the storage overhead of the DNN by ~30MB for a quality loss of less than 0.5%. To enable low-cost embedded DNNs, BioNetExplorer also employs different model compression techniques to further reduce the storage overhead of the network by up to 53x for a quality loss of <0.2%.
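
As a rough illustration of what "Pareto-optimal designs" means in this abstract, the sketch below filters a handful of candidate networks by two objectives, storage and quality loss. The `Candidate` type, the candidate names, and all numbers are invented for illustration and are not results from the paper. The sketch covers only the dominance filter; it does not implement the genetic-algorithm search the framework uses.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    name: str            # hypothetical label, not from the paper
    storage_mb: float    # model size; lower is better
    quality_loss: float  # accuracy drop in percent; lower is better


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one."""
    no_worse = a.storage_mb <= b.storage_mb and a.quality_loss <= b.quality_loss
    better = a.storage_mb < b.storage_mb or a.quality_loss < b.quality_loss
    return no_worse and better


def pareto_front(cands: List[Candidate]) -> List[Candidate]:
    """Keep every candidate that no other candidate dominates."""
    return [c for c in cands if not any(dominates(o, c) for o in cands)]


# Made-up design points for illustration only.
CANDIDATES = [
    Candidate("net_a", 40.0, 0.0),
    Candidate("net_b", 10.0, 0.4),
    Candidate("net_c", 12.0, 0.6),  # larger and worse than net_b, so dominated
    Candidate("net_d", 2.0, 1.5),
]
FRONT = pareto_front(CANDIDATES)
```

Each design left in `FRONT` trades storage against quality in a way no other candidate improves on in both respects at once.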

Submitted 7 September, 2021; originally announced September 2021.

Journal ref: IEEE Internet of Things Journal (Volume: 8, Issue: 17, Sept. 1, 2021)

arXiv:2101.02559 [pdf, other]

doi 10.1109/MDAT.2020.2971217

Robust Machine Learning Systems: Challenges, Current Trends, Perspectives, and the Road Ahead

Authors: Muhammad Shafique, Mahum Naseer, Theocharis Theocharides, Christos Kyrkou, Onur Mutlu, Lois Orosa, Jungwook Choi

Abstract: Machine Learning (ML) techniques have been rapidly adopted by smart Cyber-Physical Systems (CPS) and Internet-of-Things (IoT) due to their powerful decision-making capabilities. However, they are vulnerable to various security and reliability threats, at both hardware and software levels, that compromise their accuracy. These threats get aggravated in emerging edge ML devices that have stringent constraints in terms of resources (e.g., compute, memory, power/energy), and that therefore cannot employ costly security and reliability measures. Security, reliability, and vulnerability mitigation techniques span from network security measures to hardware protection, with an increased interest towards formal verification of trained ML models. This paper summarizes the prominent vulnerabilities of modern ML systems, highlights successful defenses and mitigation techniques against these vulnerabilities, both at the cloud (i.e., during the ML training phase) and edge (i.e., during the ML inference stage), discusses the implications of a resource-constrained design on the reliability and security of the system, identifies verification methodologies to ensure correct system behavior, and describes open research challenges for building secure and reliable ML systems at both the edge and the cloud.

Submitted 4 January, 2021; originally announced January 2021.

Comments: Final version appears in https://ieeexplore.ieee.org/document/8979377

ACM Class: A.1; B.0; C.1; I.2; D.4.6

Journal ref: IEEE Design and Test (Volume: 37, Issue: 2, April 2020): 30-57

arXiv:2004.10491 [pdf, other]

doi 10.1109/DAC18072.2020.9218713

EMAP: A Cloud-Edge Hybrid Framework for EEG Monitoring and Cross-Correlation Based Real-time Anomaly Prediction

Authors: Bharath Srinivas Prabakaran, Alberto García Jiménez, Germán Moltó Martínez, Muhammad Shafique

Abstract: State-of-the-art techniques for detecting, or predicting, neurological disorders (1) focus on predicting each disorder individually, and (2) are computationally expensive, leading to a delay that can potentially render the prediction useless, especially in critical events. Towards this, we present a real-time two-tiered framework called EMAP, which cross-correlates the input with all the EEG signals in our mega-database (a combination of multiple EEG datasets) at the cloud, while tracking the signal in real-time at the edge, to predict the occurrence of a neurological anomaly. Using the proposed framework, we have demonstrated a prediction accuracy of up to 94% for the three different anomalies that we have tested.
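
To make the cross-correlation step more concrete, here is a minimal sketch of matching an input window against a small labeled signal database. It assumes a Pearson-style normalized correlation evaluated at sliding offsets; the abstract does not specify the exact measure, and the function names, labels, and toy signals below are hypothetical.

```python
import math
from typing import Dict, List, Tuple


def normalized_xcorr(x: List[float], y: List[float]) -> float:
    """Zero-lag normalized cross-correlation of two equal-length sequences."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("sequences must be non-empty and of equal length")
    mx, my = sum(x) / n, sum(y) / n
    dx = [v - mx for v in x]
    dy = [v - my for v in y]
    denom = math.sqrt(sum(v * v for v in dx) * sum(v * v for v in dy))
    if denom == 0.0:
        return 0.0  # a constant segment carries no shape information
    return sum(a * b for a, b in zip(dx, dy)) / denom


def best_match(window: List[float],
               database: Dict[str, List[float]]) -> Tuple[str, float]:
    """Slide the window over every stored signal and return the best (label, score)."""
    best_label, best_score = "", float("-inf")
    for label, signal in database.items():
        for start in range(len(signal) - len(window) + 1):
            score = normalized_xcorr(window, signal[start:start + len(window)])
            if score > best_score:
                best_label, best_score = label, score
    return best_label, best_score


# Toy stand-ins for stored EEG traces; labels and values are illustrative only.
DATABASE = {
    "flat_then_spike": [0, 0, 0, 0, 0, 0, 5, 0, 0, 0],
    "oscillation": [0, 1, 0, -1, 0, 1, 0, -1, 0, 1],
}
WINDOW = [0, 2, 0, -2, 0, 2]  # a scaled copy of the oscillation pattern
LABEL, SCORE = best_match(WINDOW, DATABASE)
```

Because the score is normalized, the scaled window still matches the stored oscillation almost perfectly, which is the property that makes correlation useful for comparing signals of different amplitude.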

Submitted 22 April, 2020; originally announced April 2020.

Comments: Accepted for Publication at the 57th Design Automation Conference (DAC), July 2020, San Francisco, CA, USA

arXiv:1906.10554 [pdf]

Design of a 1x4 CPW Microstrip Antenna Array on PET Substrate for Biomedical Applications

Authors: U. Farooq, A. Iftikhar, M. S. Khan, M. F. Shafique, Raed M. Shubair

Abstract: In this paper, a single-layer Coplanar Waveguide-fed microstrip patch antenna array is presented for biomedical applications. The proposed antenna array is realized on a transparent and flexible Polyethylene Terephthalate substrate, has 1x4 radiating elements and measures only 280 x 192 mm². The antenna array resonates at 2.68 GHz and has a peak-simulated gain of 10 dBi. A prototype is also fabricated, and the conductive patterns are drawn using cost-efficient adhesive copper foils instead of conventional copper or silver nanoparticle ink. The corresponding measured results agree well with the simulated results. The proposed low-profile and cost-efficient transmit antenna array has the potential for wearable body-worn applications, including wireless powering of implantable medical devices.

Submitted 24 June, 2019; originally announced June 2019.

Comments: 11 pages, 4 figures

arXiv:1902.02649 [pdf, other]

doi 10.1145/3316781.3317933

XBioSiP: A Methodology for Approximate Bio-Signal Processing at the Edge

Authors: Bharath Srinivas Prabakaran, Semeen Rehman, Muhammad Shafique

Abstract: Bio-signals exhibit high redundancy, and the algorithms for their processing are inherently error resilient. This property can be leveraged to improve the energy-efficiency of IoT-Edge (wearables) through the emerging trend of approximate computing. This paper presents XBioSiP, a novel methodology for approximate bio-signal processing that employs two quality evaluation stages, during the pre-processing and bio-signal processing stages, to determine the approximation parameters. It thereby achieves high energy savings while satisfying the user-determined quality constraint. Our methodology achieves up to 19x and 22x reduction in the energy consumption of a QRS peak detection algorithm for 0% and <1% loss in peak detection accuracy, respectively.

Submitted 5 February, 2019; originally announced February 2019.

Comments: Accepted for publication at the Design Automation Conference 2019 (DAC'19), Las Vegas, Nevada, USA

arXiv:1902.01147 [pdf, other]

doi 10.1109/IJCNN48605.2020.9207297

Is Spiking Secure? A Comparative Study on the Security Vulnerabilities of Spiking and Deep Neural Networks

Authors: Alberto Marchisio, Giorgio Nanfa, Faiq Khalid, Muhammad Abdullah Hanif, Maurizio Martina, Muhammad Shafique

Abstract: Spiking Neural Networks (SNNs) claim to present many advantages in terms of biological plausibility and energy efficiency compared to standard Deep Neural Networks (DNNs). Recent works have shown that DNNs are vulnerable to adversarial attacks, i.e., small perturbations added to the input data can lead to targeted or random misclassifications. In this paper, we aim to investigate the key research question: "Are SNNs secure?" Towards this, we perform a comparative study of the security vulnerabilities in SNNs and DNNs w.r.t. the adversarial noise. Afterwards, we propose a novel black-box attack methodology, i.e., without the knowledge of the internal structure of the SNN, which employs a greedy heuristic to automatically generate imperceptible and robust adversarial examples (i.e., attack images) for the given SNN. We perform an in-depth evaluation for a Spiking Deep Belief Network (SDBN) and a DNN having the same number of layers and neurons (to obtain a fair comparison), in order to study the efficiency of our methodology and to understand the differences between SNNs and DNNs w.r.t. the adversarial examples. Our work opens new avenues of research towards the robustness of the SNNs, considering their similarities to the human brain's functionality.

Submitted 18 May, 2020; v1 submitted 4 February, 2019; originally announced February 2019.

Comments: Accepted for publication at the 2020 International Joint Conference on Neural Networks (IJCNN)

arXiv:1901.09878 [pdf, other]

CapsAttacks: Robust and Imperceptible Adversarial Attacks on Capsule Networks

Authors: Alberto Marchisio, Giorgio Nanfa, Faiq Khalid, Muhammad Abdullah Hanif, Maurizio Martina, Muhammad Shafique

Abstract: Capsule Networks preserve the hierarchical spatial relationships between objects, and thereby bear the potential to surpass the performance of traditional Convolutional Neural Networks (CNNs) in performing tasks like image classification. A large body of work has explored adversarial examples for CNNs, but their effectiveness on Capsule Networks has not yet been well studied. In our work, we perform an analysis to study the vulnerabilities in Capsule Networks to adversarial attacks. These perturbations, added to the test inputs, are small and imperceptible to humans, but can fool the network into mispredicting. We propose a greedy algorithm to automatically generate targeted imperceptible adversarial examples in a black-box attack scenario. We show that attacks of this kind, when applied to the German Traffic Sign Recognition Benchmark (GTSRB), mislead Capsule Networks. Moreover, we apply the same kind of adversarial attacks to a 5-layer CNN and a 9-layer CNN, and analyze the outcome, compared to the Capsule Networks, to study differences in their behavior.

Submitted 24 May, 2019; v1 submitted 28 January, 2019; originally announced January 2019.

arXiv:1901.04986 [pdf]

Systimator: A Design Space Exploration Methodology for Systolic Array based CNNs Acceleration on the FPGA-based Edge Nodes

Authors: Hazoor Ahmad, Muhammad Tanvir, Muhammad Abdullah Hanif, Muhammad Usama Javed, Rehan Hafiz, Muhammad Shafique

Abstract: The evolution of IoT-based smart applications demands porting of artificial intelligence algorithms to edge computing devices. CNNs form a large part of these AI algorithms. Systolic array based CNN acceleration is being widely advocated due to its ability to allow scalable architectures. However, CNNs are inherently memory- and compute-intensive algorithms, and hence pose significant challenges for implementation on resource-constrained edge computing devices. Memory-constrained low-cost FPGA-based devices form a substantial fraction of these edge computing devices. Thus, when porting to such edge computing devices, the designer is left unguided as to how to select a suitable systolic array configuration that could fit in the available hardware resources. In this paper, we propose Systimator, a design space exploration based methodology that provides a set of design points that can be mapped within the memory bounds of the target FPGA device. The methodology is based upon an analytical model that is formulated to estimate the required resources for systolic arrays, assuming multiple data reuse patterns. The methodology further provides the performance estimates for each of the candidate design points. We show that Systimator provides an in-depth analysis of the resource requirements of systolic array based CNNs. We provide our resource estimation results for porting of convolutional layers of TINY YOLO, a CNN based object detector, on a Xilinx ARTIX 7 FPGA.

Submitted 8 February, 2019; v1 submitted 15 December, 2018; originally announced January 2019.

Comments: 5 Pages, 3 Figures, work in progress

arXiv:1811.07330 [pdf, other]

ApproxCS: Near-Sensor Approximate Compressed Sensing for IoT-Healthcare Systems

Authors: Ayesha Siddique, Osman Hasan, Faiq Khalid, Muhammad Shafique

Abstract: The Internet of Things (IoT) is an emerging trend that has enabled an upgrade in the design of wearable healthcare monitoring systems through the (integrated) edge, fog, and cloud computing paradigm. Energy efficiency is one of the most important design metrics in such IoT-healthcare systems, especially for the edge and fog nodes. Due to the sensing noise and inherent redundancy in the input data, even the most safety-critical biomedical applications can sometimes afford a slight degradation in the output quality. Hence, such inherent error tolerance in the bio-signals can be exploited to achieve high energy savings through emerging trends like approximate computing, which is applicable at both software and hardware levels. In this paper, we propose to leverage approximate computing in digital Compressed Sensing (CS), through low-power approximate adders (LPAA) in an accurate Bernoulli sensing-based CS acquisition (BCS). We demonstrate that approximations can indeed be safely employed in IoT healthcare without affecting the detection of critical events in the biomedical signals. Towards this, we explored the trade-off between energy efficiency and output quality using the state-of-the-art lp2d RLS reconstruction algorithm. The proposed framework is validated with the MIT-BIH Arrhythmia database. Our results demonstrated approximately 59% energy savings as compared to the accurate design.
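
The idea of replacing exact adders in a Bernoulli sensing stage can be sketched as follows. This is only an illustration under stated assumptions: the sensing matrix uses 0/1 entries, the samples are non-negative integers, and `approx_add` is a simplified lower-part-OR adder used as a stand-in, since the abstract does not describe the LPAA designs the paper evaluates. All names and values are hypothetical.

```python
import random
from typing import List


def approx_add(a: int, b: int, k: int) -> int:
    """Simplified lower-part-OR adder for non-negative ints.

    The k low bits are OR-ed instead of added (no carry out of them);
    the remaining high bits are added exactly. With k = 0 it is exact.
    """
    mask = (1 << k) - 1
    low = (a | b) & mask
    high = ((a >> k) + (b >> k)) << k
    return high | low


def bernoulli_matrix(m: int, n: int, seed: int) -> List[List[int]]:
    """m x n sensing matrix with independent 0/1 entries (assumed form)."""
    rng = random.Random(seed)
    return [[rng.randint(0, 1) for _ in range(n)] for _ in range(m)]


def measure(phi: List[List[int]], x: List[int], k: int = 0) -> List[int]:
    """Compute y = phi * x with additions only, approximating k low bits."""
    y = []
    for row in phi:
        acc = 0
        for bit, sample in zip(row, x):
            if bit:  # a 0/1 matrix needs no multiplier, only an accumulator
                acc = approx_add(acc, sample, k)
        y.append(acc)
    return y


# Illustrative input: 16 made-up 10-bit samples compressed to 4 measurements.
_rng = random.Random(1)
SIGNAL = [_rng.randint(0, 1023) for _ in range(16)]
PHI = bernoulli_matrix(4, 16, seed=0)
EXACT = measure(PHI, SIGNAL, k=0)
APPROX = measure(PHI, SIGNAL, k=3)
```

Each approximate addition underestimates the exact sum by the bitwise AND of the two low parts, so the error per accumulation step stays below 2^k. That bounded, small error is what such designs trade for simpler carry logic.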

Submitted 18 November, 2018; originally announced November 2018.

Showing 1–13 of 13 results for author: Shafique, M