Search | arXiv e-print repository

Scalable and RISC-V Programmable Near-Memory Computing Architectures for Edge Nodes

Authors: Michele Caon, Clément Choné, Pasquale Davide Schiavone, Alexandre Levisse, Guido Masera, Maurizio Martina, David Atienza

Abstract: The widespread adoption of data-centric algorithms, particularly Artificial Intelligence (AI) and Machine Learning (ML), has exposed the limitations of centralized processing infrastructures, driving a shift towards edge computing. This necessitates stringent constraints on energy efficiency, which traditional von Neumann architectures struggle to meet. The Compute-In-Memory (CIM) paradigm has eme… ▽ More The widespread adoption of data-centric algorithms, particularly Artificial Intelligence (AI) and Machine Learning (ML), has exposed the limitations of centralized processing infrastructures, driving a shift towards edge computing. This necessitates stringent constraints on energy efficiency, which traditional von Neumann architectures struggle to meet. The Compute-In-Memory (CIM) paradigm has emerged as a superior candidate due to its efficient exploitation of available memory bandwidth. However, existing CIM solutions require high implementation effort and lack flexibility from a software integration standpoint. This work proposes a novel, software-friendly, general-purpose, and low-integration-effort Near-Memory Computing (NMC) approach, paving the way for the adoption of CIM-based systems in the next generation of edge computing nodes. Two architectural variants, NM-Caesar and NM-Carus, are proposed and characterized to target different trade-offs in area efficiency, performance, and flexibility, covering a wide range of embedded microcontrollers. Post-layout simulations show up to $25.8\times$ and $50.0\times$ lower execution time and $23.2\times$ and $33.1\times$ higher energy efficiency at the system level, respectively, compared to executing the same tasks on a state-of-the-art RISC-V CPU (RV32IMC). NM-Carus achieves a peak energy efficiency of $306.7$ GOPS/W in 8-bit matrix multiplications, surpassing recent state-of-the-art in- and near-memory circuits. △ Less

Submitted 20 June, 2024; originally announced June 2024.

Comments: 14 pages, 12 figures, submitted to IEEE Transactions on Emerging Topics in Computing

arXiv:2403.01236 [pdf, other]

Performance evaluation of acceleration of convolutional layers on OpenEdgeCGRA

Authors: Nicolò Carpentieri, Juan Sapriza, Davide Schiavone, Daniele Jahier Pagliari, David Atienza, Maurizio Martina, Alessio Burrello

Abstract: Recently, efficiently deploying deep learning solutions on the edge has received increasing attention. New platforms are emerging to support the increasing demand for flexibility and high performance. In this work, we explore the efficient map** of convolutional layers on an open-hardware, low-power Coarse-Grain Reconfigurable Array (CGRA), namely OpenEdgeCGRA. We explore both direct implementat… ▽ More Recently, efficiently deploying deep learning solutions on the edge has received increasing attention. New platforms are emerging to support the increasing demand for flexibility and high performance. In this work, we explore the efficient map** of convolutional layers on an open-hardware, low-power Coarse-Grain Reconfigurable Array (CGRA), namely OpenEdgeCGRA. We explore both direct implementations of convolution and solutions that transform it into a matrix multiplication through an Im2col transformation, and experiment with various tensor parallelism axes. We show that for this hardware target, direct convolution, coupled with weight parallelism reaches the best latency and energy efficiency, outperforming a CPU implementation by 3.4x and 9.9x in terms of energy and latency, respectively. △ Less

Submitted 2 March, 2024; originally announced March 2024.

arXiv:2402.09780 [pdf, other]

TinyCL: An Efficient Hardware Architecture for Continual Learning on Autonomous Systems

Authors: Eugenio Ressa, Alberto Marchisio, Maurizio Martina, Guido Masera, Muhammad Shafique

Abstract: The Continuous Learning (CL) paradigm consists of continuously evolving the parameters of the Deep Neural Network (DNN) model to progressively learn to perform new tasks without reducing the performance on previous tasks, i.e., avoiding the so-called catastrophic forgetting. However, the DNN parameter update in CL-based autonomous systems is extremely resource-hungry. The existing DNN accelerators… ▽ More The Continuous Learning (CL) paradigm consists of continuously evolving the parameters of the Deep Neural Network (DNN) model to progressively learn to perform new tasks without reducing the performance on previous tasks, i.e., avoiding the so-called catastrophic forgetting. However, the DNN parameter update in CL-based autonomous systems is extremely resource-hungry. The existing DNN accelerators cannot be directly employed in CL because they only support the execution of the forward propagation. Only a few prior architectures execute the backpropagation and weight update, but they lack the control and management for CL. Towards this, we design a hardware architecture, TinyCL, to perform CL on resource-constrained autonomous systems. It consists of a processing unit that executes both forward and backward propagation, and a control unit that manages memory-based CL workload. To minimize the memory accesses, the sliding window of the convolutional layer moves in a snake-like fashion. Moreover, the Multiply-and-Accumulate units can be reconfigured at runtime to execute different operations. As per our knowledge, our proposed TinyCL represents the first hardware accelerator that executes CL on autonomous systems. We synthesize the complete TinyCL architecture in a 65 nm CMOS technology node with the conventional ASIC design flow. It executes 1 epoch of training on a Conv + ReLU + Dense model on the CIFAR10 dataset in 1.76 s, while 1 training epoch of the same model using an Nvidia Tesla P100 GPU takes 103 s, thus achieving a 58 x speedup, consuming 86 mW in a 4.74 mm2 die. △ Less

Submitted 15 February, 2024; originally announced February 2024.

arXiv:2308.05636 [pdf, other]

doi 10.3390/info14100537

A Homomorphic Encryption Framework for Privacy-Preserving Spiking Neural Networks

Authors: Farzad Nikfam, Raffaele Casaburi, Alberto Marchisio, Maurizio Martina, Muhammad Shafique

Abstract: Machine learning (ML) is widely used today, especially through deep neural networks (DNNs), however, increasing computational load and resource requirements have led to cloud-based solutions. To address this problem, a new generation of networks called Spiking Neural Networks (SNN) has emerged, which mimic the behavior of the human brain to improve efficiency and reduce energy consumption. These n… ▽ More Machine learning (ML) is widely used today, especially through deep neural networks (DNNs), however, increasing computational load and resource requirements have led to cloud-based solutions. To address this problem, a new generation of networks called Spiking Neural Networks (SNN) has emerged, which mimic the behavior of the human brain to improve efficiency and reduce energy consumption. These networks often process large amounts of sensitive information, such as confidential data, and thus privacy issues arise. Homomorphic encryption (HE) offers a solution, allowing calculations to be performed on encrypted data without decrypting it. This research compares traditional DNNs and SNNs using the Brakerski/Fan-Vercauteren (BFV) encryption scheme. The LeNet-5 model, a widely-used convolutional architecture, is used for both DNN and SNN models based on the LeNet-5 architecture, and the networks are trained and compared using the FashionMNIST dataset. The results show that SNNs using HE achieve up to 40% higher accuracy than DNNs for low values of the plaintext modulus t, although their execution time is longer due to their time-coding nature with multiple time-steps. △ Less

Submitted 12 October, 2023; v1 submitted 10 August, 2023; originally announced August 2023.

Journal ref: Information 2023, 14, 537

arXiv:2304.03986 [pdf, other]

SwiftTron: An Efficient Hardware Accelerator for Quantized Transformers

Authors: Alberto Marchisio, Davide Dura, Maurizio Capra, Maurizio Martina, Guido Masera, Muhammad Shafique

Abstract: Transformers' compute-intensive operations pose enormous challenges for their deployment in resource-constrained EdgeAI / tinyML devices. As an established neural network compression technique, quantization reduces the hardware computational and memory resources. In particular, fixed-point quantization is desirable to ease the computations using lightweight blocks, like adders and multipliers, of… ▽ More Transformers' compute-intensive operations pose enormous challenges for their deployment in resource-constrained EdgeAI / tinyML devices. As an established neural network compression technique, quantization reduces the hardware computational and memory resources. In particular, fixed-point quantization is desirable to ease the computations using lightweight blocks, like adders and multipliers, of the underlying hardware. However, deploying fully-quantized Transformers on existing general-purpose hardware, generic AI accelerators, or specialized architectures for Transformers with floating-point units might be infeasible and/or inefficient. Towards this, we propose SwiftTron, an efficient specialized hardware accelerator designed for Quantized Transformers. SwiftTron supports the execution of different types of Transformers' operations (like Attention, Softmax, GELU, and Layer Normalization) and accounts for diverse scaling factors to perform correct computations. We synthesize the complete SwiftTron architecture in a $65$ nm CMOS technology with the ASIC design flow. Our Accelerator executes the RoBERTa-base model in 1.83 ns, while consuming 33.64 mW power, and occupying an area of 273 mm^2. To ease the reproducibility, the RTL of our SwiftTron architecture is released at https://github.com/albertomarchisio/SwiftTron. △ Less

Submitted 25 April, 2023; v1 submitted 8 April, 2023; originally announced April 2023.

Comments: To appear at the 2023 International Joint Conference on Neural Networks (IJCNN), Queensland, Australia, June 2023

arXiv:2304.03973 [pdf, other]

RobCaps: Evaluating the Robustness of Capsule Networks against Affine Transformations and Adversarial Attacks

Authors: Alberto Marchisio, Antonio De Marco, Alessio Colucci, Maurizio Martina, Muhammad Shafique

Abstract: Capsule Networks (CapsNets) are able to hierarchically preserve the pose relationships between multiple objects for image classification tasks. Other than achieving high accuracy, another relevant factor in deploying CapsNets in safety-critical applications is the robustness against input transformations and malicious adversarial attacks. In this paper, we systematically analyze and evaluate dif… ▽ More Capsule Networks (CapsNets) are able to hierarchically preserve the pose relationships between multiple objects for image classification tasks. Other than achieving high accuracy, another relevant factor in deploying CapsNets in safety-critical applications is the robustness against input transformations and malicious adversarial attacks. In this paper, we systematically analyze and evaluate different factors affecting the robustness of CapsNets, compared to traditional Convolutional Neural Networks (CNNs). Towards a comprehensive comparison, we test two CapsNet models and two CNN models on the MNIST, GTSRB, and CIFAR10 datasets, as well as on the affine-transformed versions of such datasets. With a thorough analysis, we show which properties of these architectures better contribute to increasing the robustness and their limitations. Overall, CapsNets achieve better robustness against adversarial examples and affine transformations, compared to a traditional CNN with a similar number of parameters. Similar conclusions have been derived for deeper versions of CapsNets and CNNs. Moreover, our results unleash a key finding that the dynamic routing does not contribute much to improving the CapsNets' robustness. Indeed, the main generalization contribution is due to the hierarchical feature learning through capsules. △ Less

Submitted 25 April, 2023; v1 submitted 8 April, 2023; originally announced April 2023.

Comments: To appear at the 2023 International Joint Conference on Neural Networks (IJCNN), Queensland, Australia, June 2023

arXiv:2210.06888 [pdf, other]

doi 10.1109/ACCESS.2022.3213734

AccelAT: A Framework for Accelerating the Adversarial Training of Deep Neural Networks through Accuracy Gradient

Authors: Farzad Nikfam, Alberto Marchisio, Maurizio Martina, Muhammad Shafique

Abstract: Adversarial training is exploited to develop a robust Deep Neural Network (DNN) model against the malicious altered data. These attacks may have catastrophic effects on DNN models but are indistinguishable for a human being. For example, an external attack can modify an image adding noises invisible for a human eye, but a DNN model misclassified the image. A key objective for develo** robust DNN… ▽ More Adversarial training is exploited to develop a robust Deep Neural Network (DNN) model against the malicious altered data. These attacks may have catastrophic effects on DNN models but are indistinguishable for a human being. For example, an external attack can modify an image adding noises invisible for a human eye, but a DNN model misclassified the image. A key objective for develo** robust DNN models is to use a learning algorithm that is fast but can also give model that is robust against different types of adversarial attacks. Especially for adversarial training, enormously long training times are needed for obtaining high accuracy under many different types of adversarial samples generated using different adversarial attack techniques. This paper aims at accelerating the adversarial training to enable fast development of robust DNN models against adversarial attacks. The general method for improving the training performance is the hyperparameters fine-tuning, where the learning rate is one of the most crucial hyperparameters. By modifying its shape (the value over time) and value during the training, we can obtain a model robust to adversarial attacks faster than standard training. First, we conduct experiments on two different datasets (CIFAR10, CIFAR100), exploring various techniques. Then, this analysis is leveraged to develop a novel fast training methodology, AccelAT, which automatically adjusts the learning rate for different epochs based on the accuracy gradient. The experiments show comparable results with the related works, and in several experiments, the adversarial training of DNNs using our AccelAT framework is conducted up to 2 times faster than the existing techniques. Thus, our findings boost the speed of adversarial training in an era in which security and performance are fundamental optimization objectives in DNN-based applications. △ Less

Submitted 13 October, 2022; originally announced October 2022.

Comments: 12 pages

arXiv:2210.05276 [pdf, other]

RoHNAS: A Neural Architecture Search Framework with Conjoint Optimization for Adversarial Robustness and Hardware Efficiency of Convolutional and Capsule Networks

Authors: Alberto Marchisio, Vojtech Mrazek, Andrea Massa, Beatrice Bussolino, Maurizio Martina, Muhammad Shafique

Abstract: Neural Architecture Search (NAS) algorithms aim at finding efficient Deep Neural Network (DNN) architectures for a given application under given system constraints. DNNs are computationally-complex as well as vulnerable to adversarial attacks. In order to address multiple design objectives, we propose RoHNAS, a novel NAS framework that jointly optimizes for adversarial-robustness and hardware-effi… ▽ More Neural Architecture Search (NAS) algorithms aim at finding efficient Deep Neural Network (DNN) architectures for a given application under given system constraints. DNNs are computationally-complex as well as vulnerable to adversarial attacks. In order to address multiple design objectives, we propose RoHNAS, a novel NAS framework that jointly optimizes for adversarial-robustness and hardware-efficiency of DNNs executed on specialized hardware accelerators. Besides the traditional convolutional DNNs, RoHNAS additionally accounts for complex types of DNNs such as Capsule Networks. For reducing the exploration time, RoHNAS analyzes and selects appropriate values of adversarial perturbation for each dataset to employ in the NAS flow. Extensive evaluations on multi - Graphics Processing Unit (GPU) - High Performance Computing (HPC) nodes provide a set of Pareto-optimal solutions, leveraging the tradeoff between the above-discussed design objectives. For example, a Pareto-optimal DNN for the CIFAR-10 dataset exhibits 86.07% accuracy, while having an energy of 38.63 mJ, a memory footprint of 11.85 MiB, and a latency of 4.47 ms. △ Less

Submitted 11 October, 2022; originally announced October 2022.

Comments: Accepted for publication at IEEE Access

arXiv:2208.02253 [pdf, other]

LaneSNNs: Spiking Neural Networks for Lane Detection on the Loihi Neuromorphic Processor

Authors: Alberto Viale, Alberto Marchisio, Maurizio Martina, Guido Masera, Muhammad Shafique

Abstract: Autonomous Driving (AD) related features represent important elements for the next generation of mobile robots and autonomous vehicles focused on increasingly intelligent, autonomous, and interconnected systems. The applications involving the use of these features must provide, by definition, real-time decisions, and this property is key to avoid catastrophic accidents. Moreover, all the decision… ▽ More Autonomous Driving (AD) related features represent important elements for the next generation of mobile robots and autonomous vehicles focused on increasingly intelligent, autonomous, and interconnected systems. The applications involving the use of these features must provide, by definition, real-time decisions, and this property is key to avoid catastrophic accidents. Moreover, all the decision processes must require low power consumption, to increase the lifetime and autonomy of battery-driven systems. These challenges can be addressed through efficient implementations of Spiking Neural Networks (SNNs) on Neuromorphic Chips and the use of event-based cameras instead of traditional frame-based cameras. In this paper, we present a new SNN-based approach, called LaneSNN, for detecting the lanes marked on the streets using the event-based camera input. We develop four novel SNN models characterized by low complexity and fast response, and train them using an offline supervised learning rule. Afterward, we implement and map the learned SNNs models onto the Intel Loihi Neuromorphic Research Chip. For the loss function, we develop a novel method based on the linear composition of Weighted binary Cross Entropy (WCE) and Mean Squared Error (MSE) measures. Our experimental results show a maximum Intersection over Union (IoU) measure of about 0.62 and very low power consumption of about 1 W. The best IoU is achieved with an SNN implementation that occupies only 36 neurocores on the Loihi processor while providing a low latency of less than 8 ms to recognize an image, thereby enabling real-time performance. The IoU measures provided by our networks are comparable with the state-of-the-art, but at a much low power consumption of 1 W. △ Less

Submitted 3 August, 2022; originally announced August 2022.

Comments: To appear at the 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2022)

arXiv:2208.00331 [pdf, other]

CoNLoCNN: Exploiting Correlation and Non-Uniform Quantization for Energy-Efficient Low-precision Deep Convolutional Neural Networks

Authors: Muhammad Abdullah Hanif, Giuseppe Maria Sarda, Alberto Marchisio, Guido Masera, Maurizio Martina, Muhammad Shafique

Abstract: In today's era of smart cyber-physical systems, Deep Neural Networks (DNNs) have become ubiquitous due to their state-of-the-art performance in complex real-world applications. The high computational complexity of these networks, which translates to increased energy consumption, is the foremost obstacle towards deploying large DNNs in resource-constrained systems. Fixed-Point (FP) implementations… ▽ More In today's era of smart cyber-physical systems, Deep Neural Networks (DNNs) have become ubiquitous due to their state-of-the-art performance in complex real-world applications. The high computational complexity of these networks, which translates to increased energy consumption, is the foremost obstacle towards deploying large DNNs in resource-constrained systems. Fixed-Point (FP) implementations achieved through post-training quantization are commonly used to curtail the energy consumption of these networks. However, the uniform quantization intervals in FP restrict the bit-width of data structures to large values due to the need to represent most of the numbers with sufficient resolution and avoid high quantization errors. In this paper, we leverage the key insight that (in most of the scenarios) DNN weights and activations are mostly concentrated near zero and only a few of them have large magnitudes. We propose CoNLoCNN, a framework to enable energy-efficient low-precision deep convolutional neural network inference by exploiting: (1) non-uniform quantization of weights enabling simplification of complex multiplication operations; and (2) correlation between activation values enabling partial compensation of quantization errors at low cost without any run-time overheads. To significantly benefit from non-uniform quantization, we also propose a novel data representation format, Encoded Low-Precision Binary Signed Digit, to compress the bit-width of weights while ensuring direct use of the encoded weight for processing using a novel multiply-and-accumulate (MAC) unit design. △ Less

Submitted 30 July, 2022; originally announced August 2022.

Comments: 8 pages, 15 figures, 2 tables

arXiv:2206.10200 [pdf, other]

Enabling Capsule Networks at the Edge through Approximate Softmax and Squash Operations

Authors: Alberto Marchisio, Beatrice Bussolino, Edoardo Salvati, Maurizio Martina, Guido Masera, Muhammad Shafique

Abstract: Complex Deep Neural Networks such as Capsule Networks (CapsNets) exhibit high learning capabilities at the cost of compute-intensive operations. To enable their deployment on edge devices, we propose to leverage approximate computing for designing approximate variants of the complex operations like softmax and squash. In our experiments, we evaluate tradeoffs between area, power consumption, and c… ▽ More Complex Deep Neural Networks such as Capsule Networks (CapsNets) exhibit high learning capabilities at the cost of compute-intensive operations. To enable their deployment on edge devices, we propose to leverage approximate computing for designing approximate variants of the complex operations like softmax and squash. In our experiments, we evaluate tradeoffs between area, power consumption, and critical path delay of the designs implemented with the ASIC design flow, and the accuracy of the quantized CapsNets, compared to the exact functions. △ Less

Submitted 21 June, 2022; originally announced June 2022.

Comments: To appear at the ACM/IEEE International Symposium on Low Power Electronics and Design (ISLPED), August 2022, Boston, MA, USA

arXiv:2206.02516 [pdf]

doi 10.1016/j.renene.2019.04.075

Comparative study of different functionalized graphene-nanoplatelet aqueous nanofluids for solar energy applications

Authors: Javier P. Vallejo, Luca Mercatelli, Maria Raffaella Martina, Daniele Di Rosa, Aldo Dell'Oro, Luis Lugo, Elisa Sani

Abstract: The optical properties of nanofluids are peculiar and interesting for a variety of applications. Among them, the high light extinction coefficient of nanofluids can be useful in linear parabolic concentrating solar systems, while their properties under high light irradiation intensities can be exploited for direct solar steam generation. The optical characterization of colloids, including the stud… ▽ More The optical properties of nanofluids are peculiar and interesting for a variety of applications. Among them, the high light extinction coefficient of nanofluids can be useful in linear parabolic concentrating solar systems, while their properties under high light irradiation intensities can be exploited for direct solar steam generation. The optical characterization of colloids, including the study of non-linear optical properties, is thus a needed step to design the use of such novel materials for solar energy exploitation. In this work, we analysed two different types of nanofluids, consisting of polycarboxylate chemically modified graphene nanoplatelets (P-GnP) and sulfonic acid-functionalized graphene nanoplatelets (S-GnP) dispersed in water, at three concentrations from 0.005 wt% to 0.05 wt%. Moderately stable nanofluids were achieved with favourable light extinction properties, as well as a non-linear optical behaviour under high input solar intensities. △ Less

Submitted 28 April, 2022; originally announced June 2022.

Journal ref: Renewable Energy, vol. 141 (2019), pages 791-801

arXiv:2205.13807 [pdf, other]

fakeWeather: Adversarial Attacks for Deep Neural Networks Emulating Weather Conditions on the Camera Lens of Autonomous Systems

Authors: Alberto Marchisio, Giovanni Caramia, Maurizio Martina, Muhammad Shafique

Abstract: Recently, Deep Neural Networks (DNNs) have achieved remarkable performances in many applications, while several studies have enhanced their vulnerabilities to malicious attacks. In this paper, we emulate the effects of natural weather conditions to introduce plausible perturbations that mislead the DNNs. By observing the effects of such atmospheric perturbations on the camera lenses, we model the… ▽ More Recently, Deep Neural Networks (DNNs) have achieved remarkable performances in many applications, while several studies have enhanced their vulnerabilities to malicious attacks. In this paper, we emulate the effects of natural weather conditions to introduce plausible perturbations that mislead the DNNs. By observing the effects of such atmospheric perturbations on the camera lenses, we model the patterns to create different masks that fake the effects of rain, snow, and hail. Even though the perturbations introduced by our attacks are visible, their presence remains unnoticed due to their association with natural events, which can be especially catastrophic for fully-autonomous and unmanned vehicles. We test our proposed fakeWeather attacks on multiple Convolutional Neural Network and Capsule Network models, and report noticeable accuracy drops in the presence of such adversarial perturbations. Our work introduces a new security threat for DNNs, which is especially severe for safety-critical applications and autonomous systems. △ Less

Submitted 27 May, 2022; originally announced May 2022.

Comments: To appear at the 2022 International Joint Conference on Neural Networks (IJCNN), at the 2022 IEEE World Congress on Computational Intelligence (WCCI), July 2022, Padua, Italy

arXiv:2109.00533 [pdf, other]

R-SNN: An Analysis and Design Methodology for Robustifying Spiking Neural Networks against Adversarial Attacks through Noise Filters for Dynamic Vision Sensors

Authors: Alberto Marchisio, Giacomo Pira, Maurizio Martina, Guido Masera, Muhammad Shafique

Abstract: Spiking Neural Networks (SNNs) aim at providing energy-efficient learning capabilities when implemented on neuromorphic chips with event-based Dynamic Vision Sensors (DVS). This paper studies the robustness of SNNs against adversarial attacks on such DVS-based systems, and proposes R-SNN, a novel methodology for robustifying SNNs through efficient DVS-noise filtering. We are the first to generate… ▽ More Spiking Neural Networks (SNNs) aim at providing energy-efficient learning capabilities when implemented on neuromorphic chips with event-based Dynamic Vision Sensors (DVS). This paper studies the robustness of SNNs against adversarial attacks on such DVS-based systems, and proposes R-SNN, a novel methodology for robustifying SNNs through efficient DVS-noise filtering. We are the first to generate adversarial attacks on DVS signals (i.e., frames of events in the spatio-temporal domain) and to apply noise filters for DVS sensors in the quest for defending against adversarial attacks. Our results show that the noise filters effectively prevent the SNNs from being fooled. The SNNs in our experiments provide more than 90% accuracy on the DVS-Gesture and NMNIST datasets under different adversarial threat models. △ Less

Submitted 1 September, 2021; originally announced September 2021.

Comments: To appear at the 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2021). arXiv admin note: text overlap with arXiv:2107.00415

arXiv:2107.00415 [pdf, other]

DVS-Attacks: Adversarial Attacks on Dynamic Vision Sensors for Spiking Neural Networks

Authors: Alberto Marchisio, Giacomo Pira, Maurizio Martina, Guido Masera, Muhammad Shafique

Abstract: Spiking Neural Networks (SNNs), despite being energy-efficient when implemented on neuromorphic hardware and coupled with event-based Dynamic Vision Sensors (DVS), are vulnerable to security threats, such as adversarial attacks, i.e., small perturbations added to the input for inducing a misclassification. Toward this, we propose DVS-Attacks, a set of stealthy yet efficient adversarial attack meth… ▽ More Spiking Neural Networks (SNNs), despite being energy-efficient when implemented on neuromorphic hardware and coupled with event-based Dynamic Vision Sensors (DVS), are vulnerable to security threats, such as adversarial attacks, i.e., small perturbations added to the input for inducing a misclassification. Toward this, we propose DVS-Attacks, a set of stealthy yet efficient adversarial attack methodologies targeted to perturb the event sequences that compose the input of the SNNs. First, we show that noise filters for DVS can be used as defense mechanisms against adversarial attacks. Afterwards, we implement several attacks and test them in the presence of two types of noise filters for DVS cameras. The experimental results show that the filters can only partially defend the SNNs against our proposed DVS-Attacks. Using the best settings for the noise filters, our proposed Mask Filter-Aware Dash Attack reduces the accuracy by more than 20% on the DVS-Gesture dataset and by more than 65% on the MNIST dataset, compared to the original clean frames. The source code of all the proposed DVS-Attacks and noise filters is released at https://github.com/albertomarchisio/DVS-Attacks. △ Less

Submitted 1 July, 2021; originally announced July 2021.

Comments: Accepted for publication at IJCNN 2021

arXiv:2107.00401 [pdf, other]

CarSNN: An Efficient Spiking Neural Network for Event-Based Autonomous Cars on the Loihi Neuromorphic Research Processor

Authors: Alberto Viale, Alberto Marchisio, Maurizio Martina, Guido Masera, Muhammad Shafique

Abstract: Autonomous Driving (AD) related features provide new forms of mobility that are also beneficial for other kind of intelligent and autonomous systems like robots, smart transportation, and smart industries. For these applications, the decisions need to be made fast and in real-time. Moreover, in the quest for electric mobility, this task must follow low power policy, without affecting much the auto… ▽ More Autonomous Driving (AD) related features provide new forms of mobility that are also beneficial for other kind of intelligent and autonomous systems like robots, smart transportation, and smart industries. For these applications, the decisions need to be made fast and in real-time. Moreover, in the quest for electric mobility, this task must follow low power policy, without affecting much the autonomy of the mean of transport or the robot. These two challenges can be tackled using the emerging Spiking Neural Networks (SNNs). When deployed on a specialized neuromorphic hardware, SNNs can achieve high performance with low latency and low power consumption. In this paper, we use an SNN connected to an event-based camera for facing one of the key problems for AD, i.e., the classification between cars and other objects. To consume less power than traditional frame-based cameras, we use a Dynamic Vision Sensor (DVS). The experiments are made following an offline supervised learning rule, followed by map** the learnt SNN model on the Intel Loihi Neuromorphic Research Chip. Our best experiment achieves an accuracy on offline implementation of 86%, that drops to 83% when it is ported onto the Loihi Chip. The Neuromorphic Hardware implementation has maximum 0.72 ms of latency for every sample, and consumes only 310 mW. To the best of our knowledge, this work is the first implementation of an event-based car classifier on a Neuromorphic Chip. △ Less

Submitted 1 July, 2021; originally announced July 2021.

Comments: Accepted for publication at IJCNN 2021

arXiv:2012.11233 [pdf, other]

doi 10.1109/ACCESS.2020.3039858

Hardware and Software Optimizations for Accelerating Deep Neural Networks: Survey of Current Trends, Challenges, and the Road Ahead

Authors: Maurizio Capra, Beatrice Bussolino, Alberto Marchisio, Guido Masera, Maurizio Martina, Muhammad Shafique

Abstract: Currently, Machine Learning (ML) is becoming ubiquitous in everyday life. Deep Learning (DL) is already present in many applications ranging from computer vision for medicine to autonomous driving of modern cars as well as other sectors in security, healthcare, and finance. However, to achieve impressive performance, these algorithms employ very deep networks, requiring a significant computational… ▽ More Currently, Machine Learning (ML) is becoming ubiquitous in everyday life. Deep Learning (DL) is already present in many applications ranging from computer vision for medicine to autonomous driving of modern cars as well as other sectors in security, healthcare, and finance. However, to achieve impressive performance, these algorithms employ very deep networks, requiring a significant computational power, both during the training and inference time. A single inference of a DL model may require billions of multiply-and-accumulated operations, making the DL extremely compute- and energy-hungry. In a scenario where several sophisticated algorithms need to be executed with limited energy and low latency, the need for cost-effective hardware platforms capable of implementing energy-efficient DL execution arises. This paper first introduces the key properties of two brain-inspired models like Deep Neural Network (DNN), and Spiking Neural Network (SNN), and then analyzes techniques to produce efficient and high-performance designs. This work summarizes and compares the works for four leading platforms for the execution of algorithms such as CPU, GPU, FPGA and ASIC describing the main solutions of the state-of-the-art, giving much prominence to the last two solutions since they offer greater design flexibility and bear the potential of high energy-efficiency, especially for the inference process. In addition to hardware solutions, this paper discusses some of the important security issues that these DNN and SNN models may have during their execution, and offers a comprehensive section on benchmarking, explaining how to assess the quality of different networks and hardware systems designed for them. △ Less

Submitted 21 December, 2020; originally announced December 2020.

Comments: Accepted for publication in IEEE Access

arXiv:2008.08476 [pdf, other]

doi 10.1145/3400302.3415731

NASCaps: A Framework for Neural Architecture Search to Optimize the Accuracy and Hardware Efficiency of Convolutional Capsule Networks

Authors: Alberto Marchisio, Andrea Massa, Vojtech Mrazek, Beatrice Bussolino, Maurizio Martina, Muhammad Shafique

Abstract: Deep Neural Networks (DNNs) have made significant improvements to reach the desired accuracy to be employed in a wide variety of Machine Learning (ML) applications. Recently the Google Brain's team demonstrated the ability of Capsule Networks (CapsNets) to encode and learn spatial correlations between different input features, thereby obtaining superior learning capabilities compared to traditiona… ▽ More Deep Neural Networks (DNNs) have made significant improvements to reach the desired accuracy to be employed in a wide variety of Machine Learning (ML) applications. Recently the Google Brain's team demonstrated the ability of Capsule Networks (CapsNets) to encode and learn spatial correlations between different input features, thereby obtaining superior learning capabilities compared to traditional (i.e., non-capsule based) DNNs. However, designing CapsNets using conventional methods is a tedious job and incurs significant training effort. Recent studies have shown that powerful methods to automatically select the best/optimal DNN model configuration for a given set of applications and a training dataset are based on the Neural Architecture Search (NAS) algorithms. Moreover, due to their extreme computational and memory requirements, DNNs are employed using the specialized hardware accelerators in IoT-Edge/CPS devices. In this paper, we propose NASCaps, an automated framework for the hardware-aware NAS of different types of DNNs, covering both traditional convolutional DNNs and CapsNets. We study the efficacy of deploying a multi-objective Genetic Algorithm (e.g., based on the NSGA-II algorithm). The proposed framework can jointly optimize the network accuracy and the corresponding hardware efficiency, expressed in terms of energy, memory, and latency of a given hardware accelerator executing the DNN inference. Besides supporting the traditional DNN layers, our framework is the first to model and supports the specialized capsule layers and dynamic routing in the NAS-flow. We evaluate our framework on different datasets, generating different network configurations, and demonstrate the tradeoffs between the different output metrics. We will open-source the complete framework and configurations of the Pareto-optimal architectures at https://github.com/ehw-fit/nascaps. △ Less

Submitted 19 August, 2020; originally announced August 2020.

Comments: To appear at the IEEE/ACM International Conference on Computer-Aided Design (ICCAD '20), November 2-5, 2020, Virtual Event, USA

arXiv:2006.09985 [pdf, other]

An Efficient Spiking Neural Network for Recognizing Gestures with a DVS Camera on the Loihi Neuromorphic Processor

Authors: Riccardo Massa, Alberto Marchisio, Maurizio Martina, Muhammad Shafique

Abstract: Spiking Neural Networks (SNNs), the third generation NNs, have come under the spotlight for machine learning based applications due to their biological plausibility and reduced complexity compared to traditional artificial Deep Neural Networks (DNNs). These SNNs can be implemented with extreme energy efficiency on neuromorphic processors like the Intel Loihi research chip, and fed by event-based s… ▽ More Spiking Neural Networks (SNNs), the third generation NNs, have come under the spotlight for machine learning based applications due to their biological plausibility and reduced complexity compared to traditional artificial Deep Neural Networks (DNNs). These SNNs can be implemented with extreme energy efficiency on neuromorphic processors like the Intel Loihi research chip, and fed by event-based sensors, such as DVS cameras. However, DNNs with many layers can achieve relatively high accuracy on image classification and recognition tasks, as the research on learning rules for SNNs for real-world applications is still not mature. The accuracy results for SNNs are typically obtained either by converting the trained DNNs into SNNs, or by directly designing and training SNNs in the spiking domain. Towards the conversion from a DNN to an SNN, we perform a comprehensive analysis of such process, specifically designed for Intel Loihi, showing our methodology for the design of an SNN that achieves nearly the same accuracy results as its corresponding DNN. Towards the usage of the event-based sensors, we design a pre-processing method, evaluated for the DvsGesture dataset, which makes it possible to be used in the DNN domain. Hence, based on the outcome of the first analysis, we train a DNN for the pre-processed DvsGesture dataset, and convert it into the spike domain for its deployment on Intel Loihi, which enables real-time gesture recognition. The results show that our SNN achieves 89.64% classification accuracy and occupies only 37 Loihi cores. The source code for generating our experiments is available online at https://github.com/albertomarchisio/EfficientSNN. △ Less

Submitted 25 January, 2021; v1 submitted 16 May, 2020; originally announced June 2020.

Comments: Accepted for publication at the 2020 International Joint Conference on Neural Networks (IJCNN)

arXiv:2005.08041 [pdf, other]

doi 10.1109/IJCNN48605.2020.9207351

NeuroAttack: Undermining Spiking Neural Networks Security through Externally Triggered Bit-Flips

Authors: Valerio Venceslai, Alberto Marchisio, Ihsen Alouani, Maurizio Martina, Muhammad Shafique

Abstract: Due to their proven efficiency, machine-learning systems are deployed in a wide range of complex real-life problems. More specifically, Spiking Neural Networks (SNNs) emerged as a promising solution to the accuracy, resource-utilization, and energy-efficiency challenges in machine-learning systems. While these systems are going mainstream, they have inherent security and reliability issues. In thi… ▽ More Due to their proven efficiency, machine-learning systems are deployed in a wide range of complex real-life problems. More specifically, Spiking Neural Networks (SNNs) emerged as a promising solution to the accuracy, resource-utilization, and energy-efficiency challenges in machine-learning systems. While these systems are going mainstream, they have inherent security and reliability issues. In this paper, we propose NeuroAttack, a cross-layer attack that threatens the SNNs integrity by exploiting low-level reliability issues through a high-level attack. Particularly, we trigger a fault-injection based sneaky hardware backdoor through a carefully crafted adversarial input noise. Our results on Deep Neural Networks (DNNs) and SNNs show a serious integrity threat to state-of-the art machine-learning techniques. △ Less

Submitted 16 May, 2020; originally announced May 2020.

Comments: Accepted for publication at the 2020 International Joint Conference on Neural Networks (IJCNN)

arXiv:2004.07116 [pdf, other]

doi 10.1109/DAC18072.2020.9218746

Q-CapsNets: A Specialized Framework for Quantizing Capsule Networks

Authors: Alberto Marchisio, Beatrice Bussolino, Alessio Colucci, Maurizio Martina, Guido Masera, Muhammad Shafique

Abstract: Capsule Networks (CapsNets), recently proposed by the Google Brain team, have superior learning capabilities in machine learning tasks, like image classification, compared to the traditional CNNs. However, CapsNets require extremely intense computations and are difficult to be deployed in their original form at the resource-constrained edge devices. This paper makes the first attempt to quantize C… ▽ More Capsule Networks (CapsNets), recently proposed by the Google Brain team, have superior learning capabilities in machine learning tasks, like image classification, compared to the traditional CNNs. However, CapsNets require extremely intense computations and are difficult to be deployed in their original form at the resource-constrained edge devices. This paper makes the first attempt to quantize CapsNet models, to enable their efficient edge implementations, by develo** a specialized quantization framework for CapsNets. We evaluate our framework for several benchmarks. On a deep CapsNet model for the CIFAR10 dataset, the framework reduces the memory footprint by 6.2x, with only 0.15% accuracy loss. We will open-source our framework at https://git.io/JvDIF in August 2020. △ Less

Submitted 17 April, 2020; v1 submitted 15 April, 2020; originally announced April 2020.

Comments: Accepted for publication at Design Automation Conference 2020 (DAC 2020)

arXiv:1905.10142 [pdf, other]

doi 10.1109/IJCNN48605.2020.9207533

FasTrCaps: An Integrated Framework for Fast yet Accurate Training of Capsule Networks

Authors: Alberto Marchisio, Beatrice Bussolino, Alessio Colucci, Muhammad Abdullah Hanif, Maurizio Martina, Guido Masera, Muhammad Shafique

Abstract: Recently, Capsule Networks (CapsNets) have shown improved performance compared to the traditional Convolutional Neural Networks (CNNs), by encoding and preserving spatial relationships between the detected features in a better way. This is achieved through the so-called Capsules (i.e., groups of neurons) that encode both the instantiation probability and the spatial information. However, one of th… ▽ More Recently, Capsule Networks (CapsNets) have shown improved performance compared to the traditional Convolutional Neural Networks (CNNs), by encoding and preserving spatial relationships between the detected features in a better way. This is achieved through the so-called Capsules (i.e., groups of neurons) that encode both the instantiation probability and the spatial information. However, one of the major hurdles in the wide adoption of CapsNets is their gigantic training time, which is primarily due to the relatively higher complexity of their new constituting elements that are different from CNNs. In this paper, we implement different optimizations in the training loop of the CapsNets, and investigate how these optimizations affect their training speed and the accuracy. Towards this, we propose a novel framework FasTrCaps that integrates multiple lightweight optimizations and a novel learning rate policy called WarmAdaBatch (that jointly performs warm restarts and adaptive batch size), and steers them in an appropriate way to provide high training-loop speedup at minimal accuracy loss. We also propose weight sharing for capsule layers. The goal is to reduce the hardware requirements of CapsNets by removing unused/redundant connections and capsules, while kee** high accuracy through tests of different learning rate policies and batch sizes. We demonstrate that one of the solutions generated by the FasTrCaps framework can achieve 58.6% reduction in the training time, while preserving the accuracy (even 0.12% accuracy improvement for the MNIST dataset), compared to the CapsNet by Google Brain. The Pareto-optimal solutions generated by FasTrCaps can be leveraged to realize trade-offs between training time and achieved accuracy. We have open-sourced our framework on https://github.com/Alexei95/FasTrCaps. △ Less

Submitted 18 May, 2020; v1 submitted 24 May, 2019; originally announced May 2019.

Comments: Accepted for publication at the 2020 International Joint Conference on Neural Networks (IJCNN)

arXiv:1902.01147 [pdf, other]

doi 10.1109/IJCNN48605.2020.9207297

Is Spiking Secure? A Comparative Study on the Security Vulnerabilities of Spiking and Deep Neural Networks

Authors: Alberto Marchisio, Giorgio Nanfa, Faiq Khalid, Muhammad Abdullah Hanif, Maurizio Martina, Muhammad Shafique

Abstract: Spiking Neural Networks (SNNs) claim to present many advantages in terms of biological plausibility and energy efficiency compared to standard Deep Neural Networks (DNNs). Recent works have shown that DNNs are vulnerable to adversarial attacks, i.e., small perturbations added to the input data can lead to targeted or random misclassifications. In this paper, we aim at investigating the key researc… ▽ More Spiking Neural Networks (SNNs) claim to present many advantages in terms of biological plausibility and energy efficiency compared to standard Deep Neural Networks (DNNs). Recent works have shown that DNNs are vulnerable to adversarial attacks, i.e., small perturbations added to the input data can lead to targeted or random misclassifications. In this paper, we aim at investigating the key research question: ``Are SNNs secure?'' Towards this, we perform a comparative study of the security vulnerabilities in SNNs and DNNs w.r.t. the adversarial noise. Afterwards, we propose a novel black-box attack methodology, i.e., without the knowledge of the internal structure of the SNN, which employs a greedy heuristic to automatically generate imperceptible and robust adversarial examples (i.e., attack images) for the given SNN. We perform an in-depth evaluation for a Spiking Deep Belief Network (SDBN) and a DNN having the same number of layers and neurons (to obtain a fair comparison), in order to study the efficiency of our methodology and to understand the differences between SNNs and DNNs w.r.t. the adversarial examples. Our work opens new avenues of research towards the robustness of the SNNs, considering their similarities to the human brain's functionality. △ Less

Submitted 18 May, 2020; v1 submitted 4 February, 2019; originally announced February 2019.

Comments: Accepted for publication at the 2020 International Joint Conference on Neural Networks (IJCNN)

arXiv:1901.09878 [pdf, other]

CapsAttacks: Robust and Imperceptible Adversarial Attacks on Capsule Networks

Authors: Alberto Marchisio, Giorgio Nanfa, Faiq Khalid, Muhammad Abdullah Hanif, Maurizio Martina, Muhammad Shafique

Abstract: Capsule Networks preserve the hierarchical spatial relationships between objects, and thereby bears a potential to surpass the performance of traditional Convolutional Neural Networks (CNNs) in performing tasks like image classification. A large body of work has explored adversarial examples for CNNs, but their effectiveness on Capsule Networks has not yet been well studied. In our work, we perfor… ▽ More Capsule Networks preserve the hierarchical spatial relationships between objects, and thereby bears a potential to surpass the performance of traditional Convolutional Neural Networks (CNNs) in performing tasks like image classification. A large body of work has explored adversarial examples for CNNs, but their effectiveness on Capsule Networks has not yet been well studied. In our work, we perform an analysis to study the vulnerabilities in Capsule Networks to adversarial attacks. These perturbations, added to the test inputs, are small and imperceptible to humans, but can fool the network to mispredict. We propose a greedy algorithm to automatically generate targeted imperceptible adversarial examples in a black-box attack scenario. We show that this kind of attacks, when applied to the German Traffic Sign Recognition Benchmark (GTSRB), mislead Capsule Networks. Moreover, we apply the same kind of adversarial attacks to a 5-layer CNN and a 9-layer CNN, and analyze the outcome, compared to the Capsule Networks to study differences in their behavior. △ Less

Submitted 24 May, 2019; v1 submitted 28 January, 2019; originally announced January 2019.

arXiv:1811.03980 [pdf, other]

A Methodology for Automatic Selection of Activation Functions to Design Hybrid Deep Neural Networks

Authors: Alberto Marchisio, Muhammad Abdullah Hanif, Semeen Rehman, Maurizio Martina, Muhammad Shafique

Abstract: Activation functions influence behavior and performance of DNNs. Nonlinear activation functions, like Rectified Linear Units (ReLU), Exponential Linear Units (ELU) and Scaled Exponential Linear Units (SELU), outperform the linear counterparts. However, selecting an appropriate activation function is a challenging problem, as it affects the accuracy and the complexity of the given DNN. In this pape… ▽ More Activation functions influence behavior and performance of DNNs. Nonlinear activation functions, like Rectified Linear Units (ReLU), Exponential Linear Units (ELU) and Scaled Exponential Linear Units (SELU), outperform the linear counterparts. However, selecting an appropriate activation function is a challenging problem, as it affects the accuracy and the complexity of the given DNN. In this paper, we propose a novel methodology to automatically select the best-possible activation function for each layer of a given DNN, such that the overall DNN accuracy, compared to considering only one type of activation function for the whole DNN, is improved. However, an associated scientific challenge in exploring all the different configurations of activation functions would be time and resource-consuming. Towards this, our methodology identifies the Evaluation Points during learning to evaluate the accuracy in an intermediate step of training and to perform early termination by checking the accuracy gradient of the learning curve. This helps in significantly reducing the exploration time during training. Moreover, our methodology selects, for each layer, the dropout rate that optimizes the accuracy. Experiments show that we are able to achieve on average 7% to 15% Relative Error Reduction on MNIST, CIFAR-10 and CIFAR-100 benchmarks, with limited performance and power penalty on GPUs. △ Less

Submitted 27 October, 2018; originally announced November 2018.

arXiv:1105.1014 [pdf, ps, other]

Improving Network-on-Chip-based turbo decoder architectures

Authors: Maurizio Martina, Guido Masera

Abstract: In this work novel results concerning Network-on-Chip-based turbo decoder architectures are presented. Stemming from previous publications, this work concentrates first on improving the throughput by exploiting adaptive-bandwidth reduction techniques. This technique shows in the best case an improvement of more than 60 Mb/s. Moreover, it is known that double-binary turbo decoders require higher ar… ▽ More In this work novel results concerning Network-on-Chip-based turbo decoder architectures are presented. Stemming from previous publications, this work concentrates first on improving the throughput by exploiting adaptive-bandwidth reduction techniques. This technique shows in the best case an improvement of more than 60 Mb/s. Moreover, it is known that double-binary turbo decoders require higher area than binary ones. This characteristic has the negative effect of increasing the data width of the network nodes. Thus, the second contribution of this work is to reduce the network complexity to support doublebinary codes, by exploiting bit-level and pseudo-floating-point representation of the extrinsic information. These two techniques allow for an area reduction of up to more than the 40% with a performance degradation of about 0.2 dB. △ Less

Submitted 5 May, 2011; originally announced May 2011.

arXiv:1001.4694 [pdf]

VLSI Architectures for WIMAX Channel Decoders

Authors: Maurizio Martina, Guido Masera

Abstract: This chapter describes the main architectures proposed in the literature to implement the channel decoders required by the WiMax standard, namely convolutional codes, turbo codes (both block and convolutional) and LDPC. Then it shows a complete design of a convolutional turbo code encoder/decoder system for WiMax. This chapter describes the main architectures proposed in the literature to implement the channel decoders required by the WiMax standard, namely convolutional codes, turbo codes (both block and convolutional) and LDPC. Then it shows a complete design of a convolutional turbo code encoder/decoder system for WiMax. △ Less

Submitted 26 January, 2010; originally announced January 2010.

Comments: To appear in the book "WIMAX, New Developments", M. Upena, D. Dalal, Y. Kosta (Ed.), ISBN978-953-7619-53-4

arXiv:0909.1876 [pdf, ps, other]

doi 10.1109/TCSI.2010.2046257

Turbo NOC: a framework for the design of Network On Chip based turbo decoder architectures

Authors: Maurizio Martina, Guido Masera

Abstract: This work proposes a general framework for the design and simulation of network on chip based turbo decoder architectures. Several parameters in the design space are investigated, namely the network topology, the parallelism degree, the rate at which messages are sent by processing nodes over the network and the routing strategy. The main results of this analysis are: i) the most suited topologi… ▽ More This work proposes a general framework for the design and simulation of network on chip based turbo decoder architectures. Several parameters in the design space are investigated, namely the network topology, the parallelism degree, the rate at which messages are sent by processing nodes over the network and the routing strategy. The main results of this analysis are: i) the most suited topologies to achieve high throughput with a limited complexity overhead are generalized de-Bruijn and generalized Kautz topologies; ii) depending on the throughput requirements different parallelism degrees, message injection rates and routing algorithms can be used to minimize the network area overhead. △ Less

Submitted 10 September, 2009; originally announced September 2009.

Comments: submitted to IEEE Trans. on Circuits and Systems I (submission date 27 may 2009)

Showing 1–28 of 28 results for author: Martina, M