Showing 1–50 of 59 results for author: Beerel, P A

  1. arXiv:2405.10951

    cs.CV cs.LG

    Block Selective Reprogramming for On-device Training of Vision Transformers

    Authors: Sreetama Sarkar, Souvik Kundu, Kai Zheng, Peter A. Beerel

    Abstract: The ubiquity of vision transformers (ViTs) for various edge applications, including personalized learning, has created the demand for on-device fine-tuning. However, training with the limited memory and computation power of edge devices remains a significant challenge. In particular, the memory required for training is much higher than that needed for inference, primarily due to the need to store…

    Submitted 25 March, 2024; originally announced May 2024.

  2. arXiv:2403.13269

    cs.CL cs.AI cs.LG

    AFLoRA: Adaptive Freezing of Low Rank Adaptation in Parameter Efficient Fine-Tuning of Large Models

    Authors: Zeyu Liu, Souvik Kundu, Anni Li, Junrui Wan, Lianghao Jiang, Peter Anthony Beerel

    Abstract: We present a novel Parameter-Efficient Fine-Tuning (PEFT) method, dubbed as Adaptive Freezing of Low Rank Adaptation (AFLoRA). Specifically, for each pre-trained frozen weight tensor, we add a parallel path of trainable low-rank matrices, namely a down-projection and an up-projection matrix, each of which is followed by a feature transformation vector. Based on a novel freezing score, we the incre…

    Submitted 16 April, 2024; v1 submitted 19 March, 2024; originally announced March 2024.

    Comments: 5 pages, 5 figures

  3. arXiv:2402.15121

    cs.AR cs.ET eess.IV

    Toward High Performance, Programmable Extreme-Edge Intelligence for Neuromorphic Vision Sensors utilizing Magnetic Domain Wall Motion-based MTJ

    Authors: Md Abdullah-Al Kaiser, Gourav Datta, Peter A. Beerel, Akhilesh R. Jaiswal

    Abstract: The desire to empower resource-limited edge devices with computer vision (CV) must overcome the high energy consumption of collecting and processing vast sensory data. To address the challenge, this work proposes an energy-efficient non-von-Neumann in-pixel processing solution for neuromorphic vision sensors employing emerging (X) magnetic domain wall magnetic tunnel junction (MDWMTJ) for the firs…

    Submitted 23 February, 2024; originally announced February 2024.

    Comments: 11 pages, 7 figures, 2 tables

  4. arXiv:2402.05521

    cs.LG cs.AI cs.CR

    Linearizing Models for Efficient yet Robust Private Inference

    Authors: Sreetama Sarkar, Souvik Kundu, Peter A. Beerel

    Abstract: The growing concern about data privacy has led to the development of private inference (PI) frameworks in client-server applications which protects both data privacy and model IP. However, the cryptographic primitives required yield significant latency overhead which limits its wide-spread application. At the same time, changing environments demand the PI service to be robust against various natur…

    Submitted 8 February, 2024; originally announced February 2024.

  5. arXiv:2402.04882

    cs.NE cs.AI cs.LG cs.SD eess.AS

    LMUFormer: Low Complexity Yet Powerful Spiking Model With Legendre Memory Units

    Authors: Zeyu Liu, Gourav Datta, Anni Li, Peter Anthony Beerel

    Abstract: Transformer models have demonstrated high accuracy in numerous applications but have high complexity and lack sequential processing capability making them ill-suited for many streaming applications at the edge where devices are heavily resource-constrained. Thus motivated, many researchers have proposed reformulating the transformer models as RNN modules which modify the self-attention computation…

    Submitted 19 January, 2024; originally announced February 2024.

    Comments: The 12th International Conference on Learning Representations (ICLR 2024)

  6. arXiv:2401.07393

    cs.ET

    A Novel Optimization Algorithm for Buffer and Splitter Minimization in Phase-Skipping Adiabatic Quantum-Flux-Parametron Circuits

    Authors: Robert S. Aviles, Peter A. Beerel

    Abstract: Adiabatic Quantum-Flux-Parametron (AQFP) logic is a promising emerging device technology that promises six orders of magnitude lower power than CMOS. However, AQFP is challenged by operation at only ultra-low temperatures, has high latency and area, and requires a complex clocking scheme. In particular, every logic gate, buffer, and splitter must be clocked and each pair of connected clocked gates…

    Submitted 14 January, 2024; originally announced January 2024.

  7. arXiv:2401.06411

    cs.ET

    An Efficient and Scalable Clocking Assignment Algorithm for Multi-Threaded Multi-Phase Single Flux Quantum Circuits

    Authors: Robert S. Aviles, Xi Li, Lei Lu, Zhaorui Ni, Peter A. Beerel

    Abstract: A key distinguishing feature of single flux quantum (SFQ) circuits is that each logic gate is clocked. This feature forces the introduction of path-balancing flip-flops to ensure proper synchronization of inputs at each gate. This paper proposes a polynomial time complexity approximation algorithm for clocking assignments that minimizes the insertion of path balancing buffers for multi-threaded mu…

    Submitted 12 January, 2024; originally announced January 2024.

  8. arXiv:2312.06900

    cs.CV

    When Bio-Inspired Computing meets Deep Learning: Low-Latency, Accurate, & Energy-Efficient Spiking Neural Networks from Artificial Neural Networks

    Authors: Gourav Datta, Zeyu Liu, James Diffenderfer, Bhavya Kailkhura, Peter A. Beerel

    Abstract: Bio-inspired Spiking Neural Networks (SNN) are now demonstrating comparable accuracy to intricate convolutional neural networks (CNN), all while delivering remarkable energy and latency efficiency when deployed on neuromorphic hardware. In particular, ANN-to-SNN conversion has recently gained significant traction in developing deep SNNs with close to state-of-the-art (SOTA) test accuracy on comple…

    Submitted 11 December, 2023; originally announced December 2023.

    Comments: Under review

  9. arXiv:2312.01213

    cs.AR cs.AI

    Recent Advances in Scalable Energy-Efficient and Trustworthy Spiking Neural networks: from Algorithms to Technology

    Authors: Souvik Kundu, Rui-Jie Zhu, Akhilesh Jaiswal, Peter A. Beerel

    Abstract: Neuromorphic computing and, in particular, spiking neural networks (SNNs) have become an attractive alternative to deep neural networks for a broad range of signal processing applications, processing static and/or temporal inputs from different sensory modalities, including audio and vision sensors. In this paper, we start with a description of recent advances in algorithmic and optimization innov…

    Submitted 2 December, 2023; originally announced December 2023.

    Comments: 5 pages, 3 figures

  10. arXiv:2311.16456

    cs.CV eess.IV

    Spiking Neural Networks with Dynamic Time Steps for Vision Transformers

    Authors: Gourav Datta, Zeyu Liu, Anni Li, Peter A. Beerel

    Abstract: Spiking Neural Networks (SNNs) have emerged as a popular spatio-temporal computing paradigm for complex vision tasks. Recently proposed SNN training algorithms have significantly reduced the number of time steps (down to 1) for improved latency and energy efficiency, however, they target only convolutional neural networks (CNN). These algorithms, when applied on the recently spotlighted vision tra…

    Submitted 27 November, 2023; originally announced November 2023.

    Comments: Under review

  11. arXiv:2310.11637

    eess.IV

    FixPix: Fixing Bad Pixels using Deep Learning

    Authors: Sreetama Sarkar, Xinan Ye, Gourav Datta, Peter A. Beerel

    Abstract: Efficient and effective on-line detection and correction of bad pixels can improve yield and increase the expected lifetime of image sensors. This paper presents a comprehensive Deep Learning (DL) based on-line detection-correction approach, suitable for a wide range of pixel corruption rates. A confidence calibrated segmentation approach is introduced, which achieves nearly perfect bad pixel dete…

    Submitted 17 October, 2023; originally announced October 2023.

  12. arXiv:2309.07254

    cs.CV cs.CR

    Mitigate Replication and Copying in Diffusion Models with Generalized Caption and Dual Fusion Enhancement

    Authors: Chenghao Li, Dake Chen, Yuke Zhang, Peter A. Beerel

    Abstract: While diffusion models demonstrate a remarkable capability for generating high-quality images, their tendency to `replicate' training data raises privacy concerns. Although recent research suggests that this replication may stem from the insufficient generalization of training data captions and duplication of training images, effective mitigation strategies remain elusive. To address this gap, our…

    Submitted 23 January, 2024; v1 submitted 13 September, 2023; originally announced September 2023.

    Comments: This paper has been accepted for presentation at 2024 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2024)

  13. Island-based Random Dynamic Voltage Scaling vs ML-Enhanced Power Side-Channel Attacks

    Authors: Dake Chen, Christine Goins, Maxwell Waugaman, Georgios D. Dimou, Peter A. Beerel

    Abstract: In this paper, we describe and analyze an island-based random dynamic voltage scaling (iRDVS) approach to thwart power side-channel attacks. We first analyze the impact of the number of independent voltage islands on the resulting signal-to-noise ratio and trace misalignment. As part of our analysis of misalignment, we propose a novel unsupervised machine learning (ML) based attack that is effecti…

    Submitted 13 June, 2023; v1 submitted 7 June, 2023; originally announced June 2023.

    Journal ref: Proceedings of the Great Lakes Symposium on VLSI 2023

  14. arXiv:2305.00107

    cs.CR eess.SY

    Unraveling Latch Locking Using Machine Learning, Boolean Analysis, and ILP

    Authors: Dake Chen, Xuan Zhou, Yinghua Hu, Yuke Zhang, Kaixin Yang, Andrew Rittenbach, Pierluigi Nuzzo, Peter A. Beerel

    Abstract: Logic locking has become a promising approach to provide hardware security in the face of a possibly insecure fabrication supply chain. While many techniques have focused on locking combinational logic (CL), an alternative latch-locking approach in which the sequential elements are locked has also gained significant attention. Latch (LAT) locking duplicates a subset of the flip-flops (FF) of a des…

    Submitted 28 April, 2023; originally announced May 2023.

    Comments: 8 pages, 7 figures, accepted by ISQED 2023

    ACM Class: B.m; C.m

  15. arXiv:2304.13274

    cs.LG cs.CR

    Making Models Shallow Again: Jointly Learning to Reduce Non-Linearity and Depth for Latency-Efficient Private Inference

    Authors: Souvik Kundu, Yuke Zhang, Dake Chen, Peter A. Beerel

    Abstract: Large number of ReLU and MAC operations of Deep neural networks make them ill-suited for latency and compute-efficient private inference. In this paper, we present a model optimization method that allows a model to learn to be shallow. In particular, we leverage the ReLU sensitivity of a convolutional block to remove a ReLU layer and merge its succeeding and preceding convolution layers to a shall…

    Submitted 26 April, 2023; originally announced April 2023.

  16. arXiv:2304.13266

    cs.CR

    C2PI: An Efficient Crypto-Clear Two-Party Neural Network Private Inference

    Authors: Yuke Zhang, Dake Chen, Souvik Kundu, Haomei Liu, Ruiheng Peng, Peter A. Beerel

    Abstract: Recently, private inference (PI) has addressed the rising concern over data and model privacy in machine learning inference as a service. However, existing PI frameworks suffer from high computational and communication costs due to the expensive multi-party computation (MPC) protocols. Existing literature has developed lighter MPC protocols to yield more efficient PI schemes. We, in contrast, prop…

    Submitted 25 April, 2023; originally announced April 2023.

  17. Technology-Circuit-Algorithm Tri-Design for Processing-in-Pixel-in-Memory (P2M)

    Authors: Md Abdullah-Al Kaiser, Gourav Datta, Sreetama Sarkar, Souvik Kundu, Zihan Yin, Manas Garg, Ajey P. Jacob, Peter A. Beerel, Akhilesh R. Jaiswal

    Abstract: The massive amounts of data generated by camera sensors motivate data processing inside pixel arrays, i.e., at the extreme-edge. Several critical developments have fueled recent interest in the processing-in-pixel-in-memory paradigm for a wide range of visual machine intelligence tasks, including (1) advances in 3D integration technology to enable complex processing inside each pixel in a 3D integ…

    Submitted 6 April, 2023; originally announced April 2023.

    Journal ref: GLSVLSI '23: Great Lakes Symposium on VLSI 2023 Proceedings

  18. ViTA: A Vision Transformer Inference Accelerator for Edge Applications

    Authors: Shashank Nag, Gourav Datta, Souvik Kundu, Nitin Chandrachoodan, Peter A. Beerel

    Abstract: Vision Transformer models, such as ViT, Swin Transformer, and Transformer-in-Transformer, have recently gained significant traction in computer vision tasks due to their ability to capture the global relation between features which leads to superior performance. However, they are compute-heavy and difficult to deploy in resource-constrained edge devices. Existing hardware accelerators, including t…

    Submitted 17 February, 2023; originally announced February 2023.

    Comments: Accepted at ISCAS 2023

    Journal ref: 2023 IEEE International Symposium on Circuits and Systems (ISCAS), Monterey, CA, USA, 2023, pp. 1-5

  19. arXiv:2302.03523

    cs.CV

    Sparse Mixture Once-for-all Adversarial Training for Efficient In-Situ Trade-Off Between Accuracy and Robustness of DNNs

    Authors: Souvik Kundu, Sairam Sundaresan, Sharath Nittur Sridhar, Shunlin Lu, Han Tang, Peter A. Beerel

    Abstract: Existing deep neural networks (DNNs) that achieve state-of-the-art (SOTA) performance on both clean and adversarially-perturbed images rely on either activation or weight conditioned convolution operations. However, such conditional learning costs additional multiply-accumulate (MAC) or addition operations, increasing inference memory and compute costs. To that end, we present a sparse mixture onc…

    Submitted 27 December, 2022; originally announced February 2023.

    Comments: 5 pages, 5 figures, 2 tables

  20. arXiv:2301.09254

    cs.CV cs.AI cs.LG

    Learning to Linearize Deep Neural Networks for Secure and Efficient Private Inference

    Authors: Souvik Kundu, Shunlin Lu, Yuke Zhang, Jacqueline Liu, Peter A. Beerel

    Abstract: The large number of ReLU non-linearity operations in existing deep neural networks makes them ill-suited for latency-efficient private inference (PI). Existing techniques to reduce ReLU operations often involve manual effort and sacrifice significant accuracy. In this paper, we first present a novel measure of non-linearity layers' ReLU sensitivity, enabling mitigation of the time-consuming manual…

    Submitted 22 January, 2023; originally announced January 2023.

    Comments: 15 pages, 10 figures, 11 tables. Accepted as a conference paper at ICLR 2023

  21. arXiv:2301.09111

    eess.IV

    Neuromorphic-P2M: Processing-in-Pixel-in-Memory Paradigm for Neuromorphic Image Sensors

    Authors: Md Abdullah-Al Kaiser, Gourav Datta, Zixu Wang, Ajey P. Jacob, Peter A. Beerel, Akhilesh R. Jaiswal

    Abstract: Edge devices equipped with computer vision must deal with vast amounts of sensory data with limited computing resources. Hence, researchers have been exploring different energy-efficient solutions such as near-sensor processing, in-sensor processing, and in-pixel processing, bringing the computation closer to the sensor. In particular, in-pixel processing embeds the computation capabilities inside…

    Submitted 22 January, 2023; originally announced January 2023.

    Comments: 17 pages, 11 figures, 2 tables

  22. arXiv:2212.10881

    cs.CV

    In-Sensor & Neuromorphic Computing are all you need for Energy Efficient Computer Vision

    Authors: Gourav Datta, Zeyu Liu, Md Abdullah-Al Kaiser, Souvik Kundu, Joe Mathai, Zihan Yin, Ajey P. Jacob, Akhilesh R. Jaiswal, Peter A. Beerel

    Abstract: Due to the high activation sparsity and use of accumulates (AC) instead of expensive multiply-and-accumulates (MAC), neuromorphic spiking neural networks (SNNs) have emerged as a promising low-power alternative to traditional DNNs for several computer vision (CV) applications. However, most existing SNNs require multiple time steps for acceptable inference accuracy, hindering real-time deployment…

    Submitted 21 December, 2022; originally announced December 2022.

  23. arXiv:2212.10170

    cs.CV

    Hoyer regularizer is all you need for ultra low-latency spiking neural networks

    Authors: Gourav Datta, Zeyu Liu, Peter A. Beerel

    Abstract: Spiking Neural networks (SNN) have emerged as an attractive spatio-temporal computing paradigm for a wide range of low-power vision tasks. However, state-of-the-art (SOTA) SNN models either incur multiple time steps which hinder their deployment in real-time use cases or increase the training complexity significantly. To mitigate this concern, we present a training framework (from scratch) for one…

    Submitted 20 December, 2022; originally announced December 2022.

    Comments: 16 pages, 4 figures

  24. arXiv:2211.04515

    cs.DC

    QuantPipe: Applying Adaptive Post-Training Quantization for Distributed Transformer Pipelines in Dynamic Edge Environments

    Authors: Haonan Wang, Connor Imes, Souvik Kundu, Peter A. Beerel, Stephen P. Crago, John Paul Walters

    Abstract: Pipeline parallelism has achieved great success in deploying large-scale transformer models in cloud environments, but has received less attention in edge environments. Unlike in cloud scenarios with high-speed and stable network interconnects, dynamic bandwidth in edge systems can degrade distributed pipeline performance. We address this issue with QuantPipe, a communication-efficient distributed…

    Submitted 8 November, 2022; originally announced November 2022.

  25. arXiv:2210.12613

    cs.NE cs.ET

    Towards Energy-Efficient, Low-Latency and Accurate Spiking LSTMs

    Authors: Gourav Datta, Haoqin Deng, Robert Aviles, Peter A. Beerel

    Abstract: Spiking Neural Networks (SNNs) have emerged as an attractive spatio-temporal computing paradigm for complex vision tasks. However, most existing works yield models that require many time steps and do not leverage the inherent temporal dynamics of spiking neural networks, even for sequential tasks. Motivated by this observation, we propose an optimized spiking long short-term memory networks (…

    Submitted 23 October, 2022; originally announced October 2022.

  26. arXiv:2210.05451

    cs.CV eess.IV

    Enabling ISP-less Low-Power Computer Vision

    Authors: Gourav Datta, Zeyu Liu, Zihan Yin, Linyu Sun, Akhilesh R. Jaiswal, Peter A. Beerel

    Abstract: In order to deploy current computer vision (CV) models on resource-constrained low-power devices, recent works have proposed in-sensor and in-pixel computing approaches that try to partly/fully bypass the image signal processor (ISP) and yield significant bandwidth reduction between the image sensor and the CV processing unit by downsampling the activation maps in the initial convolutional neural…

    Submitted 11 October, 2022; originally announced October 2022.

    Comments: Accepted to WACV 2023

  27. arXiv:2205.14285

    eess.IV

    P2M-DeTrack: Processing-in-Pixel-in-Memory for Energy-efficient and Real-Time Multi-Object Detection and Tracking

    Authors: Gourav Datta, Souvik Kundu, Zihan Yin, Joe Mathai, Zeyu Liu, Zixu Wang, Mulin Tian, Shunlin Lu, Ravi T. Lakkireddy, Andrew Schmidt, Wael Abd-Almageed, Ajey P. Jacob, Akhilesh R. Jaiswal, Peter A. Beerel

    Abstract: Today's high resolution, high frame rate cameras in autonomous vehicles generate a large volume of data that needs to be transferred and processed by a downstream processor or machine learning (ML) accelerator to enable intelligent computing tasks, such as multi-object detection and tracking. The massive amount of data transfer incurs significant energy, latency, and bandwidth bottlenecks, which h…

    Submitted 27 May, 2022; originally announced May 2022.

    Comments: 6 pages, 4 figures, 4 tables

  28. arXiv:2204.00426

    cs.CV

    A Fast and Efficient Conditional Learning for Tunable Trade-Off between Accuracy and Robustness

    Authors: Souvik Kundu, Sairam Sundaresan, Massoud Pedram, Peter A. Beerel

    Abstract: Existing models that achieve state-of-the-art (SOTA) performance on both clean and adversarially-perturbed images rely on convolution operations conditioned with feature-wise linear modulation (FiLM) layers. These layers require many new parameters and are hyperparameter sensitive. They significantly increase training time, memory cost, and potential latency which can prove costly for resource-lim…

    Submitted 28 March, 2022; originally announced April 2022.

    Comments: 14 pages, 10 figures, 1 table

  29. arXiv:2203.05696

    eess.IV cs.CV

    Toward Efficient Hyperspectral Image Processing inside Camera Pixels

    Authors: Gourav Datta, Zihan Yin, Ajey Jacob, Akhilesh R. Jaiswal, Peter A. Beerel

    Abstract: Hyperspectral cameras generate a large amount of data due to the presence of hundreds of spectral bands as opposed to only three channels (red, green, and blue) in traditional cameras. This requires a significant amount of data transmission between the hyperspectral image sensor and a processor used to classify/detect/track the images, frame by frame, expending high energy and causing bandwidth an…

    Submitted 10 March, 2022; originally announced March 2022.

    Comments: 6 pages, 3 figures

  30. arXiv:2203.04737

    cs.LG cs.AR cs.CV

    P2M: A Processing-in-Pixel-in-Memory Paradigm for Resource-Constrained TinyML Applications

    Authors: Gourav Datta, Souvik Kundu, Zihan Yin, Ravi Teja Lakkireddy, Joe Mathai, Ajey Jacob, Peter A. Beerel, Akhilesh R. Jaiswal

    Abstract: The demand to process vast amounts of data generated from state-of-the-art high resolution cameras has motivated novel energy-efficient on-device AI solutions. Visual data in such cameras are usually captured in the form of analog voltages by a sensor pixel array, and then converted to the digital domain for subsequent AI processing using analog-to-digital converters (ADC). Recent research has tri…

    Submitted 16 March, 2022; v1 submitted 6 March, 2022; originally announced March 2022.

    Comments: 15 pages, 8 figures

  31. arXiv:2201.05943

    cs.CR eess.SY

    TriLock: IC Protection with Tunable Corruptibility and Resilience to SAT and Removal Attacks

    Authors: Yuke Zhang, Yinghua Hu, Pierluigi Nuzzo, Peter A. Beerel

    Abstract: Sequential logic locking has been studied over the last decade as a method to protect sequential circuits from reverse engineering. However, most of the existing sequential logic locking techniques are threatened by increasingly more sophisticated SAT-based attacks, efficiently using input queries to a SAT solver to rule out incorrect keys, as well as removal attacks based on structural analysis.…

    Submitted 15 January, 2022; originally announced January 2022.

    Comments: Accepted at Design, Automation and Test in Europe Conference (DATE), 2022

  32. arXiv:2112.13843

    cs.CV cs.LG

    BMPQ: Bit-Gradient Sensitivity Driven Mixed-Precision Quantization of DNNs from Scratch

    Authors: Souvik Kundu, Shikai Wang, Qirui Sun, Peter A. Beerel, Massoud Pedram

    Abstract: Large DNNs with mixed-precision quantization can achieve ultra-high compression while retaining high classification performance. However, because of the challenges in finding an accurate metric that can guide the optimization process, these methods either sacrifice significant performance compared to the 32-bit floating-point (FP-32) baseline or rely on a compute-expensive, iterative training poli…

    Submitted 23 December, 2021; originally announced December 2021.

    Comments: 4 pages, 2 figures, 2 tables

  33. arXiv:2112.12133

    cs.CV

    Can Deep Neural Networks be Converted to Ultra Low-Latency Spiking Neural Networks?

    Authors: Gourav Datta, Peter A. Beerel

    Abstract: Spiking neural networks (SNNs), that operate via binary spikes distributed over time, have emerged as a promising energy efficient ML paradigm for resource-constrained devices. However, the current state-of-the-art (SOTA) SNNs require multiple time steps for acceptable inference accuracy, increasing spiking activity and, consequently, energy consumption. SOTA training strategies for SNNs involve c…

    Submitted 22 December, 2021; originally announced December 2021.

    Comments: Accepted to DATE 2022

  34. arXiv:2110.14895

    cs.DC cs.LG

    Pipeline Parallelism for Inference on Heterogeneous Edge Computing

    Authors: Yang Hu, Connor Imes, Xuanang Zhao, Souvik Kundu, Peter A. Beerel, Stephen P. Crago, John Paul N. Walters

    Abstract: Deep neural networks with large model sizes achieve state-of-the-art results for tasks in computer vision (CV) and natural language processing (NLP). However, these large-scale models are too compute- or memory-intensive for resource-constrained edge devices. Prior works on parallel and distributed execution primarily focus on training -- rather than inference -- using homogeneous accelerators in…

    Submitted 28 October, 2021; originally announced October 2021.

  35. arXiv:2110.11417

    cs.CV cs.AI

    HIRE-SNN: Harnessing the Inherent Robustness of Energy-Efficient Deep Spiking Neural Networks by Training with Crafted Input Noise

    Authors: Souvik Kundu, Massoud Pedram, Peter A. Beerel

    Abstract: Low-latency deep spiking neural networks (SNNs) have become a promising alternative to conventional artificial neural networks (ANNs) because of their potential for increased energy efficiency on event-driven neuromorphic hardware. Neural networks, including SNNs, however, are subject to various adversarial attacks and must be trained to remain resilient against such attacks for many applications.…

    Submitted 6 October, 2021; originally announced October 2021.

    Comments: 10 pages, 11 figures, 7 tables, International Conference on Computer Vision

  36. Fun-SAT: Functional Corruptibility-Guided SAT-Based Attack on Sequential Logic Encryption

    Authors: Yinghua Hu, Yuke Zhang, Kaixin Yang, Dake Chen, Peter A. Beerel, Pierluigi Nuzzo

    Abstract: The SAT attack has shown to be efficient against most combinational logic encryption methods. It can be extended to attack sequential logic encryption techniques by leveraging circuit unrolling and model checking methods. However, with no guidance on the number of times that a circuit needs to be unrolled to find the correct key, the attack tends to solve many time-consuming Boolean satisfiability…

    Submitted 10 August, 2021; originally announced August 2021.

    Comments: Accepted at IEEE International Symposium on Hardware Oriented Security and Trust (HOST) 2021

  37. arXiv:2108.03719

    cond-mat.supr-con

    A High Performance and Robust FIFO Synchronizer-Interface for Crossing Clock Domains in SFQ Logic

    Authors: Gourav Datta, Shidie Lin, Peter A. Beerel

    Abstract: Digital single-flux quantum (SFQ) technology promises to meet the demands of ultra low power and high speed computing needed for future exascale supercomputing platforms. However, clocking SFQ logic circuits remains a challenge due to the presence of a large number of on-chip clock sinks, and hence, decomposing large designs into multiple independent clock domains similar to CMOS, have been propos…

    Submitted 8 August, 2021; originally announced August 2021.

    Comments: 5 pages, 5 figures, under review in IEEE Transactions on Circuits and Systems-II

  38. arXiv:2107.12445

    cs.NE cs.LG

    Towards Low-Latency Energy-Efficient Deep SNNs via Attention-Guided Compression

    Authors: Souvik Kundu, Gourav Datta, Massoud Pedram, Peter A. Beerel

    Abstract: Deep spiking neural networks (SNNs) have emerged as a potential alternative to traditional deep learning frameworks, due to their promise to provide increased compute efficiency on event-driven neuromorphic hardware. However, to perform well on complex vision applications, most SNN training frameworks yield large inference latency which translates to increased spike activity and reduced energy eff…

    Submitted 16 July, 2021; originally announced July 2021.

    Comments: 10 Pages, 8 Figures, 5 Tables

  39. arXiv:2107.12374

    cs.NE

    Training Energy-Efficient Deep Spiking Neural Networks with Single-Spike Hybrid Input Encoding

    Authors: Gourav Datta, Souvik Kundu, Peter A. Beerel

    Abstract: Spiking Neural Networks (SNNs) have emerged as an attractive alternative to traditional deep learning frameworks, since they provide higher computational efficiency in event driven neuromorphic hardware. However, the state-of-the-art (SOTA) SNNs suffer from high inference latency, resulting from inefficient input encoding and training techniques. The most widely used input coding schemes, such as…

    Submitted 26 July, 2021; originally announced July 2021.

    Comments: Extended version of arXiv:2107.11979

  40. arXiv:2107.11979

    cs.NE

    HYPER-SNN: Towards Energy-efficient Quantized Deep Spiking Neural Networks for Hyperspectral Image Classification

    Authors: Gourav Datta, Souvik Kundu, Akhilesh R. Jaiswal, Peter A. Beerel

    Abstract: Hyper spectral images (HSI) provide rich spectral and spatial information across a series of contiguous spectral bands. However, the accurate processing of the spectral and spatial correlation between the bands requires the use of energy-expensive 3-D Convolutional Neural Networks (CNNs). To address this challenge, we propose the use of Spiking Neural Networks (SNNs) that are generated from iso-ar…

    Submitted 28 July, 2021; v1 submitted 26 July, 2021; originally announced July 2021.

  41. arXiv:2101.12279

    cs.CR

    GF-Flush: A GF(2) Algebraic Attack on Secure Scan Chains

    Authors: Dake Chen, Chunxiao Lin, Peter A. Beerel

    Abstract: Scan chains provide increased controllability and observability for testing digital circuits. The increased testability, however, can also be a source of information leakage for sensitive designs. The state-of-the-art defenses to secure scan chains apply dynamic keys to pseudo-randomly invert the scan vectors. In this paper, we pinpoint an algebraic vulnerability of these dynamic defenses that inv…

    Submitted 28 January, 2021; originally announced January 2021.

    Comments: Submitted to IEEE Transactions on Computer-Aided Design of Integrated Circuits And Systems

  42. arXiv:2011.03083

    cs.CV cs.CR cs.LG

    A Tunable Robust Pruning Framework Through Dynamic Network Rewiring of DNNs

    Authors: Souvik Kundu, Mahdi Nazemi, Peter A. Beerel, Massoud Pedram

    Abstract: This paper presents a dynamic network rewiring (DNR) method to generate pruned deep neural network (DNN) models that are robust against adversarial attacks yet maintain high accuracy on clean images. In particular, the disclosed DNR method is based on a unified constrained optimization formulation using a hybrid loss function that merges ultra-high model compression with robust adversarial trainin…

    Submitted 24 November, 2020; v1 submitted 3 November, 2020; originally announced November 2020.

    Comments: 8 pages, 4 figures, conference paper

  43. arXiv:2004.10655  [pdf, other]

    cs.LO

    Formal Verification of Flow Equivalence in Desynchronized Designs

    Authors: Jennifer Paykin, Brian Huffman, Daniel M. Zimmerman, Peter A. Beerel

    Abstract: Seminal work by Cortadella, Kondratyev, Lavagno, and Sotiriou includes a hand-written proof that a particular handshaking protocol preserves flow equivalence, a notion of equivalence between synchronous latch-based specifications and their desynchronized bundled-data asynchronous implementations. In this work we identify a counterexample to Cortadella et al.'s proof illustrating how their protocol…

    Submitted 6 April, 2020; originally announced April 2020.

    Comments: To appear in ASYNC 2020

  44. arXiv:2004.04804  [pdf, other]

    physics.app-ph

    Modeling and Characterization of Metastability in Single Flux Quantum (SFQ) Synchronizers

    Authors: Gourav Datta, Peter A. Beerel

    Abstract: Despite the promise of low-power, high-frequency operation in single-flux quantum (SFQ) technology, scaling these circuits remains a serious challenge that motivates the support of multiple SFQ clock domains. Towards this end, this paper analyzes the impact of setup time violations and metastability in SFQ circuits, comparing the derived analytical models to their CMOS counterparts. It then extends this…

    Submitted 9 April, 2020; originally announced April 2020.

  45. arXiv:2004.00974  [pdf, other]

    cs.LG cs.CV stat.ML

    Deep-n-Cheap: An Automated Search Framework for Low Complexity Deep Learning

    Authors: Sourya Dey, Saikrishna C. Kanala, Keith M. Chugg, Peter A. Beerel

    Abstract: We present Deep-n-Cheap -- an open-source AutoML framework to search for deep learning models. This search includes both architecture and training hyperparameters, and supports convolutional neural networks and multi-layer perceptrons. Our framework is targeted for deployment on both benchmark and custom datasets, and as a result, offers a greater degree of search space customizability as compared…

    Submitted 5 September, 2020; v1 submitted 27 March, 2020; originally announced April 2020.

    Comments: Accepted as a conference paper at ACML 2020

  46. arXiv:2001.10715  [pdf, other]

    cs.AR

    qBSA: Logic Design of a 32-bit Block-Skewed RSFQ Arithmetic Logic Unit

    Authors: Souvik Kundu, Gourav Datta, Peter A. Beerel, Massoud Pedram

    Abstract: Single flux quantum (SFQ) circuits are an attractive beyond-CMOS technology because they promise two orders of magnitude lower power at clock frequencies exceeding 25 GHz. However, every SFQ gate is clocked, creating very deep gate-level pipelines that are difficult to keep full, particularly for sequences that include data-dependent operations. This paper proposes to increase the throughput of SFQ…

    Submitted 29 January, 2020; originally announced January 2020.

    Comments: 3 pages, 3 figures

  47. arXiv:2001.10710  [pdf, other]

    cs.CV cs.LG cs.PF

    Pre-defined Sparsity for Low-Complexity Convolutional Neural Networks

    Authors: Souvik Kundu, Mahdi Nazemi, Massoud Pedram, Keith M. Chugg, Peter A. Beerel

    Abstract: The high energy cost of processing deep convolutional neural networks impedes their ubiquitous deployment in energy-constrained platforms such as embedded systems and IoT devices. This work introduces convolutional layers with pre-defined sparse 2D kernels that have support sets that repeat periodically within and across filters. Due to the efficient storage of our periodic sparse kernels, the par…

    Submitted 4 February, 2020; v1 submitted 29 January, 2020; originally announced January 2020.

    Comments: 14 pages, 13 figures

  48. SERAD: Soft Error Resilient Asynchronous Design using a Bundled Data Protocol

    Authors: Sai Aparna Aketi, Smriti Gupta, Huimei Cheng, Joycee Mekie, Peter A. Beerel

    Abstract: The risk of soft errors due to radiation continues to be a significant challenge for engineers trying to build systems that can handle harsh environments. Building systems that are Radiation Hardened by Design (RHBD) is the preferred approach, but existing techniques are expensive in terms of performance, power, and/or area. This paper introduces a novel soft-error resilient asynchronous bundled-d…

    Submitted 12 January, 2020; originally announced January 2020.

  49. arXiv:1910.09876  [pdf, other]

    cs.LG stat.ML

    Neural Network Training with Approximate Logarithmic Computations

    Authors: Arnab Sanyal, Peter A. Beerel, Keith M. Chugg

    Abstract: The high computational complexity associated with training deep neural networks limits online and real-time training on edge devices. This paper proposes an end-to-end training and inference scheme that eliminates multiplications by using approximate operations in the log domain, which has the potential to significantly reduce implementation complexity. We implement the entire training procedure in the l…

    Submitted 22 October, 2019; originally announced October 2019.

  50. arXiv:1910.04907  [pdf, other]

    cs.ET physics.app-ph

    Metastability-Resilient Synchronization FIFO for SFQ Logic

    Authors: Gourav Datta, Haolin Cong, Souvik Kundu, Peter A. Beerel

    Abstract: Digital single-flux quantum (SFQ) technology promises to meet the demands of ultra-low-power and high-speed computing needed for future exascale supercomputing systems. The combination of ultra-high clock frequencies, gate-level pipelines, and numerous sources of variability in SFQ circuits, however, makes low-skew global clock distribution a challenge. This motivates the support of multiple indepe…

    Submitted 23 October, 2019; v1 submitted 10 October, 2019; originally announced October 2019.

    Comments: Accepted in ISEC 2019