
A 65nm 36nJ/Decision Bio-inspired Temporal-Sparsity-Aware Digital Keyword Spotting IC with 0.6V Near-Threshold SRAM

Authors: Qinyu Chen, Kwantae Kim, Chang Gao, Sheng Zhou, Taekwang Jang, Tobi Delbruck, Shih-Chii Liu

Abstract: This paper introduces, to the best of the authors' knowledge, the first fine-grained temporal sparsity-aware keyword spotting (KWS) IC leveraging temporal similarities between neighboring feature vectors extracted from input frames and network hidden states, eliminating unnecessary operations and memory accesses. This KWS IC, featuring a bio-inspired delta-gated recurrent neural network (ΔRNN) classifier, achieves an 11-class Google Speech Command Dataset (GSCD) KWS accuracy of 90.5% and energy consumption of 36nJ/decision. At 87% temporal sparsity, computing latency and energy per inference are reduced by 2.4$\times$/3.4$\times$, respectively. The 65nm design occupies 0.78mm$^2$ and features two additional blocks, a compact 0.084mm$^2$ digital infinite-impulse-response (IIR)-based band-pass filter (BPF) audio feature extractor (FEx) and a 24kB 0.6V near-Vth weight SRAM with 6.6$\times$ lower read power compared to the standard SRAM.

Submitted 6 May, 2024; originally announced May 2024.

arXiv:2304.04706 [pdf, other]

Shining light on the DVS pixel: A tutorial and discussion about biasing and optimization

Authors: Rui Graça, Brian McReynolds, Tobi Delbruck

Abstract: The operation of the DVS event camera is controlled by the user through adjusting different bias parameters. These biases affect the response of the camera by controlling - among other parameters - the bandwidth, sensitivity, and maximum firing rate of the pixels. Besides determining the response of the camera to input signals, biases significantly impact its noise performance. Bias optimization is a multivariate process depending on the task and the scene, to which the user's knowledge about pixel design and non-idealities can be of great importance. In this paper, we go step-by-step along the signal pathway of the DVS pixel, shining light on its low-level operation and non-idealities, comparing pixel level measurements with array level measurements, and discussing how biasing and illumination affect the pixel's behavior. With the results and discussion presented, we aim to help DVS users achieve more hardware-aware camera utilization and modelling.

Submitted 11 April, 2023; v1 submitted 10 April, 2023; originally announced April 2023.

Comments: Accepted at 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW); 4th International Workshop on Event-Based Vision

arXiv:2304.04019 [pdf, other]

Optimal biasing and physical limits of DVS event noise

Authors: Rui Graca, Brian McReynolds, Tobi Delbruck

Abstract: Under dim lighting conditions, the output of Dynamic Vision Sensor (DVS) event cameras is strongly affected by noise. Photon and electron shot-noise cause a high rate of non-informative events that reduce Signal to Noise ratio. DVS noise performance depends not only on the scene illumination, but also on the user-controllable biasing of the camera. In this paper, we explore the physical limits of DVS noise, showing that the DVS photoreceptor is limited to a theoretical minimum of 2x photon shot noise, and we discuss how biasing the DVS with high photoreceptor bias and adequate source-follower bias approaches optimal noise performance. We support our conclusions with pixel-level measurements of a DAVIS346 and analysis of a theoretical pixel model.

Submitted 12 April, 2023; v1 submitted 8 April, 2023; originally announced April 2023.

Comments: Accepted to the 2023 International Image Sensor Workshop (IISW)

arXiv:2304.03494 [pdf, other]

Exploiting Alternating DVS Shot Noise Event Pair Statistics to Reduce Background Activity

Authors: Brian McReynolds, Rui Graca, Tobi Delbruck

Abstract: Dynamic Vision Sensors (DVS) record "events" corresponding to pixel-level brightness changes, resulting in data-efficient representation of a dynamic visual scene. As DVS expand into increasingly diverse applications, non-ideal behaviors in their output under extreme sensing conditions are important to consider. Under low illumination (below ~10 lux) their output begins to be dominated by shot noise events (SNEs) which increase the data output and obscure true signal. SNE rates can be controlled to some degree by tuning circuit parameters to reduce sensitivity or temporal response bandwidth at the cost of signal loss. Alternatively, an improved understanding of SNE statistics can be leveraged to develop novel techniques for minimizing uninformative sensor output. We first explain a fundamental observation about sequential pairing of opposite polarity SNEs based on pixel circuit logic and validate our theory using DVS recordings and simulations. Finally, we derive a practical result from this new understanding and demonstrate two novel biasing techniques to reduce SNEs by 50% and 80% respectively while still retaining sensitivity and/or temporal resolution.

Submitted 12 April, 2023; v1 submitted 7 April, 2023; originally announced April 2023.

Comments: IISW 2023, paper R5.6

arXiv:2208.00693 [pdf, other]

doi 10.1109/JSSC.2022.3195610

A 23 $μ$W Keyword Spotting IC with Ring-Oscillator-Based Time-Domain Feature Extraction

Authors: Kwantae Kim, Chang Gao, Rui Graça, Ilya Kiselev, Hoi-Jun Yoo, Tobi Delbruck, Shih-Chii Liu

Abstract: This article presents the first keyword spotting (KWS) IC which uses a ring-oscillator-based time-domain processing technique for its analog feature extractor (FEx). Its extensive usage of time-encoding schemes allows the analog audio signal to be processed in a fully time-domain manner except for the voltage-to-time conversion stage of the analog front-end. Benefiting from fundamental building blocks based on digital logic gates, it offers a better technology scalability compared to conventional voltage-domain designs. Fabricated in a 65 nm CMOS process, the prototyped KWS IC occupies 2.03mm$^{2}$ and dissipates 23 $μ$W power consumption including analog FEx and digital neural network classifier. The 16-channel time-domain FEx achieves 54.89 dB dynamic range for 16 ms frame shift size while consuming 9.3 $μ$W. The measurement result verifies that the proposed IC performs a 12-class KWS task on the Google Speech Command Dataset (GSCD) with >86% accuracy and 12.4 ms latency.

Submitted 1 August, 2022; originally announced August 2022.

Comments: 14 pages, 21 figures, 2 tables

arXiv:2109.08640 [pdf, other]

Unraveling the paradox of intensity-dependent DVS pixel noise

Authors: Rui Graca, Tobi Delbruck

Abstract: Dynamic vision sensor (DVS) event camera output is affected by noise, particularly in dim lighting conditions. A theory explaining how photon and electron noise affect DVS output events has so far not been developed. Moreover, there is no clear understanding of how DVS parameters and operating conditions affect noise. There is an apparent paradox between the real noise data observed from the DVS output and the reported noise measurements of the logarithmic photoreceptor. While measurements of the logarithmic photoreceptor predict that the photoreceptor is approximately a first-order system with RMS noise voltage independent of the photocurrent, DVS output shows higher noise event rates at low light intensity. This paper unravels this paradox by showing how the DVS photoreceptor is a second-order system, and the assumption that it is first-order is generally not reasonable. As we show, at higher photocurrents, the photoreceptor amplifier dominates the frequency response, causing a drop in RMS noise voltage and noise event rate. We bring light to the noise performance of the DVS photoreceptor by presenting a theoretical explanation supported by both transistor-level simulation results and chip measurements.

Submitted 17 September, 2021; originally announced September 2021.

Comments: Presented in 2021 International Image Sensor Workshop (IISW)

Journal ref: 2021 International Image Sensor Workshop (IISW)

arXiv:2002.03197 [pdf, other]

doi 10.1109/ICRA40945.2020.9196984

Recurrent Neural Network Control of a Hybrid Dynamic Transfemoral Prosthesis with EdgeDRNN Accelerator

Authors: Chang Gao, Rachel Gehlhar, Aaron D. Ames, Shih-Chii Liu, Tobi Delbruck

Abstract: Lower leg prostheses could improve the life quality of amputees by increasing comfort and reducing energy to locomote, but currently control methods are limited in modulating behaviors based upon the human's experience. This paper describes the first steps toward learning complex controllers for dynamical robotic assistive devices. We provide the first example of behavioral cloning to control a powered transfemoral prosthesis using a Gated Recurrent Unit (GRU) based recurrent neural network (RNN) running on a custom hardware accelerator that exploits temporal sparsity. The RNN is trained on data collected from the original prosthesis controller. The RNN inference is realized by a novel EdgeDRNN accelerator in real-time. Experimental results show that the RNN can replace the nominal PD controller to realize end-to-end control of the AMPRO3 prosthetic leg walking on flat ground and unforeseen slopes with comparable tracking accuracy. EdgeDRNN computes the RNN about 240 times faster than real time, opening the possibility of running larger networks for more complex tasks in the future. Implementing an RNN on this real-time dynamical system with impacts sets the groundwork to incorporate other learned elements of the human-prosthesis system into prosthesis control.

Submitted 28 July, 2020; v1 submitted 8 February, 2020; originally announced February 2020.

Comments: Accepted at 2020 International Conference on Robotics and Automation (ICRA 2020)

Journal ref: 2020 IEEE International Conference on Robotics and Automation (ICRA)

arXiv:1912.12193 [pdf, ps, other]

doi 10.1109/AICAS48895.2020.9074001

EdgeDRNN: Enabling Low-latency Recurrent Neural Network Edge Inference

Authors: Chang Gao, Antonio Rios-Navarro, Xi Chen, Tobi Delbruck, Shih-Chii Liu

Abstract: This paper presents a Gated Recurrent Unit (GRU) based recurrent neural network (RNN) accelerator called EdgeDRNN designed for portable edge computing. EdgeDRNN adopts the spiking neural network inspired delta network algorithm to exploit temporal sparsity in RNNs. It reduces off-chip memory access by a factor of up to 10x with tolerable accuracy loss. Experimental results on a 10 million parameter 2-layer GRU-RNN, with weights stored in DRAM, show that EdgeDRNN computes them in under 0.5 ms. With 2.42 W wall plug power on an entry level USB powered FPGA board, it achieves latency comparable with a 92 W Nvidia 1080 GPU. It outperforms NVIDIA Jetson Nano, Jetson TX2 and Intel Neural Compute Stick 2 in latency by 6X. For a batch size of 1, EdgeDRNN achieves a mean effective throughput of 20.2 GOp/s and a wall plug power efficiency that is over 4X higher than all other platforms.

Submitted 28 July, 2020; v1 submitted 22 December, 2019; originally announced December 2019.

Comments: This paper has been accepted for publication at the IEEE International Conference on Artificial Intelligence Circuits and Systems (AICAS), Genoa, 2020

Journal ref: 2020 2nd IEEE International Conference on Artificial Intelligence Circuits and Systems (AICAS)

Showing 1–8 of 8 results for author: Delbruck, T