A Hybrid SNN-ANN Network for Event-based Object Detection with Spatial and Temporal Attention

Authors: Soikat Hasan Ahmed, Jan Finkbeiner, Emre Neftci

Abstract: Event cameras offer high temporal resolution and dynamic range with minimal motion blur, making them promising for object detection tasks. While Spiking Neural Networks (SNNs) are a natural match for event-based sensory data and enable ultra-energy-efficient and low-latency inference on neuromorphic hardware, Artificial Neural Networks (ANNs) tend to display more stable training dynamics and faster convergence, resulting in greater task performance. Hybrid SNN-ANN approaches are a promising alternative, making it possible to leverage the strengths of both SNN and ANN architectures. In this work, we introduce the first Hybrid Attention-based SNN-ANN backbone for object detection using event cameras. We propose a novel Attention-based SNN-ANN bridge module to capture sparse spatial and temporal relations from the SNN layer and convert them into dense feature maps for the ANN part of the backbone. Experimental results demonstrate that our proposed method surpasses baseline hybrid and SNN-based approaches by significant margins, with results comparable to existing ANN-based methods. Extensive ablation studies confirm the effectiveness of our proposed modules and architectural choices. These results pave the way toward a hybrid SNN-ANN architecture that achieves ANN-like performance at a drastically reduced parameter budget. We implemented the SNN blocks on digital neuromorphic hardware to investigate latency and power consumption and demonstrate the feasibility of our approach.

Submitted 15 March, 2024; originally announced March 2024.

arXiv:2403.06712 [pdf, other]

The Ouroboros of Memristors: Neural Networks Facilitating Memristor Programming

Authors: Zhenming Yu, Ming-Jay Yang, Jan Finkbeiner, Sebastian Siegel, John Paul Strachan, Emre Neftci

Abstract: Memristive devices hold promise to improve the scale and efficiency of machine learning and neuromorphic hardware, thanks to their compact size, low power consumption, and the ability to perform matrix multiplications in constant time. However, on-chip training with memristor arrays still faces challenges, including device-to-device and cycle-to-cycle variations, switching non-linearity, and especially SET and RESET asymmetry. To combat device non-linearity and asymmetry, we propose to program memristors by harnessing neural networks that map desired conductance updates to the required pulse times. With our method, approximately 95% of devices can be programmed within a relative percentage difference of ±50% from the target conductance after just one attempt. Our approach substantially reduces memristor programming delays compared to traditional write-and-verify methods, presenting an advantageous solution for on-chip training scenarios. Furthermore, our proposed neural network can be accelerated by memristor arrays upon deployment, providing assistance while reducing hardware overhead compared with previous works. This work contributes significantly to the practical application of memristors, particularly in reducing delays in memristor programming. It also envisions the future development of memristor-based machine learning accelerators.

Submitted 11 March, 2024; originally announced March 2024.

Comments: This work is accepted at the 2024 IEEE AICAS

arXiv:2311.04386 [pdf, other]

Harnessing Manycore Processors with Distributed Memory for Accelerated Training of Sparse and Recurrent Models

Authors: Jan Finkbeiner, Thomas Gmeinder, Mark Pupilli, Alexander Titterton, Emre Neftci

Abstract: Current AI training infrastructure is dominated by single instruction multiple data (SIMD) and systolic array architectures, such as Graphics Processing Units (GPUs) and Tensor Processing Units (TPUs), that excel at accelerating parallel workloads and dense vector matrix multiplications. Potentially more efficient neural network models utilizing sparsity and recurrence cannot leverage the full power of SIMD processors and are thus at a severe disadvantage compared to today's prominent parallel architectures like Transformers and CNNs, thereby hindering the path towards more sustainable AI. To overcome this limitation, we explore sparse and recurrent model training on a massively parallel multiple instruction multiple data (MIMD) architecture with distributed local memory. We implement a training routine based on backpropagation through time (BPTT) for the brain-inspired class of Spiking Neural Networks (SNNs) that feature binary sparse activations. We observe a massive advantage in using sparse activation tensors with a MIMD processor, the Intelligence Processing Unit (IPU), compared to GPUs. On training workloads, our results demonstrate 5-10x throughput gains compared to A100 GPUs and up to 38x gains for higher levels of activation sparsity, without a significant slowdown in training convergence or reduction in final model performance. Furthermore, our results show highly promising trends for both single- and multi-IPU configurations as we scale up to larger model sizes. Our work paves the way towards more efficient, non-standard models via AI training hardware beyond GPUs, and competitive large-scale SNN models.

Submitted 7 November, 2023; originally announced November 2023.

arXiv:2309.03840 [pdf, other]

Generating Minimal Training Sets for Machine Learned Potentials

Authors: Jan Finkbeiner, Samuel Tovey, Christian Holm

Abstract: This letter presents a novel approach for identifying uncorrelated atomic configurations from extensive data sets with a non-standard neural network workflow known as random network distillation (RND) for training machine-learned inter-atomic potentials (MLPs). This method is coupled with a DFT workflow wherein initial data is generated with cheaper classical methods before only the minimal subset is passed to a more computationally expensive ab initio calculation. This benefits training not only by reducing the number of expensive DFT calculations required but also by providing a pathway to the use of more accurate quantum mechanical calculations for training. The method's efficacy is demonstrated by constructing machine-learned inter-atomic potentials for the molten salts KCl and NaCl. Our RND method allows accurate models to be fit on minimal data sets, as small as 32 configurations, reducing the required structures by at least one order of magnitude compared to alternative methods.

Submitted 7 September, 2023; originally announced September 2023.

Comments: 9 pages, 8 figures, letter

arXiv:2303.11860 [pdf, other]

Online Transformers with Spiking Neurons for Fast Prosthetic Hand Control

Authors: Nathan Leroux, Jan Finkbeiner, Emre Neftci

Abstract: Transformers are state-of-the-art networks for most sequence processing tasks. However, the self-attention mechanism often used in Transformers requires large time windows for each computation step and thus makes them less suitable for online signal processing compared to Recurrent Neural Networks (RNNs). In this paper, instead of the self-attention mechanism, we use a sliding window attention mechanism. We show that this mechanism is more efficient for continuous signals with finite-range dependencies between input and target, and that we can use it to process sequences element-by-element, thus making it compatible with online processing. We test our model on a finger position regression dataset (NinaproDB8) with Surface Electromyographic (sEMG) signals measured on the forearm skin to estimate muscle activities. Our approach sets the new state-of-the-art in terms of accuracy on this dataset while requiring only very short time windows of 3.5 ms at each inference step. Moreover, we increase the sparsity of the network using Leaky Integrate-and-Fire (LIF) units, a bio-inspired neuron model that activates sparsely in time solely when crossing a threshold. We thus reduce the number of synaptic operations by up to a factor of $5.3$ without loss of accuracy. Our results hold great promise for accurate and fast online processing of sEMG signals for smooth prosthetic hand control and are a step towards Transformers and Spiking Neural Networks (SNNs) co-integration for energy-efficient temporal signal processing.

Submitted 21 March, 2023; originally announced March 2023.

Comments: Preprint of 9 pages, 4 figures

arXiv:2108.01582 [pdf, other]

Efficient Data Selection Methods for the Development of Machine Learned Potentials

Authors: Jan Finkbeiner, Samuel Tovey, Christian Holm

Abstract: We present an investigation into data selection methods for the efficient sampling of configuration space as applied to the development of inter-atomic potentials for scale bridging in molecular dynamics (MD) simulations. This investigation suggests that the most efficient sampling techniques are those that incorporate information on an atomic level such as forces or atomic energies. Finally, we generate an inter-atomic potential for a sodium chloride system using each data selection technique and find that the global selection methods result in non-physical simulations.

Submitted 3 August, 2021; originally announced August 2021.

arXiv:2006.13084 [pdf, other]

Single-Shot 3D Detection of Vehicles from Monocular RGB Images via Geometry Constrained Keypoints in Real-Time

Authors: Nils Gählert, Jun-Jun Wan, Nicolas Jourdan, Jan Finkbeiner, Uwe Franke, Joachim Denzler

Abstract: In this paper we propose a novel 3D single-shot object detection method for detecting vehicles in monocular RGB images. Our approach lifts 2D detections to 3D space by predicting additional regression and classification parameters, hence keeping the runtime close to pure 2D object detection. The additional parameters are transformed to 3D bounding box keypoints within the network under geometric constraints. Our proposed method features a full 3D description including all three angles of rotation without supervision by any labeled ground truth data for the object's orientation, as it focuses on certain keypoints within the image plane. While our approach can be combined with any modern object detection framework with only a little computational overhead, we exemplify the extension of SSD for the prediction of 3D bounding boxes. We test our approach on different datasets for autonomous driving and evaluate it using the challenging KITTI 3D Object Detection as well as the novel nuScenes Object Detection benchmarks. While we achieve competitive results on both benchmarks, we outperform current state-of-the-art methods in terms of speed with more than 20 FPS for all tested datasets and image resolutions.

Submitted 23 June, 2020; originally announced June 2020.

Comments: 2020 IEEE IV Symposium

Showing 1–7 of 7 results for author: Finkbeiner, J