-
NeuroBench: A Framework for Benchmarking Neuromorphic Computing Algorithms and Systems
Authors:
Jason Yik,
Korneel Van den Berghe,
Douwe den Blanken,
Younes Bouhadjar,
Maxime Fabre,
Paul Hueber,
Denis Kleyko,
Noah Pacik-Nelson,
Pao-Sheng Vincent Sun,
Guangzhi Tang,
Shenqi Wang,
Biyan Zhou,
Soikat Hasan Ahmed,
George Vathakkattil Joseph,
Benedetto Leto,
Aurora Micheli,
Anurag Kumar Mishra,
Gregor Lenz,
Tao Sun,
Zergham Ahmed,
Mahmoud Akl,
Brian Anderson,
Andreas G. Andreou,
Chiara Bartolozzi,
Arindam Basu
, et al. (73 additional authors not shown)
Abstract:
Neuromorphic computing shows promise for advancing computing efficiency and capabilities of AI applications using brain-inspired principles. However, the neuromorphic research field currently lacks standardized benchmarks, making it difficult to accurately measure technological advancements, compare performance with conventional methods, and identify promising future research directions. Prior neuromorphic computing benchmark efforts have not seen widespread adoption due to a lack of inclusive, actionable, and iterative benchmark design and guidelines. To address these shortcomings, we present NeuroBench: a benchmark framework for neuromorphic computing algorithms and systems. NeuroBench is a collaboratively-designed effort from an open community of nearly 100 co-authors across over 50 institutions in industry and academia, aiming to provide a representative structure for standardizing the evaluation of neuromorphic approaches. The NeuroBench framework introduces a common set of tools and systematic methodology for inclusive benchmark measurement, delivering an objective reference framework for quantifying neuromorphic approaches in both hardware-independent (algorithm track) and hardware-dependent (system track) settings. In this article, we present initial performance baselines across various model architectures on the algorithm track and outline the system track benchmark tasks and guidelines. NeuroBench is intended to continually expand its benchmarks and features to foster and track the progress made by the research community.
Submitted 17 January, 2024; v1 submitted 10 April, 2023;
originally announced April 2023.
-
Unleashing the Potential of Spiking Neural Networks by Dynamic Confidence
Authors:
Chen Li,
Edward Jones,
Steve Furber
Abstract:
This paper presents a new methodology to alleviate the fundamental trade-off between accuracy and latency in spiking neural networks (SNNs). The approach involves decoding confidence information over time from the SNN outputs and using it to develop a decision-making agent that can dynamically determine when to terminate each inference.
The proposed method, Dynamic Confidence, provides several significant benefits to SNNs. 1. It can effectively optimize latency dynamically at runtime, setting it apart from many existing low-latency SNN algorithms. Our experiments on CIFAR-10 and ImageNet datasets have demonstrated an average 40% speedup across eight different settings after applying Dynamic Confidence. 2. The decision-making agent in Dynamic Confidence is straightforward to construct and highly robust in parameter space, making it extremely easy to implement. 3. The proposed method enables visualizing the potential of any given SNN, which sets a target for current SNNs to approach. For instance, if an SNN can terminate at the most appropriate time point for each input sample, a ResNet-50 SNN can achieve an accuracy as high as 82.47% on ImageNet within just 4.71 time steps on average. Unlocking the potential of SNNs needs a highly-reliable decision-making agent to be constructed and fed with a high-quality estimation of ground truth. In this regard, Dynamic Confidence represents a meaningful step toward realizing the potential of SNNs.
Submitted 28 November, 2023; v1 submitted 17 March, 2023;
originally announced March 2023.
-
2022 Roadmap on Neuromorphic Computing and Engineering
Authors:
Dennis V. Christensen,
Regina Dittmann,
Bernabé Linares-Barranco,
Abu Sebastian,
Manuel Le Gallo,
Andrea Redaelli,
Stefan Slesazeck,
Thomas Mikolajick,
Sabina Spiga,
Stephan Menzel,
Ilia Valov,
Gianluca Milano,
Carlo Ricciardi,
Shi-Jun Liang,
Feng Miao,
Mario Lanza,
Tyler J. Quill,
Scott T. Keene,
Alberto Salleo,
Julie Grollier,
Danijela Marković,
Alice Mizrahi,
Peng Yao,
J. Joshua Yang,
Giacomo Indiveri
, et al. (34 additional authors not shown)
Abstract:
Modern computation based on the von Neumann architecture is today a mature cutting-edge science. In the von Neumann architecture, processing and memory units are implemented as separate blocks interchanging data intensively and continuously. This data transfer is responsible for a large part of the power consumption. The next generation computer technology is expected to solve problems at the exascale with $10^{18}$ calculations each second. Even though these future computers will be incredibly powerful, if they are based on von Neumann type architectures, they will consume between 20 and 30 megawatts of power and will not have intrinsic physically built-in capabilities to learn or deal with complex data as our brain does. These needs can be addressed by neuromorphic computing systems which are inspired by the biological concepts of the human brain. This new generation of computers has the potential to be used for the storage and processing of large amounts of digital information with much lower power consumption than conventional processors. Among their potential future applications, an important niche is moving the control from data centers to edge devices.
The aim of this Roadmap is to present a snapshot of the present state of neuromorphic technology and provide an opinion on the challenges and opportunities that the future holds in the major areas of neuromorphic technology, namely materials, devices, neuromorphic circuits, neuromorphic algorithms, applications, and ethics. The Roadmap is a collection of perspectives where leading researchers in the neuromorphic community provide their own view about the current state and the future challenges. We hope that this Roadmap will be a useful resource to readers outside this field, for those who are just entering the field, and for those who are well established in the neuromorphic community.
https://doi.org/10.1088/2634-4386/ac4a83
Submitted 13 January, 2022; v1 submitted 12 May, 2021;
originally announced May 2021.
-
The SpiNNaker 2 Processing Element Architecture for Hybrid Digital Neuromorphic Computing
Authors:
Sebastian Höppner,
Yexin Yan,
Andreas Dixius,
Stefan Scholze,
Johannes Partzsch,
Marco Stolba,
Florian Kelber,
Bernhard Vogginger,
Felix Neumärker,
Georg Ellguth,
Stephan Hartmann,
Stefan Schiefer,
Thomas Hocker,
Dennis Walter,
Genting Liu,
Jim Garside,
Steve Furber,
Christian Mayr
Abstract:
This paper introduces the processing element architecture of the second generation SpiNNaker chip, implemented in 22nm FDSOI. On circuit level, the chip features adaptive body biasing for near-threshold operation, and dynamic voltage-and-frequency scaling driven by spiking activity. On system level, processing is centered around an ARM M4 core, similar to the processor-centric architecture of the first generation SpiNNaker. To speed operation of subtasks, we have added accelerators for numerical operations of both spiking (SNN) and rate based (deep) neural networks (DNN). PEs communicate via a dedicated, custom-designed network-on-chip. We present three benchmarks showing operation of the whole processor element on SNN, DNN and hybrid SNN/DNN networks.
Submitted 15 August, 2022; v1 submitted 15 March, 2021;
originally announced March 2021.
-
Low-Power Low-Latency Keyword Spotting and Adaptive Control with a SpiNNaker 2 Prototype and Comparison with Loihi
Authors:
Yexin Yan,
Terrence C. Stewart,
Xuan Choo,
Bernhard Vogginger,
Johannes Partzsch,
Sebastian Hoeppner,
Florian Kelber,
Chris Eliasmith,
Steve Furber,
Christian Mayr
Abstract:
We implemented two neural network based benchmark tasks on a prototype chip of the second-generation SpiNNaker (SpiNNaker 2) neuromorphic system: keyword spotting and adaptive robotic control. Keyword spotting is commonly used in smart speakers to listen for wake words, and adaptive control is used in robotic applications to adapt to unknown dynamics in an online fashion. We highlight the benefit of a multiply accumulate (MAC) array in the SpiNNaker 2 prototype which is ordinarily used in rate-based machine learning networks when employed in a neuromorphic, spiking context. In addition, the same benchmark tasks have been implemented on the Loihi neuromorphic chip, giving a side-by-side comparison regarding power consumption and computation time. While Loihi shows better efficiency when less complicated vector-matrix multiplication is involved, with the MAC array, the SpiNNaker 2 prototype shows better efficiency when high dimensional vector-matrix multiplication is involved.
Submitted 18 September, 2020;
originally announced September 2020.
-
Spiking Associative Memory for Spatio-Temporal Patterns
Authors:
Simon Davidson,
Stephen B. Furber,
Oliver Rhodes
Abstract:
Spike Timing Dependent Plasticity is a form of learning that has been demonstrated in real cortical tissue, but attempts to use it for artificial systems have not produced good results. This paper seeks to remedy this with two significant advances. The first is the development of a simple stochastic learning rule called cyclic STDP that can extract patterns encoded in the precise spiking times of a group of neurons. We show that a population of neurons endowed with this learning rule can act as an effective short-term associative memory, storing and reliably recalling a large set of pattern associations over an extended period of time.
The second major theme examines the challenges associated with training a neuron to produce a spike at a precise time and for the fidelity of spike recall time to be maintained as further learning occurs. The strong constraint of working with precisely-timed spikes (so-called temporal coding) is mandated by the learning rule but is also consistent with the belief in the necessity of such an encoding scheme to render a spiking neural network a competitive solution for flexible intelligent systems in continuous learning environments.
The encoding and learning rules are demonstrated in the design of a single-layer associative memory (an input layer consisting of 3,200 spiking neurons fully-connected to a similar sized population of memory neurons), which we simulate and characterise. Design considerations and clarification of the role of parameters under the control of the designer are explored.
Submitted 30 June, 2020;
originally announced June 2020.
-
ATIS + SpiNNaker: a Fully Event-based Visual Tracking Demonstration
Authors:
Arren Glover,
Alan B. Stokes,
Steve Furber,
Chiara Bartolozzi
Abstract:
The Asynchronous Time-based Image Sensor (ATIS) and the Spiking Neural Network Architecture (SpiNNaker) are both neuromorphic technologies that "unconventionally" use binary spikes to represent information. The ATIS produces spikes to represent the change in light falling on the sensor, and the SpiNNaker is a massively parallel computing platform that asynchronously sends spikes between cores for processing. In this demonstration we show these two pieces of hardware used together to perform a visual tracking task. We aim to show the hardware and software architecture that integrates the ATIS and SpiNNaker in a robot middleware that makes processing agnostic to the platform (CPU or SpiNNaker). We also aim to describe the algorithm and why it is suitable for the "unconventional" sensor and processing platform, including the advantages as well as the challenges faced.
Submitted 3 December, 2019;
originally announced December 2019.
-
SpiNNaker 2: A 10 Million Core Processor System for Brain Simulation and Machine Learning
Authors:
Christian Mayr,
Sebastian Hoeppner,
Steve Furber
Abstract:
SpiNNaker is an ARM-based processor platform optimized for the simulation of spiking neural networks. This brief describes the roadmap in going from the current SpiNNaker1 system, a 1 Million core machine in 130nm CMOS, to SpiNNaker2, a 10 Million core machine in 22nm FDSOI. Apart from pure scaling, we will take advantage of specific technology features, such as runtime adaptive body biasing, to deliver cutting-edge power consumption. Power management of the cores allows a wide range of workload adaptivity, i.e. processor power scales with the complexity and activity of the spiking network. Additional numerical accelerators will enhance the utility of SpiNNaker2 for simulation of spiking neural networks as well as for executing conventional deep neural networks. These measures should increase the simulation capacity of the machine by a factor $>$50. The interplay between the two domains, i.e. spiking and rate based, will provide an interesting field for algorithm exploration on SpiNNaker2. Apart from the platform's traditional usage as a neuroscience exploration tool, the extended functionality opens up new application areas such as automotive AI, tactile internet, industry 4.0 and biomedical processing.
Submitted 6 November, 2019;
originally announced November 2019.
-
Real-Time Cortical Simulation on Neuromorphic Hardware
Authors:
Oliver Rhodes,
Luca Peres,
Andrew G. D. Rowley,
Andrew Gait,
Luis A. Plana,
Christian Brenninkmeijer,
Steve B. Furber
Abstract:
Real-time simulation of a large-scale biologically representative spiking neural network is presented, through the use of a heterogeneous parallelisation scheme and SpiNNaker neuromorphic hardware. A published cortical microcircuit model is used as a benchmark test case, representing approx. 1 square mm of early sensory cortex, containing 77k neurons and 0.3 billion synapses. This is the first true real-time simulation of this model, with 10 s of biological simulation time executed in 10 s wall-clock time. This surpasses best published efforts on HPC neural simulators (3x slowdown) and GPUs running optimised SNN libraries (2x slowdown). Furthermore, the presented approach indicates that real-time processing can be maintained with increasing SNN size, breaking the communication barrier incurred by traditional computing machinery. Model results are compared to an established HPC simulator baseline to verify simulation correctness, comparing well across a range of statistical measures. Energy to solution, and energy per synaptic event are also reported, demonstrating that the relatively low-tech SpiNNaker processors achieve a 10x reduction in energy relative to modern HPC systems, and comparable energy consumption to modern GPUs. Finally, system robustness is demonstrated through multiple 12 h simulations of the cortical microcircuit, each simulating 12 h of biological time, and demonstrating the potential of neuromorphic hardware as a neuroscience research tool for studying complex spiking neural networks over extended time periods.
Submitted 18 September, 2019;
originally announced September 2019.
-
Stochastic rounding and reduced-precision fixed-point arithmetic for solving neural ordinary differential equations
Authors:
Michael Hopkins,
Mantas Mikaitis,
Dave R. Lester,
Steve Furber
Abstract:
Although double-precision floating-point arithmetic currently dominates high-performance computing, there is increasing interest in smaller and simpler arithmetic types. The main reasons are potential improvements in energy efficiency and memory footprint and bandwidth. However, simply switching to lower-precision types typically results in increased numerical errors. We investigate approaches to improving the accuracy of reduced-precision fixed-point arithmetic types, using examples in an important domain for numerical computation in neuroscience: the solution of Ordinary Differential Equations (ODEs). The Izhikevich neuron model is used to demonstrate that rounding has an important role in producing accurate spike timings from explicit ODE solution algorithms. In particular, fixed-point arithmetic with stochastic rounding consistently results in smaller errors compared to single precision floating-point and fixed-point arithmetic with round-to-nearest across a range of neuron behaviours and ODE solvers. A computationally much cheaper alternative is also investigated, inspired by the concept of dither that is a widely understood mechanism for providing resolution below the least significant bit (LSB) in digital signal processing. These results will have implications for the solution of ODEs in other subject areas, and should also be directly relevant to the huge range of practical problems that are represented by Partial Differential Equations (PDEs).
Submitted 22 January, 2020; v1 submitted 25 April, 2019;
originally announced April 2019.
-
Dynamic Power Management for Neuromorphic Many-Core Systems
Authors:
Sebastian Hoeppner,
Bernhard Vogginger,
Yexin Yan,
Andreas Dixius,
Stefan Scholze,
Johannes Partzsch,
Felix Neumaerker,
Stephan Hartmann,
Stefan Schiefer,
Georg Ellguth,
Love Cederstroem,
Luis Plana,
Jim Garside,
Steve Furber,
Christian Mayr
Abstract:
This work presents a dynamic power management architecture for neuromorphic many core systems such as SpiNNaker. A fast dynamic voltage and frequency scaling (DVFS) technique is presented which allows the processing elements (PE) to change their supply voltage and clock frequency individually and autonomously within less than 100 ns. This is employed by the neuromorphic simulation software flow, which defines the performance level (PL) of the PE based on the actual workload within each simulation cycle. A test chip in 28 nm SLP CMOS technology has been implemented. It includes 4 PEs which can be scaled from 0.7 V to 1.0 V with frequencies from 125 MHz to 500 MHz at three distinct PLs. By measurement of three neuromorphic benchmarks it is shown that the total PE power consumption can be reduced by 75%, with 80% baseline power reduction and a 50% reduction of energy per neuron and synapse computation, all while maintaining temporary peak system performance to achieve biological real-time operation of the system. A numerical model of this power management model is derived which allows DVFS architecture exploration for neuromorphics. The proposed technique is to be used for the second generation SpiNNaker neuromorphic many core system.
Submitted 21 March, 2019;
originally announced March 2019.
-
Efficient Reward-Based Structural Plasticity on a SpiNNaker 2 Prototype
Authors:
Yexin Yan,
David Kappel,
Felix Neumaerker,
Johannes Partzsch,
Bernhard Vogginger,
Sebastian Hoeppner,
Steve Furber,
Wolfgang Maass,
Robert Legenstein,
Christian Mayr
Abstract:
Advances in neuroscience uncover the mechanisms employed by the brain to efficiently solve complex learning tasks with very limited resources. However, the efficiency is often lost when one tries to port these findings to a silicon substrate, since brain-inspired algorithms often make extensive use of complex functions, such as random number generators, that are expensive to compute on standard general purpose hardware. The prototype chip of the 2nd generation SpiNNaker system is designed to overcome this problem. Low-power ARM processors equipped with a random number generator and an exponential function accelerator enable the efficient execution of brain-inspired algorithms. We implement the recently introduced reward-based synaptic sampling model that employs structural plasticity to learn a function or task. The numerical simulation of the model requires updating the synapse variables in each time step, including an explorative random term. To the best of our knowledge, this is the most complex synapse model implemented so far on the SpiNNaker system. By making efficient use of the hardware accelerators and numerical optimizations the computation time of one plasticity update is reduced by a factor of 2. This, combined with fitting the model into the local SRAM, leads to 62% energy reduction compared to the case without accelerators and the use of external DRAM. The model implementation is integrated into the SpiNNaker software framework allowing for scalability onto larger systems. The hardware-software system presented in this work paves the way for power-efficient mobile and biomedical applications with biologically plausible brain-inspired algorithms.
Submitted 20 March, 2019;
originally announced March 2019.
-
SpiNNTools: The Execution Engine for the SpiNNaker Platform
Authors:
Andrew G. D. Rowley,
Christian Brenninkmeijer,
Simon Davidson,
Donal Fellows,
Andrew Gait,
David R. Lester,
Luis A. Plana,
Oliver Rhodes,
Alan B. Stokes,
Steve B. Furber
Abstract:
Distributed systems are becoming more commonplace, as computers typically contain multiple computation processors. The SpiNNaker architecture is such a distributed architecture, containing millions of cores connected with a unique communication network, making it one of the largest neuromorphic computing platforms in the world. Utilising these processors efficiently usually requires expert knowledge of the architecture to generate executable code. This work introduces a set of tools (SpiNNTools) that can map computational work described as a graph into executable code that runs on this novel machine. The SpiNNaker architecture is highly scalable which in turn produces unique challenges in loading data, executing the mapped problem and the retrieval of data. In this paper we describe these challenges in detail and the solutions implemented.
Submitted 16 October, 2018;
originally announced October 2018.
-
SLAMBench2: Multi-Objective Head-to-Head Benchmarking for Visual SLAM
Authors:
Bruno Bodin,
Harry Wagstaff,
Sajad Saeedi,
Luigi Nardi,
Emanuele Vespa,
John H Mayer,
Andy Nisbet,
Mikel Luján,
Steve Furber,
Andrew J Davison,
Paul H. J. Kelly,
Michael O'Boyle
Abstract:
SLAM is becoming a key component of robotics and augmented reality (AR) systems. While a large number of SLAM algorithms have been presented, there has been little effort to unify the interface of such algorithms, or to perform a holistic comparison of their capabilities. This is a problem since different SLAM applications can have different functional and non-functional requirements. For example, a mobile phone-based AR application has a tight energy budget, while a UAV navigation system usually requires high accuracy. SLAMBench2 is a benchmarking framework to evaluate existing and future SLAM systems, both open and closed source, over an extensible list of datasets, while using a comparable and clearly specified list of performance metrics. A wide variety of existing SLAM algorithms and datasets is supported, e.g. ElasticFusion, InfiniTAM, ORB-SLAM2, OKVIS, and integrating new ones is straightforward and clearly specified by the framework. SLAMBench2 is a publicly-available software framework which represents a starting point for quantitative, comparable and validatable experimental research to investigate trade-offs across SLAM systems.
Submitted 21 August, 2018;
originally announced August 2018.
-
Navigating the Landscape for Real-time Localisation and Mapping for Robotics and Virtual and Augmented Reality
Authors:
Sajad Saeedi,
Bruno Bodin,
Harry Wagstaff,
Andy Nisbet,
Luigi Nardi,
John Mawer,
Nicolas Melot,
Oscar Palomar,
Emanuele Vespa,
Tom Spink,
Cosmin Gorgovan,
Andrew Webb,
James Clarkson,
Erik Tomusk,
Thomas Debrunner,
Kuba Kaszyk,
Pablo Gonzalez-de-Aledo,
Andrey Rodchenko,
Graham Riley,
Christos Kotselidis,
Björn Franke,
Michael F. P. O'Boyle,
Andrew J. Davison,
Paul H. J. Kelly,
Mikel Luján
, et al. (1 additional authors not shown)
Abstract:
Visual understanding of 3D environments in real-time, at low power, is a huge computational challenge. Often referred to as SLAM (Simultaneous Localisation and Mapping), it is central to applications spanning domestic and industrial robotics, autonomous vehicles, virtual and augmented reality. This paper describes the results of a major research effort to assemble the algorithms, architectures, tools, and systems software needed to enable delivery of SLAM, by supporting applications specialists in selecting and configuring the appropriate algorithm and the appropriate hardware, and compilation pathway, to meet their performance, accuracy, and energy consumption goals. The major contributions we present are (1) tools and methodology for systematic quantitative evaluation of SLAM algorithms, (2) automated, machine-learning-guided exploration of the algorithmic and implementation design space with respect to multiple objectives, (3) end-to-end simulation tools to enable optimisation of heterogeneous, accelerated architectures for the specific algorithmic requirements of the various SLAM algorithmic approaches, and (4) tools for delivering, where appropriate, accelerated, adaptive SLAM solutions in a managed, JIT-compiled, adaptive runtime context.
Submitted 20 August, 2018;
originally announced August 2018.
-
Noisy Softplus: an activation function that enables SNNs to be trained as ANNs
Authors:
Qian Liu,
Yunhua Chen,
Steve Furber
Abstract:
We extended the work on the proposed activation function, Noisy Softplus, to fit into the training of layered spiking neural networks (SNNs). Thus, any ANN employing Noisy Softplus neurons, even one with a deep architecture, can be trained simply by a traditional algorithm, for example Back Propagation (BP), and the trained weights can be directly used in the spiking version of the same network without any conversion. Furthermore, the training method can be generalised to other activation units, for instance Rectified Linear Units (ReLU), to train deep SNNs off-line. This research is crucial to provide an effective approach for SNN training, to increase the classification accuracy of SNNs with biological characteristics, and to close the gap between the performance of SNNs and ANNs.
Submitted 31 March, 2017;
originally announced June 2017.
-
Application-aware Retiming of Accelerators: A High-level Data-driven Approach
Authors:
Ana Lava,
Mahdi Jelodari Mamaghani,
Siamak Mohammadi,
Steve Furber
Abstract:
Flexibility at the hardware level is the main driving force behind adaptive systems whose aim is to realise microarchitecture deconfiguration 'online'. This feature allows the software/hardware stack to tolerate drastic changes of the workload in data centres. With the emergence of FPGA reconfigurability, this technology is becoming a mainstream computing paradigm. Adaptivity is usually accompanied by high-level tools to facilitate multi-dimensional space exploration. An essential aspect of this space is memory orchestration, where on-chip and off-chip memory distribution significantly influences the architecture in coping with critical spatial and timing constraints, e.g. Place and Route. This paper proposes a memory-smart technique for a particular class of adaptive systems: Elastic Circuits, which enjoy slack elasticity at a fine level of granularity. We explore retiming of a set of popular benchmarks by investigating the memory distribution within and among accelerators. The area, performance and power patterns are adopted by our high-level synthesis framework, with respect to the behaviour of the input descriptions, to improve the quality of the synthesised elastic circuits.
Submitted 24 December, 2016;
originally announced December 2016.
-
Introducing SLAMBench, a performance and accuracy benchmarking methodology for SLAM
Authors:
Luigi Nardi,
Bruno Bodin,
M. Zeeshan Zia,
John Mawer,
Andy Nisbet,
Paul H. J. Kelly,
Andrew J. Davison,
Mikel Luján,
Michael F. P. O'Boyle,
Graham Riley,
Nigel Topham,
Steve Furber
Abstract:
Real-time dense computer vision and SLAM offer great potential for a new level of scene modelling, tracking and real environmental interaction for many types of robot, but their high computational requirements mean that use on mass market embedded platforms is challenging. Meanwhile, trends in low-cost, low-power processing are towards massive parallelism and heterogeneity, making it difficult for robotics and vision researchers to implement their algorithms in a performance-portable way. In this paper we introduce SLAMBench, a publicly-available software framework which represents a starting point for quantitative, comparable and validatable experimental research to investigate trade-offs in performance, accuracy and energy consumption of a dense RGB-D SLAM system. SLAMBench provides a KinectFusion implementation in C++, OpenMP, OpenCL and CUDA, and harnesses the ICL-NUIM dataset of synthetic RGB-D sequences with trajectory and scene ground truth for reliable accuracy comparison of different implementations and algorithms. We present an analysis and breakdown of the constituent algorithmic elements of KinectFusion, and experimentally investigate their execution time on a variety of multicore and GPU-accelerated platforms. For a popular embedded platform, we also present an analysis of energy efficiency for different configuration alternatives.
Submitted 26 February, 2015; v1 submitted 8 October, 2014;
originally announced October 2014.
-
An associative memory for the on-line recognition and prediction of temporal sequences
Authors:
J. Bose,
S. B. Furber,
J. L. Shapiro
Abstract:
This paper presents the design of an associative memory with feedback that is capable of on-line temporal sequence learning. A framework for on-line sequence learning has been proposed, and different sequence learning models have been analysed according to this framework. The network model is an associative memory with a separate store for the sequence context of a symbol. A sparse distributed memory is used to gain scalability. The context store combines the functionality of a neural layer with a shift register. The sensitivity of the machine to the sequence context is controllable, resulting in different characteristic behaviours. The model can store and predict on-line sequences of various types and length. Numerical simulations on the model have been carried out to determine its properties.
Submitted 4 November, 2006;
originally announced November 2006.