Search | arXiv e-print repository

Neuromorphic Co-Design as a Game

Authors: Craig M. Vineyard, William M. Severa, James B. Aimone

Abstract: Co-design is a prominent topic presently in computing, speaking to the mutual benefit of coordinating design choices of several layers in the technology stack. For example, this may be designing algorithms which can most efficiently take advantage of the acceleration properties of a given architecture, while simultaneously designing the hardware to support the structural needs of a class of comput… ▽ More Co-design is a prominent topic presently in computing, speaking to the mutual benefit of coordinating design choices of several layers in the technology stack. For example, this may be designing algorithms which can most efficiently take advantage of the acceleration properties of a given architecture, while simultaneously designing the hardware to support the structural needs of a class of computation. The implications of these design decisions are influential enough to be deemed a lottery, enabling an idea to win out over others irrespective of the individual merits. Coordination is a well studied topic in the mathematics of game theory, where in many cases without a coordination mechanism the outcome is sub-optimal. Here we consider what insights game theoretic analysis can offer for computer architecture co-design. In particular, we consider the interplay between algorithm and architecture advances in the field of neuromorphic computing. Analyzing developments of spiking neural network algorithms and neuromorphic hardware as a co-design game we use the Stag Hunt model to illustrate challenges for spiking algorithms or architectures to advance the field independently and advocate for a strategic pursuit to advance neuromorphic computing. △ Less

Submitted 11 December, 2023; originally announced December 2023.

Comments: 8 pages, 2 figures, accepted to First Workshop on Machine Learning with New Compute Paradigms at NeurIPS 2023

Report number: SAND2023-14477C

arXiv:2304.04640 [pdf, other]

NeuroBench: A Framework for Benchmarking Neuromorphic Computing Algorithms and Systems

Authors: Jason Yik, Korneel Van den Berghe, Douwe den Blanken, Younes Bouhadjar, Maxime Fabre, Paul Hueber, Denis Kleyko, Noah Pacik-Nelson, Pao-Sheng Vincent Sun, Guangzhi Tang, Shenqi Wang, Biyan Zhou, Soikat Hasan Ahmed, George Vathakkattil Joseph, Benedetto Leto, Aurora Micheli, Anurag Kumar Mishra, Gregor Lenz, Tao Sun, Zergham Ahmed, Mahmoud Akl, Brian Anderson, Andreas G. Andreou, Chiara Bartolozzi, Arindam Basu , et al. (73 additional authors not shown)

Abstract: Neuromorphic computing shows promise for advancing computing efficiency and capabilities of AI applications using brain-inspired principles. However, the neuromorphic research field currently lacks standardized benchmarks, making it difficult to accurately measure technological advancements, compare performance with conventional methods, and identify promising future research directions. Prior neu… ▽ More Neuromorphic computing shows promise for advancing computing efficiency and capabilities of AI applications using brain-inspired principles. However, the neuromorphic research field currently lacks standardized benchmarks, making it difficult to accurately measure technological advancements, compare performance with conventional methods, and identify promising future research directions. Prior neuromorphic computing benchmark efforts have not seen widespread adoption due to a lack of inclusive, actionable, and iterative benchmark design and guidelines. To address these shortcomings, we present NeuroBench: a benchmark framework for neuromorphic computing algorithms and systems. NeuroBench is a collaboratively-designed effort from an open community of nearly 100 co-authors across over 50 institutions in industry and academia, aiming to provide a representative structure for standardizing the evaluation of neuromorphic approaches. The NeuroBench framework introduces a common set of tools and systematic methodology for inclusive benchmark measurement, delivering an objective reference framework for quantifying neuromorphic approaches in both hardware-independent (algorithm track) and hardware-dependent (system track) settings. In this article, we present initial performance baselines across various model architectures on the algorithm track and outline the system track benchmark tasks and guidelines. NeuroBench is intended to continually expand its benchmarks and features to foster and track the progress made by the research community. △ Less

Submitted 17 January, 2024; v1 submitted 10 April, 2023; originally announced April 2023.

Comments: Updated from whitepaper to full perspective article preprint

arXiv:2203.12662 [pdf, other]

Spiking Neural Streaming Binary Arithmetic

Authors: James B. Aimone, Aaron J. Hill, William M. Severa, Craig M. Vineyard

Abstract: Boolean functions and binary arithmetic operations are central to standard computing paradigms. Accordingly, many advances in computing have focused upon how to make these operations more efficient as well as exploring what they can compute. To best leverage the advantages of novel computing paradigms it is important to consider what unique computing approaches they offer. However, for any special… ▽ More Boolean functions and binary arithmetic operations are central to standard computing paradigms. Accordingly, many advances in computing have focused upon how to make these operations more efficient as well as exploring what they can compute. To best leverage the advantages of novel computing paradigms it is important to consider what unique computing approaches they offer. However, for any special-purpose co-processor, Boolean functions and binary arithmetic operations are useful for, among other things, avoiding unnecessary I/O on-and-off the co-processor by pre- and post-processing data on-device. This is especially true for spiking neuromorphic architectures where these basic operations are not fundamental low-level operations. Instead, these functions require specific implementation. Here we discuss the implications of an advantageous streaming binary encoding method as well as a handful of circuits designed to exactly compute elementary Boolean and binary operations. △ Less

Submitted 23 March, 2022; originally announced March 2022.

Comments: Accepted and presented at the 2021 International Conference on Rebooting Computing (ICRC)

Report number: SAND2021-13472 C

arXiv:2106.08435 [pdf]

Co-Design of Free-Space Metasurface Optical Neuromorphic Classifiers for High Performance

Authors: François Léonard, Adam S. Backer, Elliot J. Fuller, Corinne Teeter, Craig. M. Vineyard

Abstract: Classification of features in a scene typically requires conversion of the incoming photonic field into the electronic domain. Recently, an alternative approach has emerged whereby passive structured materials can perform classification tasks by directly using free-space propagation and diffraction of light. In this manuscript, we present a theoretical and computational study of such systems and e… ▽ More Classification of features in a scene typically requires conversion of the incoming photonic field into the electronic domain. Recently, an alternative approach has emerged whereby passive structured materials can perform classification tasks by directly using free-space propagation and diffraction of light. In this manuscript, we present a theoretical and computational study of such systems and establish the basic features that govern their performance. We show that system architecture, material structure, and input light field are intertwined and need to be co-designed to maximize classification accuracy. Our simulations show that a single layer metasurface can achieve classification accuracy better than conventional linear classifiers, with an order of magnitude fewer diffractive features than previously reported. For a wavelength λ, single layer metasurfaces of size with aperture density achieve ~96% testing accuracy on the MNIST dataset, for an optimized distance ~ to the output plane. This is enabled by an intrinsic nonlinearity in photodetection, despite the use of linear optical metamaterials. Furthermore, we find that once the system is optimized, the number of diffractive features is the main determinant of classification performance. The slow asymptotic scaling with the number of apertures suggests a reason why such systems may benefit from multiple layer designs. Finally, we show a trade-off between the number of apertures and fabrication noise. △ Less

Submitted 15 June, 2021; originally announced June 2021.

Comments: 32 pages, 11 figures (main text and supporting information). To appear in ACS Photonics

arXiv:1911.05704 [pdf, other]

RAPDARTS: Resource-Aware Progressive Differentiable Architecture Search

Authors: Sam Green, Craig M. Vineyard, Ryan Helinski, Çetin Kaya Koç

Abstract: Early neural network architectures were designed by so-called "grad student descent". Since then, the field of Neural Architecture Search (NAS) has developed with the goal of algorithmically designing architectures tailored for a dataset of interest. Recently, gradient-based NAS approaches have been created to rapidly perform the search. Gradient-based approaches impose more structure on the searc… ▽ More Early neural network architectures were designed by so-called "grad student descent". Since then, the field of Neural Architecture Search (NAS) has developed with the goal of algorithmically designing architectures tailored for a dataset of interest. Recently, gradient-based NAS approaches have been created to rapidly perform the search. Gradient-based approaches impose more structure on the search, compared to alternative NAS methods, enabling faster search phase optimization. In the real-world, neural architecture performance is measured by more than just high accuracy. There is increasing need for efficient neural architectures, where resources such as model size or latency must also be considered. Gradient-based NAS is also suitable for such multi-objective optimization. In this work we extend a popular gradient-based NAS method to support one or more resource costs. We then perform in-depth analysis on the discovery of architectures satisfying single-resource constraints for classification of CIFAR-10. △ Less

Submitted 8 November, 2019; originally announced November 2019.

arXiv:1905.12130 [pdf, other]

Composing Neural Algorithms with Fugu

Authors: James B Aimone, William Severa, Craig M Vineyard

Abstract: Neuromorphic hardware architectures represent a growing family of potential post-Moore's Law Era platforms. Largely due to event-driving processing inspired by the human brain, these computer platforms can offer significant energy benefits compared to traditional von Neumann processors. Unfortunately there still remains considerable difficulty in successfully programming, configuring and deploying… ▽ More Neuromorphic hardware architectures represent a growing family of potential post-Moore's Law Era platforms. Largely due to event-driving processing inspired by the human brain, these computer platforms can offer significant energy benefits compared to traditional von Neumann processors. Unfortunately there still remains considerable difficulty in successfully programming, configuring and deploying neuromorphic systems. We present the Fugu framework as an answer to this need. Rather than necessitating a developer attain intricate knowledge of how to program and exploit spiking neural dynamics to utilize the potential benefits of neuromorphic computing, Fugu is designed to provide a higher level abstraction as a hardware-independent mechanism for linking a variety of scalable spiking neural algorithms from a variety of sources. Individual kernels linked together provide sophisticated processing through compositionality. Fugu is intended to be suitable for a wide-range of neuromorphic applications, including machine learning, scientific computing, and more brain-inspired neural algorithms. Ultimately, we hope the community adopts this and other open standardization attempts allowing for free exchange and easy implementations of the ever-growing list of spiking neural algorithms. △ Less

Submitted 28 May, 2019; originally announced May 2019.

Comments: Accepted to 2019 International Conference on Neuromorphic Systems (ICONS)

arXiv:1901.08128 [pdf, other]

Distillation Strategies for Proximal Policy Optimization

Authors: Sam Green, Craig M. Vineyard, Çetin Kaya Koç

Abstract: Vision-based deep reinforcement learning (RL) typically obtains performance benefit by using high capacity and relatively large convolutional neural networks (CNN). However, a large network leads to higher inference costs (power, latency, silicon area, MAC count). Many inference optimizations have been developed for CNNs. Some optimization techniques offer theoretical efficiency, such as sparsity,… ▽ More Vision-based deep reinforcement learning (RL) typically obtains performance benefit by using high capacity and relatively large convolutional neural networks (CNN). However, a large network leads to higher inference costs (power, latency, silicon area, MAC count). Many inference optimizations have been developed for CNNs. Some optimization techniques offer theoretical efficiency, such as sparsity, but designing actual hardware to support them is difficult. On the other hand, distillation is a simple general-purpose optimization technique which is broadly applicable for transferring knowledge from a trained, high capacity teacher network to an untrained, low capacity student network. DQN distillation extended the original distillation idea to transfer information stored in a high performance, high capacity teacher Q-function trained via the Deep Q-Learning (DQN) algorithm. Our work adapts the DQN distillation work to the actor-critic Proximal Policy Optimization algorithm. PPO is simple to implement and has much higher performance than the seminal DQN algorithm. We show that a distilled PPO student can attain far higher performance compared to a DQN teacher. We also show that a low capacity distilled student is generally able to outperform a low capacity agent that directly trains in the environment. Finally, we show that distillation, followed by "fine-tuning" in the environment, enables the distilled PPO student to achieve parity with teacher performance. In general, the lessons learned in this work should transfer to other modern actor-critic RL algorithms. △ Less

Submitted 30 April, 2019; v1 submitted 23 January, 2019; originally announced January 2019.

arXiv:1810.11521 [pdf, other]

doi 10.1038/s42256-018-0015-y

Whetstone: A Method for Training Deep Artificial Neural Networks for Binary Communication

Authors: William Severa, Craig M. Vineyard, Ryan Dellana, Stephen J. Verzi, James B. Aimone

Abstract: This paper presents a new technique for training networks for low-precision communication. Targeting minimal communication between nodes not only enables the use of emerging spiking neuromorphic platforms, but may additionally streamline processing conventionally. Low-power and embedded neuromorphic processors potentially offer dramatic performance-per-Watt improvements over traditional von Neuman… ▽ More This paper presents a new technique for training networks for low-precision communication. Targeting minimal communication between nodes not only enables the use of emerging spiking neuromorphic platforms, but may additionally streamline processing conventionally. Low-power and embedded neuromorphic processors potentially offer dramatic performance-per-Watt improvements over traditional von Neumann processors, however programming these brain-inspired platforms generally requires platform-specific expertise which limits their applicability. To date, the majority of artificial neural networks have not operated using discrete spike-like communication. We present a method for training deep spiking neural networks using an iterative modification of the backpropagation optimization algorithm. This method, which we call Whetstone, effectively and reliably configures a network for a spiking hardware target with little, if any, loss in performance. Whetstone networks use single time step binary communication and do not require a rate code or other spike-based coding scheme, thus producing networks comparable in timing and size to conventional ANNs, albeit with binarized communication. We demonstrate Whetstone on a number of image classification networks, describing how the sharpening process interacts with different training optimizers and changes the distribution of activity within the network. We further note that Whetstone is compatible with several non-classification neural network applications, such as autoencoders and semantic segmentation. Whetstone is widely extendable and currently implemented using custom activation functions within the Keras wrapper to the popular TensorFlow machine learning framework. △ Less

Submitted 26 October, 2018; originally announced October 2018.

arXiv:1704.08306 [pdf, other]

A Digital Neuromorphic Architecture Efficiently Facilitating Complex Synaptic Response Functions Applied to Liquid State Machines

Authors: Michael R. Smith, Aaron J. Hill, Kristofor D. Carlson, Craig M. Vineyard, Jonathon Donaldson, David R. Follett, Pamela L. Follett, John H. Naegle, Conrad D. James, James B. Aimone

Abstract: Information in neural networks is represented as weighted connections, or synapses, between neurons. This poses a problem as the primary computational bottleneck for neural networks is the vector-matrix multiply when inputs are multiplied by the neural network weights. Conventional processing architectures are not well suited for simulating neural networks, often requiring large amounts of energy… ▽ More Information in neural networks is represented as weighted connections, or synapses, between neurons. This poses a problem as the primary computational bottleneck for neural networks is the vector-matrix multiply when inputs are multiplied by the neural network weights. Conventional processing architectures are not well suited for simulating neural networks, often requiring large amounts of energy and time. Additionally, synapses in biological neural networks are not binary connections, but exhibit a nonlinear response function as neurotransmitters are emitted and diffuse between neurons. Inspired by neuroscience principles, we present a digital neuromorphic architecture, the Spiking Temporal Processing Unit (STPU), capable of modeling arbitrary complex synaptic response functions without requiring additional hardware components. We consider the paradigm of spiking neurons with temporally coded information as opposed to non-spiking rate coded neurons used in most neural networks. In this paradigm we examine liquid state machines applied to speech recognition and show how a liquid state machine with temporal dynamics maps onto the STPU-demonstrating the flexibility and efficiency of the STPU for instantiating neural algorithms. △ Less

Submitted 21 March, 2017; originally announced April 2017.

Comments: 8 pages, 4 Figures, Preprint of 2017 IJCNN

arXiv:1612.03770 [pdf, other]

doi 10.1109/IJCNN.2017.7965898

Neurogenesis Deep Learning

Authors: Timothy J. Draelos, Nadine E. Miner, Christopher C. Lamb, Jonathan A. Cox, Craig M. Vineyard, Kristofor D. Carlson, William M. Severa, Conrad D. James, James B. Aimone

Abstract: Neural machine learning methods, such as deep neural networks (DNN), have achieved remarkable success in a number of complex data processing tasks. These methods have arguably had their strongest impact on tasks such as image and audio processing - data processing domains in which humans have long held clear advantages over conventional algorithms. In contrast to biological neural systems, which a… ▽ More Neural machine learning methods, such as deep neural networks (DNN), have achieved remarkable success in a number of complex data processing tasks. These methods have arguably had their strongest impact on tasks such as image and audio processing - data processing domains in which humans have long held clear advantages over conventional algorithms. In contrast to biological neural systems, which are capable of learning continuously, deep artificial networks have a limited ability for incorporating new information in an already trained network. As a result, methods for continuous learning are potentially highly impactful in enabling the application of deep networks to dynamic data sets. Here, inspired by the process of adult neurogenesis in the hippocampus, we explore the potential for adding new neurons to deep layers of artificial neural networks in order to facilitate their acquisition of novel information while preserving previously trained data representations. Our results on the MNIST handwritten digit dataset and the NIST SD 19 dataset, which includes lower and upper case letters and digits, demonstrate that neurogenesis is well suited for addressing the stability-plasticity dilemma that has long challenged adaptive machine learning algorithms. △ Less

Submitted 28 March, 2017; v1 submitted 12 December, 2016; originally announced December 2016.

Comments: 8 pages, 8 figures, Accepted to 2017 International Joint Conference on Neural Networks (IJCNN 2017)

Report number: SAND2017-2174 C

Showing 1–10 of 10 results for author: Vineyard, C M