Search | arXiv e-print repository

Shape-Dependent Multi-Weight Magnetic Artificial Synapses for Neuromorphic Computing

Authors: Thomas Leonard, Samuel Liu, Mahshid Alamdar, Can Cui, Otitoaleke G. Akinola, Lin Xue, T. Patrick Xiao, Joseph S. Friedman, Matthew J. Marinella, Christopher H. Bennett, Jean Anne C. Incorvia

Abstract: In neuromorphic computing, artificial synapses provide a multi-weight conductance state that is set based on inputs from neurons, analogous to the brain. Additional properties of the synapse beyond multiple weights can be needed, and can depend on the application, requiring the need for generating different synapse behaviors from the same materials. Here, we measure artificial synapses based on magnetic materials that use a magnetic tunnel junction and a magnetic domain wall. By fabricating lithographic notches in a domain wall track underneath a single magnetic tunnel junction, we achieve 4-5 stable resistance states that can be repeatably controlled electrically using spin orbit torque. We analyze the effect of geometry on the synapse behavior, showing that a trapezoidal device has asymmetric weight updates with high controllability, while a straight device has higher stochasticity, but with stable resistance levels. The device data is input into neuromorphic computing simulators to show the usefulness of application-specific synaptic functions. Implementing an artificial neural network applied on streamed Fashion-MNIST data, we show that the trapezoidal magnetic synapse can be used as a metaplastic function for efficient online learning. Implementing a convolutional neural network for CIFAR-100 image recognition, we show that the straight magnetic synapse achieves near-ideal inference accuracy, due to the stability of its resistance levels. This work shows multi-weight magnetic synapses are a feasible technology for neuromorphic computing and provides design guidelines for emerging artificial synapse technologies.

Submitted 17 February, 2022; v1 submitted 22 November, 2021; originally announced November 2021.

Comments: 27 pages 6 figures 1 table

arXiv:2109.01262 [pdf, other]

doi 10.1109/MCAS.2022.3214409

On the Accuracy of Analog Neural Network Inference Accelerators

Authors: T. Patrick Xiao, Ben Feinberg, Christopher H. Bennett, Venkatraman Prabhakar, Prashant Saxena, Vineet Agrawal, Sapan Agarwal, Matthew J. Marinella

Abstract: Specialized accelerators have recently garnered attention as a method to reduce the power consumption of neural network inference. A promising category of accelerators utilizes nonvolatile memory arrays to both store weights and perform $\textit{in situ}$ analog computation inside the array. While prior work has explored the design space of analog accelerators to optimize performance and energy efficiency, there is seldom a rigorous evaluation of the accuracy of these accelerators. This work shows how architectural design decisions, particularly in mapping neural network parameters to analog memory cells, influence inference accuracy. When evaluated using ResNet50 on ImageNet, the resilience of the system to analog non-idealities - cell programming errors, analog-to-digital converter resolution, and array parasitic resistances - all improve when analog quantities in the hardware are made proportional to the weights in the network. Moreover, contrary to the assumptions of prior work, nearly equivalent resilience to cell imprecision can be achieved by fully storing weights as analog quantities, rather than spreading weight bits across multiple devices, often referred to as bit slicing. By exploiting proportionality, analog system designers have the freedom to match the precision of the hardware to the needs of the algorithm, rather than attempting to guarantee the same level of precision in the intermediate results as an equivalent digital accelerator. This ultimately results in an analog accelerator that is more accurate, more robust to analog errors, and more energy-efficient.

Submitted 3 February, 2022; v1 submitted 2 September, 2021; originally announced September 2021.

Comments: Changes in v3: modified definition of state-independent error (factor of 2) for fairer comparison to state-proportional. Added more results on INT4 network

Journal ref: IEEE Circuits and Systems Magazine, vol. 22, no. 4, pp. 26-48, 2022

arXiv:2107.02238 [pdf, other]

High-Speed CMOS-Free Purely Spintronic Asynchronous Recurrent Neural Network

Authors: Pranav O. Mathews, Christian B. Duffee, Abel Thayil, Ty E. Stovall, Christopher H. Bennett, Felipe Garcia-Sanchez, Matthew J. Marinella, Jean Anne C. Incorvia, Naimul Hassan, Xuan Hu, Joseph S. Friedman

Abstract: Neuromorphic computing systems overcome the limitations of traditional von Neumann computing architectures. These computing systems can be further improved upon by using emerging technologies that are more efficient than CMOS for neural computation. Recent research has demonstrated memristors and spintronic devices in various neural network designs boost efficiency and speed. This paper presents a biologically inspired fully spintronic neuron used in a fully spintronic Hopfield RNN. The network is used to solve tasks, and the results are compared against those of current Hopfield neuromorphic architectures which use emerging technologies.

Submitted 30 September, 2022; v1 submitted 5 July, 2021; originally announced July 2021.

arXiv:2101.03095 [pdf]

doi 10.1109/LMAG.2021.3069666

Controllable reset behavior in domain wall-magnetic tunnel junction artificial neurons for task-adaptable computation

Authors: Samuel Liu, Christopher H. Bennett, Joseph S. Friedman, Matthew J. Marinella, David Paydarfar, Jean Anne C. Incorvia

Abstract: Neuromorphic computing with spintronic devices has been of interest due to the limitations of CMOS-driven von Neumann computing. Domain wall-magnetic tunnel junction (DW-MTJ) devices have been shown to be able to intrinsically capture biological neuron behavior. Edgy-relaxed behavior, where a frequently firing neuron experiences a lower action potential threshold, may provide additional artificial neuronal functionality when executing repeated tasks. In this study, we demonstrate that this behavior can be implemented in DW-MTJ artificial neurons via three alternative mechanisms: shape anisotropy, magnetic field, and current-driven soft reset. Using micromagnetics and analytical device modeling to classify the Optdigits handwritten digit dataset, we show that edgy-relaxed behavior improves both classification accuracy and classification rate for ordered datasets while sacrificing little to no accuracy for a randomized dataset. This work establishes methods by which artificial spintronic neurons can be flexibly adapted to datasets.

Submitted 8 January, 2021; originally announced January 2021.

Comments: 5 pages, 5 figures

arXiv:2011.06075 [pdf]

doi 10.1109/TED.2022.3159508

Domain Wall Leaky Integrate-and-Fire Neurons with Shape-Based Configurable Activation Functions

Authors: Wesley H. Brigner, Naimul Hassan, Xuan Hu, Christopher H. Bennett, Felipe Garcia-Sanchez, Can Cui, Alvaro Velasquez, Matthew J. Marinella, Jean Anne C. Incorvia, Joseph S. Friedman

Abstract: Complementary metal oxide semiconductor (CMOS) devices display volatile characteristics, and are not well suited for analog applications such as neuromorphic computing. Spintronic devices, on the other hand, exhibit both non-volatile and analog features, which are well-suited to neuromorphic computing. Consequently, these novel devices are at the forefront of beyond-CMOS artificial intelligence applications. However, a large quantity of these artificial neuromorphic devices still require the use of CMOS, which decreases the efficiency of the system. To resolve this, we have previously proposed a number of artificial neurons and synapses that do not require CMOS for operation. Although these devices are a significant improvement over previous renditions, their ability to enable neural network learning and recognition is limited by their intrinsic activation functions. This work proposes modifications to these spintronic neurons that enable configuration of the activation functions through control of the shape of a magnetic domain wall track. Linear and sigmoidal activation functions are demonstrated in this work, which can be extended through a similar approach to enable a wide variety of activation functions.

Submitted 11 November, 2020; originally announced November 2020.

arXiv:2004.00802 [pdf]

Device-aware inference operations in SONOS nonvolatile memory arrays

Authors: Christopher H. Bennett, T. Patrick Xiao, Ryan Dellana, Vineet Agrawal, Ben Feinberg, Venkatraman Prabhakar, Krishnaswamy Ramkumar, Long Hinh, Swatilekha Saha, Vijay Raghavan, Ramesh Chettuvetty, Sapan Agarwal, Matthew J. Marinella

Abstract: Non-volatile memory arrays can deploy pre-trained neural network models for edge inference. However, these systems are affected by device-level noise and retention issues. Here, we examine damage caused by these effects, introduce a mitigation strategy, and demonstrate its use in fabricated array of SONOS (Silicon-Oxide-Nitride-Oxide-Silicon) devices. On MNIST, fashion-MNIST, and CIFAR-10 tasks, our approach increases resilience to synaptic noise and drift. We also show strong performance can be realized with ADCs of 5-8 bits precision.

Submitted 2 April, 2020; originally announced April 2020.

Comments: To be presented at IEEE International Reliability Physics Symposium (IRPS) 2020

arXiv:2003.11120 [pdf, other]

Unsupervised Competitive Hardware Learning Rule for Spintronic Clustering Architecture

Authors: Alvaro Velasquez, Christopher H. Bennett, Naimul Hassan, Wesley H. Brigner, Otitoaleke G. Akinola, Jean Anne C. Incorvia, Matthew J. Marinella, Joseph S. Friedman

Abstract: We propose a hardware learning rule for unsupervised clustering within a novel spintronic computing architecture. The proposed approach leverages the three-terminal structure of domain-wall magnetic tunnel junction devices to establish a feedback loop that serves to train such devices when they are used as synapses in a neuromorphic computing architecture.

Submitted 24 March, 2020; originally announced March 2020.

arXiv:2003.10396 [pdf, other]

Evaluating complexity and resilience trade-offs in emerging memory inference machines

Authors: Christopher H. Bennett, Ryan Dellana, T. Patrick Xiao, Ben Feinberg, Sapan Agarwal, Suma Cardwell, Matthew J. Marinella, William Severa, Brad Aimone

Abstract: Neuromorphic-style inference only works well if limited hardware resources are maximized properly, e.g. accuracy continues to scale with parameters and complexity in the face of potential disturbance. In this work, we use realistic crossbar simulations to highlight that compact implementations of deep neural networks are unexpectedly susceptible to collapse from multiple system disturbances. Our work proposes a middle path towards high performance and strong resilience utilizing the Mosaics framework, and specifically by re-using synaptic connections in a recurrent neural network implementation that possesses a natural form of noise-immunity.

Submitted 25 February, 2020; originally announced March 2020.

arXiv:2003.02357 [pdf, other]

Plasticity-Enhanced Domain-Wall MTJ Neural Networks for Energy-Efficient Online Learning

Authors: Christopher H. Bennett, T. Patrick Xiao, Can Cui, Naimul Hassan, Otitoaleke G. Akinola, Jean Anne C. Incorvia, Alvaro Velasquez, Joseph S. Friedman, Matthew J. Marinella

Abstract: Machine learning implements backpropagation via abundant training samples. We demonstrate a multi-stage learning system realized by a promising non-volatile memory device, the domain-wall magnetic tunnel junction (DW-MTJ). The system consists of unsupervised (clustering) as well as supervised sub-systems, and generalizes quickly (with few samples). We demonstrate interactions between physical properties of this device and optimal implementation of neuroscience-inspired plasticity learning rules, and highlight performance on a suite of tasks. Our energy analysis confirms the value of the approach, as the learning budget stays below 20 $μJ$ even for large tasks used typically in machine learning.

Submitted 4 March, 2020; originally announced March 2020.

arXiv:2002.00862 [pdf]

CMOS-Free Multilayer Perceptron Enabled by Four-Terminal MTJ Device

Authors: Wesley H. Brigner, Naimul Hassan, Xuan Hu, Christopher H. Bennett, Felipe Garcia-Sanchez, Matthew J. Marinella, Jean Anne C. Incorvia, Joseph S. Friedman

Abstract: Neuromorphic computing promises revolutionary improvements over conventional systems for applications that process unstructured information. To fully realize this potential, neuromorphic systems should exploit the biomimetic behavior of emerging nanodevices. In particular, exceptional opportunities are provided by the non-volatility and analog capabilities of spintronic devices. While spintronic devices have previously been proposed that emulate neurons and synapses, complementary metal-oxide-semiconductor (CMOS) devices are required to implement multilayer spintronic perceptron crossbars. This work therefore proposes a new spintronic neuron that enables purely spintronic multilayer perceptrons, eliminating the need for CMOS circuitry and simplifying fabrication.

Submitted 3 February, 2020; originally announced February 2020.

arXiv:1905.05485 [pdf]

doi 10.1109/TED.2019.2938952

Shape-based Magnetic Domain Wall Drift for an Artificial Spintronic Leaky Integrate-and-Fire Neuron

Authors: Wesley H. Brigner, Naimul Hassan, Lucian Jiang-Wei, Xuan Hu, Diptish Saha, Christopher H. Bennett, Matthew J. Marinella, Jean Anne C. Incorvia, Felipe Garcia-Sanchez, Joseph S. Friedman

Abstract: Spintronic devices based on domain wall (DW) motion through ferromagnetic nanowire tracks have received great interest as components of neuromorphic information processing systems. Previous proposals for spintronic artificial neurons required external stimuli to perform the leaking functionality, one of the three fundamental functions of a leaky integrate-and-fire (LIF) neuron. The use of this external magnetic field or electrical current stimulus results in either a decrease in energy efficiency or an increase in fabrication complexity. In this work, we modify the shape of previously demonstrated three-terminal magnetic tunnel junction neurons to perform the leaking operation without any external stimuli. The trapezoidal structure causes shape-based DW drift, thus intrinsically providing the leaking functionality with no hardware cost. This LIF neuron therefore promises to advance the development of spintronic neural network crossbar arrays.

Submitted 14 May, 2019; originally announced May 2019.

arXiv:1901.10570 [pdf]

Using Floating Gate Memory to Train Ideal Accuracy Neural Networks

Authors: Sapan Agarwal, Diana Garland, John Niroula, Robin B. Jacobs-Gedrim, Alex Hsia, Michael S. Van Heukelom, Elliot Fuller, Bruce Draper, Matthew J. Marinella

Abstract: Floating gate SONOS (Silicon-Oxygen-Nitrogen-Oxygen-Silicon) transistors can be used to train neural networks to ideal accuracies that match those of floating point digital weights on the MNIST dataset when using multiple devices to represent a weight or within 1% of ideal accuracy when using a single device. This is enabled by operating devices in the subthreshold regime, where they exhibit symmetric write nonlinearities. A neural training accelerator core based on SONOS with a single device per weight would increase energy efficiency by 120X, operate 2.1X faster and require 5X lower area than an optimized SRAM based ASIC.

Submitted 27 February, 2019; v1 submitted 29 January, 2019; originally announced January 2019.

arXiv:1707.09952 [pdf]

doi 10.1109/JETCAS.2018.2796379

Multiscale Co-Design Analysis of Energy, Latency, Area, and Accuracy of a ReRAM Analog Neural Training Accelerator

Authors: Matthew J. Marinella, Sapan Agarwal, Alexander Hsia, Isaac Richter, Robin Jacobs-Gedrim, John Niroula, Steven J. Plimpton, Engin Ipek, Conrad D. James

Abstract: Neural networks are an increasingly attractive algorithm for natural language processing and pattern recognition. Deep networks with >50M parameters are made possible by modern GPU clusters operating at <50 pJ per op and more recently, production accelerators capable of <5pJ per operation at the board level. However, with the slowing of CMOS scaling, new paradigms will be required to achieve the next several orders of magnitude in performance per watt gains. Using an analog resistive memory (ReRAM) crossbar to perform key matrix operations in an accelerator is an attractive option. This work presents a detailed design, using a state of the art 14/16 nm PDK, of an analog crossbar circuit block designed to process three key kernels required in training and inference of neural networks. A detailed circuit and device-level analysis of energy, latency, area, and accuracy is given and compared to relevant designs using standard digital ReRAM and SRAM operations. It is shown that the analog accelerator has a 270x energy and 540x latency advantage over a similar block utilizing only digital ReRAM and takes only 11 fJ per multiply and accumulate (MAC). Compared to an SRAM based accelerator, the energy is 430X better and latency is 34X better. Although training accuracy is degraded in the analog accelerator, several options to improve this are presented. The possible gains over a similar digital-only version of this accelerator block suggest that continued optimization of analog resistive memories is valuable. This detailed circuit and device analysis of a training accelerator may serve as a foundation for further architecture-level studies.

Submitted 16 February, 2018; v1 submitted 31 July, 2017; originally announced July 2017.

arXiv:1406.4033 [pdf]

doi 10.1063/1.4895526

Degenerate Resistive Switching and Ultrahigh Density Storage in Resistive Memory

Authors: Andrew J. Lohn, Patrick R. Mickel, Conrad D. James, Matthew J. Marinella

Abstract: We show that, in tantalum oxide resistive memories, activation power provides a multi-level variable for information storage that can be set and read separately from the resistance. These two state variables (resistance and activation power) can be precisely controlled in two steps: (1) the possible activation power states are selected by partially reducing resistance, then (2) a subsequent partial increase in resistance specifies the resistance state and the final activation power state. We show that these states can be precisely written and read electrically, making this approach potentially amenable for ultra-high density memories. We provide a theoretical explanation for information storage and retrieval from activation power and experimentally demonstrate information storage in a third dimension related to the change in activation power with resistance.

Submitted 16 June, 2014; originally announced June 2014.

Showing 1–14 of 14 results for author: Marinella, M J