-
Nanoscale imaging of He-ion irradiation effects on amorphous TaO$_x$ toward electroforming-free neuromorphic functions
Authors:
Olha Popova,
Steven J. Randolph,
Sabine M. Neumayer,
Liangbo Liang,
Benjamin Lawrie,
Olga S. Ovchinnikova,
Robert J. Bondi,
Matthew J. Marinella,
Bobby G. Sumpter,
Petro Maksymovych
Abstract:
Resistive switching in thin films has been widely studied in a broad range of materials. Yet the mechanisms behind electroresistive switching have been persistently difficult to decipher and control, in part due to their non-equilibrium nature. Here, we demonstrate new experimental approaches that can probe resistive switching phenomena, utilizing amorphous TaO$_x$ as a model material system. Specifically, we apply Scanning Microwave Impedance Microscopy (sMIM) and cathodoluminescence (CL) microscopy as direct probes of conductance and electronic structure, respectively. These methods provide direct evidence of the electronic state of TaO$_x$ despite its amorphous nature. For example, CL identifies characteristic impurity levels in TaO$_x$, in agreement with first-principles calculations. We applied these methods to investigate He-ion-beam irradiation as a path to activate conductivity of materials and enable electroforming-free control over resistive switching. However, we find that although He ions begin to modify the nature of bonds even at the lowest doses, the films' conductive properties exhibit remarkable stability with large displacement damage and they are driven to metallic states only at the limit of structural decomposition. Finally, we show that electroforming in a nanoscale junction can be carried out with a dissipated power of < 20 nW, a much smaller value compared to earlier studies and one that minimizes irreversible structural modifications of the films. The multimodal approach described here provides a new framework toward the theory/experiment guided design and optimization of electroresistive materials.
Submitted 20 July, 2023;
originally announced July 2023.
-
Parallel Matrix Multiplication Using Voltage Controlled Magnetic Anisotropy Domain Wall Logic
Authors:
Nicholas Zogbi,
Samuel Liu,
Christopher H. Bennett,
Sapan Agarwal,
Matthew J. Marinella,
Jean Anne C. Incorvia,
T. Patrick Xiao
Abstract:
The domain wall-magnetic tunnel junction (DW-MTJ) is a versatile device that can simultaneously store data and perform computations. These three-terminal devices are promising for digital logic due to their nonvolatility, low-energy operation, and radiation hardness. Here, we augment the DW-MTJ logic gate with voltage controlled magnetic anisotropy (VCMA) to improve the reliability of logical concatenation in the presence of realistic process variations. VCMA creates potential wells that allow for reliable and repeatable localization of domain walls. The DW-MTJ logic gate supports different fanouts, allowing for multiple inputs and outputs for a single device without affecting area. We simulate a systolic array of DW-MTJ Multiply-Accumulate (MAC) units with 4-bit and 8-bit precision, which uses the nonvolatility of DW-MTJ logic gates to enable fine-grained pipelining and high parallelism. The DW-MTJ systolic array provides comparable throughput and efficiency to state-of-the-art CMOS systolic arrays while being radiation-hard. These results improve the feasibility of using domain wall-based processors, especially for extreme-environment applications such as space.
Submitted 14 April, 2023; v1 submitted 26 January, 2023;
originally announced January 2023.
-
Tunable intervalence charge transfer in ruthenium Prussian blue analogue enables stable and efficient biocompatible artificial synapses
Authors:
Donald A. Robinson,
Michael E. Foster,
Christopher H. Bennett,
Austin Bhandarkar,
Elizabeth R. Webster,
Aleyna Celebi,
Nisa Celebi,
Elliot J. Fuller,
Vitalie Stavila,
Catalin D. Spataru,
David S. Ashby,
Matthew J. Marinella,
Raga Krishnakumar,
Mark D. Allendorf,
A. Alec Talin
Abstract:
Emerging concepts for neuromorphic computing, bioelectronics, and brain-computer interfacing inspire new research avenues aimed at understanding the relationship between oxidation state and conductivity in unexplored materials. Here, we present ruthenium Prussian blue analogue (RuPBA), a mixed valence coordination compound with an open framework structure and ability to conduct both ionic and electronic charge, for flexible artificial synapses that reversibly switch conductance by more than four orders of magnitude based on electrochemically tunable oxidation state. Retention of programmed states is improved by nearly two orders of magnitude compared to the extensively studied organic polymers, thus reducing the frequency, complexity and energy costs associated with error correction schemes. We demonstrate dopamine detection using RuPBA synapses and biocompatibility with neuronal cells, evoking prospective application for brain-computer interfacing. By application of electron transfer theory to in-situ spectroscopic probing of intervalence charge transfer, we elucidate a switching mechanism whereby the degree of mixed valency between N-coordinated Ru sites controls the carrier concentration and mobility, as supported by DFT.
Submitted 15 July, 2022;
originally announced July 2022.
-
Shape-Dependent Multi-Weight Magnetic Artificial Synapses for Neuromorphic Computing
Authors:
Thomas Leonard,
Samuel Liu,
Mahshid Alamdar,
Can Cui,
Otitoaleke G. Akinola,
Lin Xue,
T. Patrick Xiao,
Joseph S. Friedman,
Matthew J. Marinella,
Christopher H. Bennett,
Jean Anne C. Incorvia
Abstract:
In neuromorphic computing, artificial synapses provide a multi-weight conductance state that is set based on inputs from neurons, analogous to the brain. Additional properties of the synapse beyond multiple weights can be needed, and can depend on the application, which requires generating different synapse behaviors from the same materials. Here, we measure artificial synapses based on magnetic materials that use a magnetic tunnel junction and a magnetic domain wall. By fabricating lithographic notches in a domain wall track underneath a single magnetic tunnel junction, we achieve 4-5 stable resistance states that can be repeatably controlled electrically using spin orbit torque. We analyze the effect of geometry on the synapse behavior, showing that a trapezoidal device has asymmetric weight updates with high controllability, while a straight device has higher stochasticity, but with stable resistance levels. The device data is input into neuromorphic computing simulators to show the usefulness of application-specific synaptic functions. Implementing an artificial neural network applied to streamed Fashion-MNIST data, we show that the trapezoidal magnetic synapse can be used as a metaplastic function for efficient online learning. Implementing a convolutional neural network for CIFAR-100 image recognition, we show that the straight magnetic synapse achieves near-ideal inference accuracy, due to the stability of its resistance levels. This work shows multi-weight magnetic synapses are a feasible technology for neuromorphic computing and provides design guidelines for emerging artificial synapse technologies.
Submitted 17 February, 2022; v1 submitted 22 November, 2021;
originally announced November 2021.
-
On the Accuracy of Analog Neural Network Inference Accelerators
Authors:
T. Patrick Xiao,
Ben Feinberg,
Christopher H. Bennett,
Venkatraman Prabhakar,
Prashant Saxena,
Vineet Agrawal,
Sapan Agarwal,
Matthew J. Marinella
Abstract:
Specialized accelerators have recently garnered attention as a method to reduce the power consumption of neural network inference. A promising category of accelerators utilizes nonvolatile memory arrays to both store weights and perform $\textit{in situ}$ analog computation inside the array. While prior work has explored the design space of analog accelerators to optimize performance and energy efficiency, there is seldom a rigorous evaluation of the accuracy of these accelerators. This work shows how architectural design decisions, particularly in mapping neural network parameters to analog memory cells, influence inference accuracy. When evaluated using ResNet50 on ImageNet, the resilience of the system to analog non-idealities - cell programming errors, analog-to-digital converter resolution, and array parasitic resistances - improves when analog quantities in the hardware are made proportional to the weights in the network. Moreover, contrary to the assumptions of prior work, nearly equivalent resilience to cell imprecision can be achieved by fully storing weights as analog quantities, rather than spreading weight bits across multiple devices, often referred to as bit slicing. By exploiting proportionality, analog system designers have the freedom to match the precision of the hardware to the needs of the algorithm, rather than attempting to guarantee the same level of precision in the intermediate results as an equivalent digital accelerator. This ultimately results in an analog accelerator that is more accurate, more robust to analog errors, and more energy-efficient.
Submitted 3 February, 2022; v1 submitted 2 September, 2021;
originally announced September 2021.
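The abstract above contrasts two ways of storing a weight: spreading its bits across several devices (bit slicing) and storing it as one analog quantity proportional to the weight. The following is a minimal generic sketch of that distinction only; it is not the paper's mapping scheme, and the function names (`bit_slice`, `recombine`, `proportional_conductance`) and parameters (`bits_per_cell`, `num_cells`, `w_max`, `g_max`) are hypothetical choices made for illustration.

```python
def bit_slice(value, bits_per_cell, num_cells):
    """Split a non-negative integer weight into per-cell slices.

    Slices are returned least significant first. All names here are
    illustrative assumptions, not taken from the paper.
    """
    if value < 0 or value >= 1 << (bits_per_cell * num_cells):
        raise ValueError("value does not fit in the available cells")
    mask = (1 << bits_per_cell) - 1
    return [(value >> (i * bits_per_cell)) & mask for i in range(num_cells)]


def recombine(slices, bits_per_cell):
    """Rebuild the integer weight by shifting and adding the slices."""
    return sum(s << (i * bits_per_cell) for i, s in enumerate(slices))


def proportional_conductance(weight, w_max, g_max):
    """Map a non-negative weight to one conductance proportional to it.

    Assumes a zero minimum conductance for simplicity.
    """
    return g_max * weight / w_max


# An 8-bit weight stored in four 2-bit cells, then recovered.
slices = bit_slice(0b10110110, bits_per_cell=2, num_cells=4)
restored = recombine(slices, bits_per_cell=2)
```

In the sliced form each cell needs only a few distinguishable levels, but an error in a high-order slice is amplified by the shift on recombination. In the proportional form a single device carries the whole weight, so a device error perturbs the weight directly.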
-
High-Speed CMOS-Free Purely Spintronic Asynchronous Recurrent Neural Network
Authors:
Pranav O. Mathews,
Christian B. Duffee,
Abel Thayil,
Ty E. Stovall,
Christopher H. Bennett,
Felipe Garcia-Sanchez,
Matthew J. Marinella,
Jean Anne C. Incorvia,
Naimul Hassan,
Xuan Hu,
Joseph S. Friedman
Abstract:
Neuromorphic computing systems overcome the limitations of traditional von Neumann computing architectures. These computing systems can be further improved upon by using emerging technologies that are more efficient than CMOS for neural computation. Recent research has demonstrated that memristors and spintronic devices in various neural network designs boost efficiency and speed. This paper presents a biologically inspired fully spintronic neuron used in a fully spintronic Hopfield RNN. The network is used to solve tasks, and the results are compared against those of current Hopfield neuromorphic architectures which use emerging technologies.
Submitted 30 September, 2022; v1 submitted 5 July, 2021;
originally announced July 2021.
-
Controllable reset behavior in domain wall-magnetic tunnel junction artificial neurons for task-adaptable computation
Authors:
Samuel Liu,
Christopher H. Bennett,
Joseph S. Friedman,
Matthew J. Marinella,
David Paydarfar,
Jean Anne C. Incorvia
Abstract:
Neuromorphic computing with spintronic devices has been of interest due to the limitations of CMOS-driven von Neumann computing. Domain wall-magnetic tunnel junction (DW-MTJ) devices have been shown to be able to intrinsically capture biological neuron behavior. Edgy-relaxed behavior, where a frequently firing neuron experiences a lower action potential threshold, may provide additional artificial neuronal functionality when executing repeated tasks. In this study, we demonstrate that this behavior can be implemented in DW-MTJ artificial neurons via three alternative mechanisms: shape anisotropy, magnetic field, and current-driven soft reset. Using micromagnetics and analytical device modeling to classify the Optdigits handwritten digit dataset, we show that edgy-relaxed behavior improves both classification accuracy and classification rate for ordered datasets while sacrificing little to no accuracy for a randomized dataset. This work establishes methods by which artificial spintronic neurons can be flexibly adapted to datasets.
Submitted 8 January, 2021;
originally announced January 2021.
-
Domain Wall Leaky Integrate-and-Fire Neurons with Shape-Based Configurable Activation Functions
Authors:
Wesley H. Brigner,
Naimul Hassan,
Xuan Hu,
Christopher H. Bennett,
Felipe Garcia-Sanchez,
Can Cui,
Alvaro Velasquez,
Matthew J. Marinella,
Jean Anne C. Incorvia,
Joseph S. Friedman
Abstract:
Complementary metal oxide semiconductor (CMOS) devices display volatile characteristics, and are not well suited for analog applications such as neuromorphic computing. Spintronic devices, on the other hand, exhibit both non-volatile and analog features, which are well-suited to neuromorphic computing. Consequently, these novel devices are at the forefront of beyond-CMOS artificial intelligence applications. However, a large number of these artificial neuromorphic devices still require the use of CMOS, which decreases the efficiency of the system. To resolve this, we have previously proposed a number of artificial neurons and synapses that do not require CMOS for operation. Although these devices are a significant improvement over previous renditions, their ability to enable neural network learning and recognition is limited by their intrinsic activation functions. This work proposes modifications to these spintronic neurons that enable configuration of the activation functions through control of the shape of a magnetic domain wall track. Linear and sigmoidal activation functions are demonstrated in this work, which can be extended through a similar approach to enable a wide variety of activation functions.
Submitted 11 November, 2020;
originally announced November 2020.
-
Domain Wall-Magnetic Tunnel Junction Spin Orbit Torque Devices and Circuits for In-Memory Computing
Authors:
Mahshid Alamdar,
Thomas Leonard,
Can Cui,
Bishweshwor P. Rimal,
Lin Xue,
Otitoaleke G. Akinola,
T. Patrick Xiao,
Joseph S. Friedman,
Christopher H. Bennett,
Matthew J. Marinella,
Jean Anne C. Incorvia
Abstract:
There are pressing problems with traditional computing, especially for accomplishing data-intensive and real-time tasks, that motivate the development of in-memory computing devices to both store information and perform computation. Magnetic tunnel junction (MTJ) memory elements can be used for computation by manipulating a domain wall (DW), a transition region between magnetic domains. However, these devices have suffered from challenges: spin transfer torque (STT) switching of a DW requires high current, and the multiple etch steps needed to create an MTJ pillar on top of a DW track have led to reduced tunnel magnetoresistance (TMR). These issues have limited experimental study of devices and circuits. Here, we study prototypes of three-terminal domain wall-magnetic tunnel junction (DW-MTJ) in-memory computing devices that can address data processing bottlenecks and resolve these challenges by using perpendicular magnetic anisotropy (PMA), spin-orbit torque (SOT) switching, and an optimized lithography process to produce average device tunnel magnetoresistance TMR = 164%, resistance-area product RA = 31 Ω-μm^2, close to the RA of the unpatterned film, and lower switching current density compared to using spin transfer torque. A two-device circuit shows bit propagation between devices. Device initialization variation in switching voltage is shown to be curtailed to 7% by controlling the DW initial position, which we show corresponds to 96% accuracy in a DW-MTJ full adder simulation. These results make strides in using MTJs and DWs for in-memory and neuromorphic computing applications.
Submitted 26 October, 2020;
originally announced October 2020.
-
Device-aware inference operations in SONOS nonvolatile memory arrays
Authors:
Christopher H. Bennett,
T. Patrick Xiao,
Ryan Dellana,
Vineet Agrawal,
Ben Feinberg,
Venkatraman Prabhakar,
Krishnaswamy Ramkumar,
Long Hinh,
Swatilekha Saha,
Vijay Raghavan,
Ramesh Chettuvetty,
Sapan Agarwal,
Matthew J. Marinella
Abstract:
Non-volatile memory arrays can deploy pre-trained neural network models for edge inference. However, these systems are affected by device-level noise and retention issues. Here, we examine damage caused by these effects, introduce a mitigation strategy, and demonstrate its use in a fabricated array of SONOS (Silicon-Oxide-Nitride-Oxide-Silicon) devices. On MNIST, Fashion-MNIST, and CIFAR-10 tasks, our approach increases resilience to synaptic noise and drift. We also show strong performance can be realized with ADCs of 5-8 bits of precision.
Submitted 2 April, 2020;
originally announced April 2020.
-
Unsupervised Competitive Hardware Learning Rule for Spintronic Clustering Architecture
Authors:
Alvaro Velasquez,
Christopher H. Bennett,
Naimul Hassan,
Wesley H. Brigner,
Otitoaleke G. Akinola,
Jean Anne C. Incorvia,
Matthew J. Marinella,
Joseph S. Friedman
Abstract:
We propose a hardware learning rule for unsupervised clustering within a novel spintronic computing architecture. The proposed approach leverages the three-terminal structure of domain-wall magnetic tunnel junction devices to establish a feedback loop that serves to train such devices when they are used as synapses in a neuromorphic computing architecture.
Submitted 24 March, 2020;
originally announced March 2020.
-
Evaluating complexity and resilience trade-offs in emerging memory inference machines
Authors:
Christopher H. Bennett,
Ryan Dellana,
T. Patrick Xiao,
Ben Feinberg,
Sapan Agarwal,
Suma Cardwell,
Matthew J. Marinella,
William Severa,
Brad Aimone
Abstract:
Neuromorphic-style inference only works well if limited hardware resources are maximized properly, e.g. accuracy continues to scale with parameters and complexity in the face of potential disturbance. In this work, we use realistic crossbar simulations to highlight that compact implementations of deep neural networks are unexpectedly susceptible to collapse from multiple system disturbances. Our work proposes a middle path towards high performance and strong resilience utilizing the Mosaics framework, and specifically by re-using synaptic connections in a recurrent neural network implementation that possesses a natural form of noise-immunity.
Submitted 25 February, 2020;
originally announced March 2020.
-
Plasticity-Enhanced Domain-Wall MTJ Neural Networks for Energy-Efficient Online Learning
Authors:
Christopher H. Bennett,
T. Patrick Xiao,
Can Cui,
Naimul Hassan,
Otitoaleke G. Akinola,
Jean Anne C. Incorvia,
Alvaro Velasquez,
Joseph S. Friedman,
Matthew J. Marinella
Abstract:
Machine learning implements backpropagation via abundant training samples. We demonstrate a multi-stage learning system realized by a promising non-volatile memory device, the domain-wall magnetic tunnel junction (DW-MTJ). The system consists of unsupervised (clustering) as well as supervised sub-systems, and generalizes quickly (with few samples). We demonstrate interactions between physical properties of this device and optimal implementation of neuroscience-inspired plasticity learning rules, and highlight performance on a suite of tasks. Our energy analysis confirms the value of the approach, as the learning budget stays below 20 $μJ$ even for large tasks used typically in machine learning.
Submitted 4 March, 2020;
originally announced March 2020.
-
CMOS-Free Multilayer Perceptron Enabled by Four-Terminal MTJ Device
Authors:
Wesley H. Brigner,
Naimul Hassan,
Xuan Hu,
Christopher H. Bennett,
Felipe Garcia-Sanchez,
Matthew J. Marinella,
Jean Anne C. Incorvia,
Joseph S. Friedman
Abstract:
Neuromorphic computing promises revolutionary improvements over conventional systems for applications that process unstructured information. To fully realize this potential, neuromorphic systems should exploit the biomimetic behavior of emerging nanodevices. In particular, exceptional opportunities are provided by the non-volatility and analog capabilities of spintronic devices. While spintronic devices have previously been proposed that emulate neurons and synapses, complementary metal-oxide-semiconductor (CMOS) devices are required to implement multilayer spintronic perceptron crossbars. This work therefore proposes a new spintronic neuron that enables purely spintronic multilayer perceptrons, eliminating the need for CMOS circuitry and simplifying fabrication.
Submitted 3 February, 2020;
originally announced February 2020.
-
PANTHER: A Programmable Architecture for Neural Network Training Harnessing Energy-efficient ReRAM
Authors:
Aayush Ankit,
Izzat El Hajj,
Sai Rahul Chalamalasetti,
Sapan Agarwal,
Matthew Marinella,
Martin Foltin,
John Paul Strachan,
Dejan Milojicic,
Wen-mei Hwu,
Kaushik Roy
Abstract:
The wide adoption of deep neural networks has been accompanied by ever-increasing energy and performance demands due to the expensive nature of training them. Numerous special-purpose architectures have been proposed to accelerate training: both digital and hybrid digital-analog using resistive RAM (ReRAM) crossbars. ReRAM-based accelerators have demonstrated the effectiveness of ReRAM crossbars at performing matrix-vector multiplication operations that are prevalent in training. However, they still suffer from inefficiency due to the use of serial reads and writes for performing the weight gradient and update step. A few works have demonstrated the possibility of performing outer products in crossbars, which can be used to realize the weight gradient and update step without the use of serial reads and writes. However, these works have been limited to low precision operations which are not sufficient for typical training workloads. Moreover, they have been confined to a limited set of training algorithms for fully-connected layers only. To address these limitations, we propose a bit-slicing technique for enhancing the precision of ReRAM-based outer products, which is substantially different from bit-slicing for matrix-vector multiplication only. We incorporate this technique into a crossbar architecture with three variants catered to different training algorithms. To evaluate our design on different types of layers in neural networks (fully-connected, convolutional, etc.) and training algorithms, we develop PANTHER, an ISA-programmable training accelerator with compiler support. Our evaluation shows that PANTHER achieves up to $8.02\times$, $54.21\times$, and $103\times$ energy reductions as well as $7.16\times$, $4.02\times$, and $16\times$ execution time reductions compared to digital accelerators, ReRAM-based accelerators, and GPUs, respectively.
Submitted 24 December, 2019;
originally announced December 2019.
-
Maximized Lateral Inhibition in Paired Magnetic Domain Wall Racetracks for Neuromorphic Computing
Authors:
C. Cui,
O. G. Akinola,
N. Hassan,
C. H. Bennett,
M. J. Marinella,
J. S. Friedman,
J. A. C. Incorvia
Abstract:
Lateral inhibition is an important functionality in neuromorphic computing, modeled after the biological neuron behavior that a firing neuron deactivates its neighbors belonging to the same layer and prevents them from firing. In most neuromorphic hardware platforms lateral inhibition is implemented by external circuitry, thereby decreasing the energy efficiency and increasing the area overhead of such systems. Recently, the domain wall-magnetic tunnel junction (DW-MTJ) artificial neuron was demonstrated in modeling to be inherently inhibitory. Without peripheral circuitry, lateral inhibition in DW-MTJ neurons results from magnetostatic interaction between neighboring neuron cells. However, the lateral inhibition mechanism in DW-MTJ neurons has not been studied thoroughly, leading to weak inhibition only in very closely-spaced devices. This work approaches these problems by modeling current- and field-driven DW motion in a pair of adjacent DW-MTJ neurons. We maximize the magnitude of lateral inhibition by tuning the magnetic interaction between the neurons. The results are explained by current-driven DW velocity characteristics in response to external magnetic field and quantified by an analytical model. Finally, the dependence of lateral inhibition strength on device parameters is investigated. This provides a guideline for the optimization of lateral inhibition implementation in DW-MTJ neurons. With strong lateral inhibition achieved, a path towards competitive learning algorithms such as winner-take-all is made possible on such neuromorphic devices.
Submitted 10 December, 2019;
originally announced December 2019.
-
Shape-based Magnetic Domain Wall Drift for an Artificial Spintronic Leaky Integrate-and-Fire Neuron
Authors:
Wesley H. Brigner,
Naimul Hassan,
Lucian Jiang-Wei,
Xuan Hu,
Diptish Saha,
Christopher H. Bennett,
Matthew J. Marinella,
Jean Anne C. Incorvia,
Felipe Garcia-Sanchez,
Joseph S. Friedman
Abstract:
Spintronic devices based on domain wall (DW) motion through ferromagnetic nanowire tracks have received great interest as components of neuromorphic information processing systems. Previous proposals for spintronic artificial neurons required external stimuli to perform the leaking functionality, one of the three fundamental functions of a leaky integrate-and-fire (LIF) neuron. The use of this external magnetic field or electrical current stimulus results in either a decrease in energy efficiency or an increase in fabrication complexity. In this work, we modify the shape of previously demonstrated three-terminal magnetic tunnel junction neurons to perform the leaking operation without any external stimuli. The trapezoidal structure causes shape-based DW drift, thus intrinsically providing the leaking functionality with no hardware cost. This LIF neuron therefore promises to advance the development of spintronic neural network crossbar arrays.
Submitted 14 May, 2019;
originally announced May 2019.
-
Using Floating Gate Memory to Train Ideal Accuracy Neural Networks
Authors:
Sapan Agarwal,
Diana Garland,
John Niroula,
Robin B. Jacobs-Gedrim,
Alex Hsia,
Michael S. Van Heukelom,
Elliot Fuller,
Bruce Draper,
Matthew J. Marinella
Abstract:
Floating gate SONOS (Silicon-Oxygen-Nitrogen-Oxygen-Silicon) transistors can be used to train neural networks to ideal accuracies that match those of floating point digital weights on the MNIST dataset when using multiple devices to represent a weight or within 1% of ideal accuracy when using a single device. This is enabled by operating devices in the subthreshold regime, where they exhibit symmetric write nonlinearities. A neural training accelerator core based on SONOS with a single device per weight would increase energy efficiency by 120X, operate 2.1X faster and require 5X lower area than an optimized SRAM based ASIC.
Submitted 27 February, 2019; v1 submitted 29 January, 2019;
originally announced January 2019.
-
Electroforming-Free TaOx Memristors using Focused Ion Beam Irradiations
Authors:
J. L. Pacheco,
D. L. Perry,
D. R. Hughart,
M. Marinella,
E. Bielejec
Abstract:
We demonstrate creation of electroforming-free TaOx memristive devices using focused ion beam irradiations to locally define conductive filaments in TaOx films. Electrical characterization shows that these irradiations directly create fully functional memristors without the need for electroforming. Ion beam forming of conductive filaments combined with state-of-the-art nano-patterning presents a CMOS compatible approach to wafer level fabrication of fully formed and operational memristors.
Submitted 25 October, 2017;
originally announced October 2017.
-
Multiscale Co-Design Analysis of Energy, Latency, Area, and Accuracy of a ReRAM Analog Neural Training Accelerator
Authors:
Matthew J. Marinella,
Sapan Agarwal,
Alexander Hsia,
Isaac Richter,
Robin Jacobs-Gedrim,
John Niroula,
Steven J. Plimpton,
Engin Ipek,
Conrad D. James
Abstract:
Neural networks are an increasingly attractive algorithm for natural language processing and pattern recognition. Deep networks with >50M parameters are made possible by modern GPU clusters operating at <50 pJ per operation and, more recently, production accelerators capable of <5 pJ per operation at the board level. However, with the slowing of CMOS scaling, new paradigms will be required to achieve the next several orders of magnitude in performance-per-watt gains. Using an analog resistive memory (ReRAM) crossbar to perform key matrix operations in an accelerator is an attractive option. This work presents a detailed design, using a state-of-the-art 14/16 nm PDK, of an analog crossbar circuit block designed to process three key kernels required in training and inference of neural networks. A detailed circuit and device-level analysis of energy, latency, area, and accuracy is given and compared to relevant designs using standard digital ReRAM and SRAM operations. It is shown that the analog accelerator has a 270X energy and 540X latency advantage over a similar block utilizing only digital ReRAM and takes only 11 fJ per multiply and accumulate (MAC). Compared to an SRAM-based accelerator, the energy is 430X better and latency is 34X better. Although training accuracy is degraded in the analog accelerator, several options to improve this are presented. The possible gains over a similar digital-only version of this accelerator block suggest that continued optimization of analog resistive memories is valuable. This detailed circuit and device analysis of a training accelerator may serve as a foundation for further architecture-level studies.
Submitted 16 February, 2018; v1 submitted 31 July, 2017;
originally announced July 2017.
-
Degenerate Resistive Switching and Ultrahigh Density Storage in Resistive Memory
Authors:
Andrew J. Lohn,
Patrick R. Mickel,
Conrad D. James,
Matthew J. Marinella
Abstract:
We show that, in tantalum oxide resistive memories, activation power provides a multi-level variable for information storage that can be set and read separately from the resistance. These two state variables (resistance and activation power) can be precisely controlled in two steps: (1) the possible activation power states are selected by partially reducing resistance, then (2) a subsequent partial increase in resistance specifies the resistance state and the final activation power state. We show that these states can be precisely written and read electrically, making this approach potentially amenable for ultra-high density memories. We provide a theoretical explanation for information storage and retrieval from activation power and experimentally demonstrate information storage in a third dimension related to the change in activation power with resistance.
Submitted 16 June, 2014;
originally announced June 2014.