-
Parallel Matrix Multiplication Using Voltage Controlled Magnetic Anisotropy Domain Wall Logic
Authors:
Nicholas Zogbi,
Samuel Liu,
Christopher H. Bennett,
Sapan Agarwal,
Matthew J. Marinella,
Jean Anne C. Incorvia,
T. Patrick Xiao
Abstract:
The domain wall-magnetic tunnel junction (DW-MTJ) is a versatile device that can simultaneously store data and perform computations. These three-terminal devices are promising for digital logic due to their nonvolatility, low-energy operation, and radiation hardness. Here, we augment the DW-MTJ logic gate with voltage controlled magnetic anisotropy (VCMA) to improve the reliability of logical concatenation in the presence of realistic process variations. VCMA creates potential wells that allow for reliable and repeatable localization of domain walls. The DW-MTJ logic gate supports different fanouts, allowing for multiple inputs and outputs for a single device without affecting area. We simulate a systolic array of DW-MTJ Multiply-Accumulate (MAC) units with 4-bit and 8-bit precision, which uses the nonvolatility of DW-MTJ logic gates to enable fine-grained pipelining and high parallelism. The DW-MTJ systolic array provides comparable throughput and efficiency to state-of-the-art CMOS systolic arrays while being radiation-hard. These results improve the feasibility of using domain wall-based processors, especially for extreme-environment applications such as space.
Submitted 14 April, 2023; v1 submitted 26 January, 2023;
originally announced January 2023.
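The abstract above mentions a systolic array of multiply-accumulate (MAC) units at 4-bit and 8-bit precision. As a rough illustration of the dataflow only, the following minimal sketch emulates an output-stationary systolic array in plain Python. It is not the authors' simulator: the function name `systolic_matmul`, the skewed schedule (processing element (i, j) consumes operand pair k at cycle i + j + k), and the signed-integer range check are all assumptions made for this example.

```python
def systolic_matmul(a, b, bits=8):
    """Emulate an output-stationary systolic array computing a @ b.

    Hypothetical sketch: each processing element (i, j) holds one
    accumulator and performs at most one MAC per cycle. Operands are
    assumed to be signed integers that fit in `bits` bits.
    """
    n, k_dim, p = len(a), len(b), len(b[0])
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    for row in a + b:
        for value in row:
            if not lo <= value <= hi:
                raise ValueError(f"{value} does not fit in {bits} signed bits")
    if any(len(row) != k_dim for row in a):
        raise ValueError("inner dimensions of a and b do not match")

    acc = [[0] * p for _ in range(n)]
    # With skewed inputs, the last MAC happens at cycle (n-1)+(p-1)+(k_dim-1).
    cycles = n + p + k_dim - 2
    for t in range(cycles):
        for i in range(n):
            for j in range(p):
                k = t - i - j  # operand pair reaching PE (i, j) this cycle
                if 0 <= k < k_dim:
                    acc[i][j] += a[i][k] * b[k][j]
    return acc, cycles


product, cycles = systolic_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])
```

The returned cycle count makes the pipelining visible: a 2x2 by 2x2 product finishes in 4 cycles, because every processing element works in parallel once its operands arrive.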
-
An out-of-distribution discriminator based on Bayesian neural network epistemic uncertainty
Authors:
Ethan Ancell,
Christopher Bennett,
Bert Debusschere,
Sapan Agarwal,
Park Hays,
T. Patrick Xiao
Abstract:
Neural networks have revolutionized the field of machine learning with increased predictive capability. In addition to improving the predictions of neural networks, there is a simultaneous demand for reliable uncertainty quantification on estimates made by machine learning methods such as neural networks. Bayesian neural networks (BNNs) are an important type of neural network with built-in capability for quantifying uncertainty. This paper discusses aleatoric and epistemic uncertainty in BNNs and how they can be calculated. With an example dataset of images where the goal is to identify the amplitude of an event in the image, it is shown that epistemic uncertainty tends to be lower in images which are well-represented in the training dataset and tends to be high in images which are not well-represented. An algorithm for out-of-distribution (OoD) detection with BNN epistemic uncertainty is introduced along with various experiments demonstrating factors influencing the OoD detection capability in a BNN. The OoD detection capability with epistemic uncertainty is shown to be comparable to the OoD detection in the discriminator network of a generative adversarial network (GAN) with comparable network architecture.
Submitted 9 August, 2023; v1 submitted 18 October, 2022;
originally announced October 2022.
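The abstract above distinguishes aleatoric from epistemic uncertainty and uses the latter for out-of-distribution (OoD) detection. The abstract does not give the algorithm, so the sketch below assumes a commonly used decomposition: each stochastic forward pass of the BNN yields a predicted mean and a predicted variance, aleatoric uncertainty is the average of the predicted variances, and epistemic uncertainty is the variance of the predicted means. The function names and the fixed `threshold` are hypothetical and chosen for illustration only.

```python
from statistics import fmean, pvariance


def decompose_uncertainty(samples):
    """Split predictive uncertainty into aleatoric and epistemic parts.

    `samples` is a list of (predicted_mean, predicted_variance) pairs,
    one per weight draw from the BNN posterior (an assumed interface).
    """
    means = [mean for mean, _ in samples]
    variances = [variance for _, variance in samples]
    aleatoric = fmean(variances)   # average noise the model predicts
    epistemic = pvariance(means)   # disagreement between weight draws
    return aleatoric, epistemic


def is_out_of_distribution(samples, threshold):
    """Flag an input whose epistemic uncertainty exceeds a chosen threshold."""
    _, epistemic = decompose_uncertainty(samples)
    return epistemic > threshold


# Two weight draws that disagree strongly about the predicted amplitude.
flagged = is_out_of_distribution([(0.0, 0.1), (2.0, 0.1)], threshold=0.5)
```

In practice the threshold would have to be calibrated on held-out in-distribution data; the value here is arbitrary.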
-
Equivalence of coupled parametric oscillator dynamics to Lagrange multiplier primal-dual optimization
Authors:
Sri Krishna Vadlamani,
Tianyao Patrick Xiao,
Eli Yablonovitch
Abstract:
There has been a recent surge of interest in physics-based solvers for combinatorial optimization problems. We present a dynamical solver for the Ising problem that is comprised of a network of coupled parametric oscillators and show that it implements Lagrange multiplier constrained optimization. We show that the pump depletion effect, which is intrinsic to parametric oscillators, enforces binary constraints and enables the system's continuous analog variables to converge to the optimal binary solutions to the optimization problem. Moreover, there is an exact correspondence between the equations of motion for the coupled oscillators and the update rules in the primal-dual method of Lagrange multipliers. Though our analysis is performed using electrical LC oscillators, it can be generalized to any system of coupled parametric oscillators. We simulate the dynamics of the coupled oscillator system and demonstrate that the performance of the solver on a set of benchmark problems is comparable to the best-known results obtained by digital algorithms in the literature.
Submitted 5 April, 2022;
originally announced April 2022.
-
Metaplastic and Energy-Efficient Biocompatible Graphene Artificial Synaptic Transistors for Enhanced Accuracy Neuromorphic Computing
Authors:
Dmitry Kireev,
Samuel Liu,
Harrison **,
T. Patrick Xiao,
Christopher H. Bennett,
Deji Akinwande,
Jean Anne Incorvia
Abstract:
CMOS-based computing systems that employ the von Neumann architecture are relatively limited when it comes to parallel data storage and processing. In contrast, the human brain is a living computational signal processing unit that operates with extreme parallelism and energy efficiency. Although numerous neuromorphic electronic devices have emerged in the last decade, most of them are rigid or contain materials that are toxic to biological systems. In this work, we report on biocompatible bilayer graphene-based artificial synaptic transistors (BLAST) capable of mimicking synaptic behavior. The BLAST devices leverage a dry ion-selective membrane, enabling long-term potentiation, with ~50 aJ/m^2 switching energy efficiency, at least an order of magnitude lower than previous reports on two-dimensional material-based artificial synapses. The devices show unique metaplasticity, a useful feature for generalizable deep neural networks, and we demonstrate that metaplastic BLASTs outperform ideal linear synapses in classic image classification tasks. With switching energy well below the 1 fJ energy estimated per biological synapse, the proposed devices are powerful candidates for bio-interfaced online learning, bridging the gap between artificial and biological neural networks.
Submitted 8 March, 2022;
originally announced March 2022.
-
Shape-Dependent Multi-Weight Magnetic Artificial Synapses for Neuromorphic Computing
Authors:
Thomas Leonard,
Samuel Liu,
Mahshid Alamdar,
Can Cui,
Otitoaleke G. Akinola,
Lin Xue,
T. Patrick Xiao,
Joseph S. Friedman,
Matthew J. Marinella,
Christopher H. Bennett,
Jean Anne C. Incorvia
Abstract:
In neuromorphic computing, artificial synapses provide a multi-weight conductance state that is set based on inputs from neurons, analogous to the brain. Additional properties of the synapse beyond multiple weights can be needed, and can depend on the application, requiring the need for generating different synapse behaviors from the same materials. Here, we measure artificial synapses based on magnetic materials that use a magnetic tunnel junction and a magnetic domain wall. By fabricating lithographic notches in a domain wall track underneath a single magnetic tunnel junction, we achieve 4-5 stable resistance states that can be repeatably controlled electrically using spin orbit torque. We analyze the effect of geometry on the synapse behavior, showing that a trapezoidal device has asymmetric weight updates with high controllability, while a straight device has higher stochasticity, but with stable resistance levels. The device data is input into neuromorphic computing simulators to show the usefulness of application-specific synaptic functions. Implementing an artificial neural network applied on streamed Fashion-MNIST data, we show that the trapezoidal magnetic synapse can be used as a metaplastic function for efficient online learning. Implementing a convolutional neural network for CIFAR-100 image recognition, we show that the straight magnetic synapse achieves near-ideal inference accuracy, due to the stability of its resistance levels. This work shows multi-weight magnetic synapses are a feasible technology for neuromorphic computing and provides design guidelines for emerging artificial synapse technologies.
Submitted 17 February, 2022; v1 submitted 22 November, 2021;
originally announced November 2021.
-
On the Accuracy of Analog Neural Network Inference Accelerators
Authors:
T. Patrick Xiao,
Ben Feinberg,
Christopher H. Bennett,
Venkatraman Prabhakar,
Prashant Saxena,
Vineet Agrawal,
Sapan Agarwal,
Matthew J. Marinella
Abstract:
Specialized accelerators have recently garnered attention as a method to reduce the power consumption of neural network inference. A promising category of accelerators utilizes nonvolatile memory arrays to both store weights and perform in situ analog computation inside the array. While prior work has explored the design space of analog accelerators to optimize performance and energy efficiency, there is seldom a rigorous evaluation of the accuracy of these accelerators. This work shows how architectural design decisions, particularly in mapping neural network parameters to analog memory cells, influence inference accuracy. When evaluated using ResNet50 on ImageNet, the resilience of the system to analog non-idealities (cell programming errors, analog-to-digital converter resolution, and array parasitic resistances) improves in every case when analog quantities in the hardware are made proportional to the weights in the network. Moreover, contrary to the assumptions of prior work, nearly equivalent resilience to cell imprecision can be achieved by fully storing weights as analog quantities, rather than spreading weight bits across multiple devices, often referred to as bit slicing. By exploiting proportionality, analog system designers have the freedom to match the precision of the hardware to the needs of the algorithm, rather than attempting to guarantee the same level of precision in the intermediate results as an equivalent digital accelerator. This ultimately results in an analog accelerator that is more accurate, more robust to analog errors, and more energy-efficient.
Submitted 3 February, 2022; v1 submitted 2 September, 2021;
originally announced September 2021.
-
Domain Wall-Magnetic Tunnel Junction Spin Orbit Torque Devices and Circuits for In-Memory Computing
Authors:
Mahshid Alamdar,
Thomas Leonard,
Can Cui,
Bishweshwor P. Rimal,
Lin Xue,
Otitoaleke G. Akinola,
T. Patrick Xiao,
Joseph S. Friedman,
Christopher H. Bennett,
Matthew J. Marinella,
Jean Anne C. Incorvia
Abstract:
There are pressing problems with traditional computing, especially for accomplishing data-intensive and real-time tasks, that motivate the development of in-memory computing devices to both store information and perform computation. Magnetic tunnel junction (MTJ) memory elements can be used for computation by manipulating a domain wall (DW), a transition region between magnetic domains. However, these devices have suffered from challenges: spin transfer torque (STT) switching of a DW requires high current, and the multiple etch steps needed to create an MTJ pillar on top of a DW track have led to reduced tunnel magnetoresistance (TMR). These issues have limited experimental study of devices and circuits. Here, we study prototypes of three-terminal domain wall-magnetic tunnel junction (DW-MTJ) in-memory computing devices that can address data processing bottlenecks and resolve these challenges by using perpendicular magnetic anisotropy (PMA), spin-orbit torque (SOT) switching, and an optimized lithography process to produce average device tunnel magnetoresistance TMR = 164%, resistance-area product RA = 31 Ω-μm^2, close to the RA of the unpatterned film, and lower switching current density compared to using spin transfer torque. A two-device circuit shows bit propagation between devices. Device initialization variation in switching voltage is shown to be curtailed to 7% by controlling the DW initial position, which we show corresponds to 96% accuracy in a DW-MTJ full adder simulation. These results make strides in using MTJs and DWs for in-memory and neuromorphic computing applications.
Submitted 26 October, 2020;
originally announced October 2020.
-
Physics Successfully Implements Lagrange Multiplier Optimization
Authors:
Sri Krishna Vadlamani,
Tianyao Patrick Xiao,
Eli Yablonovitch
Abstract:
Optimization is a major part of human effort. While being mathematical, optimization is also built into physics. For example, physics has the principle of Least Action, the principle of Minimum Entropy Generation, and the Variational Principle. Physics also has physical annealing which, of course, preceded computational Simulated Annealing. Physics has the Adiabatic Principle, which in its quantum form is called Quantum Annealing. Thus, physical machines can solve the mathematical problem of optimization, including constraints. Binary constraints can be built into the physical optimization. In that case the machines are digital in the same sense that a flip-flop is digital. A wide variety of machines have had recent success at optimizing the Ising magnetic energy. We demonstrate in this paper that almost all those machines perform optimization according to the Principle of Minimum Entropy Generation as put forth by Onsager. Further, we show that this optimization is in fact equivalent to Lagrange multiplier optimization for constrained problems. We find that the physical gain coefficients which drive those systems actually play the role of the corresponding Lagrange Multipliers.
Submitted 21 July, 2020; v1 submitted 11 July, 2020;
originally announced July 2020.
-
Device-aware inference operations in SONOS nonvolatile memory arrays
Authors:
Christopher H. Bennett,
T. Patrick Xiao,
Ryan Dellana,
Vineet Agrawal,
Ben Feinberg,
Venkatraman Prabhakar,
Krishnaswamy Ramkumar,
Long Hinh,
Swatilekha Saha,
Vijay Raghavan,
Ramesh Chettuvetty,
Sapan Agarwal,
Matthew J. Marinella
Abstract:
Non-volatile memory arrays can deploy pre-trained neural network models for edge inference. However, these systems are affected by device-level noise and retention issues. Here, we examine damage caused by these effects, introduce a mitigation strategy, and demonstrate its use in a fabricated array of SONOS (Silicon-Oxide-Nitride-Oxide-Silicon) devices. On MNIST, Fashion-MNIST, and CIFAR-10 tasks, our approach increases resilience to synaptic noise and drift. We also show that strong performance can be realized with ADCs of 5-8 bits of precision.
Submitted 2 April, 2020;
originally announced April 2020.
-
Evaluating complexity and resilience trade-offs in emerging memory inference machines
Authors:
Christopher H. Bennett,
Ryan Dellana,
T. Patrick Xiao,
Ben Feinberg,
Sapan Agarwal,
Suma Cardwell,
Matthew J. Marinella,
William Severa,
Brad Aimone
Abstract:
Neuromorphic-style inference only works well if limited hardware resources are maximized properly, e.g. accuracy continues to scale with parameters and complexity in the face of potential disturbance. In this work, we use realistic crossbar simulations to highlight that compact implementations of deep neural networks are unexpectedly susceptible to collapse from multiple system disturbances. Our work proposes a middle path towards high performance and strong resilience utilizing the Mosaics framework, and specifically by re-using synaptic connections in a recurrent neural network implementation that possesses a natural form of noise-immunity.
Submitted 25 February, 2020;
originally announced March 2020.
-
Plasticity-Enhanced Domain-Wall MTJ Neural Networks for Energy-Efficient Online Learning
Authors:
Christopher H. Bennett,
T. Patrick Xiao,
Can Cui,
Naimul Hassan,
Otitoaleke G. Akinola,
Jean Anne C. Incorvia,
Alvaro Velasquez,
Joseph S. Friedman,
Matthew J. Marinella
Abstract:
Machine learning implements backpropagation via abundant training samples. We demonstrate a multi-stage learning system realized by a promising non-volatile memory device, the domain-wall magnetic tunnel junction (DW-MTJ). The system consists of unsupervised (clustering) as well as supervised sub-systems, and generalizes quickly (with few samples). We demonstrate interactions between physical properties of this device and optimal implementation of neuroscience-inspired plasticity learning rules, and highlight performance on a suite of tasks. Our energy analysis confirms the value of the approach, as the learning budget stays below 20 μJ even for large tasks used typically in machine learning.
Submitted 4 March, 2020;
originally announced March 2020.
-
Ultra-Efficient Thermophotovoltaics Exploiting Spectral Filtering by the Photovoltaic Band-Edge
Authors:
Vidya Ganapati,
T. Patrick Xiao,
Eli Yablonovitch
Abstract:
Thermophotovoltaics convert thermal radiation from local heat sources to electricity. A new breakthrough in creating highly efficient thin-film solar cells can potentially enable thermophotovoltaic systems with unprecedented high efficiency. The current 28.8% single-junction solar efficiency record, by Alta Devices, was achieved by recognizing that a good solar cell needs to reflect infrared band-edge radiation at the back surface, to effectively recycle infrared luminescent photons. The effort to reflect band-edge luminescence in solar cells has serendipitously created the technology to reflect all infrared wavelengths, which can revolutionize thermophotovoltaics. We have never before had such high back reflectivity for sub-bandgap radiation, permitting step-function spectral control for the first time. Thus, contemporary efficiency advances in solar photovoltaic cells create the possibility of realizing a >50% efficient thermophotovoltaic system.
Submitted 7 February, 2018; v1 submitted 10 November, 2016;
originally announced November 2016.