-
A 64-core mixed-signal in-memory compute chip based on phase-change memory for deep neural network inference
Authors:
Manuel Le Gallo,
Riduan Khaddam-Aljameh,
Milos Stanisavljevic,
Athanasios Vasilopoulos,
Benedikt Kersting,
Martino Dazzi,
Geethan Karunaratne,
Matthias Braendli,
Abhairaj Singh,
Silvia M. Mueller,
Julian Buechel,
Xavier Timoneda,
Vinay Joshi,
Urs Egger,
Angelo Garofalo,
Anastasios Petropoulos,
Theodore Antonakopoulos,
Kevin Brew,
Samuel Choi,
Injo Ok,
Timothy Philip,
Victor Chan,
Claire Silvestre,
Ishtiaq Ahsan,
Nicole Saulnier
, et al. (4 additional authors not shown)
Abstract:
The need to repeatedly shuttle synaptic weight values from memory to processing units has been a key source of energy inefficiency in hardware implementations of artificial neural networks. Analog in-memory computing (AIMC) with spatially instantiated synaptic weights holds high promise to overcome this challenge by performing matrix-vector multiplications (MVMs) directly within the network weights stored on a chip to execute an inference workload. However, to achieve end-to-end improvements in latency and energy consumption, AIMC must be combined with on-chip digital operations and communication to move towards configurations in which a full inference workload is realized entirely on-chip. Moreover, it is highly desirable to achieve high MVM and inference accuracy without application-wise re-tuning of the chip. Here, we present a multi-core AIMC chip designed and fabricated in 14-nm complementary metal-oxide-semiconductor (CMOS) technology with backend-integrated phase-change memory (PCM). The fully integrated chip features 64 AIMC cores, each of size 256x256, interconnected via an on-chip communication network. It also implements the digital activation functions and processing involved in ResNet convolutional neural networks and long short-term memory (LSTM) networks. We demonstrate near software-equivalent inference accuracy with ResNet and LSTM networks while implementing all the computations associated with the weight layers and the activation functions on-chip. The chip can achieve a maximal throughput of 63.1 TOPS at an energy efficiency of 9.76 TOPS/W for 8-bit input/output matrix-vector multiplications.
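The headline figures in this abstract can be cross-checked with simple arithmetic. The sketch below is only a back-of-envelope estimate: it assumes that one multiply-accumulate counts as two operations and that all 64 cores perform their MVMs in parallel, neither of which is stated in the abstract. The function names are illustrative.

```python
# Back-of-envelope check of the figures quoted in the abstract.
# Assumptions (NOT from the abstract): one multiply-accumulate counts as
# two operations, and all 64 cores run one MVM each in parallel.

CORES = 64                     # from the abstract
ROWS = COLS = 256              # from the abstract
THROUGHPUT_TOPS = 63.1         # from the abstract
EFFICIENCY_TOPS_PER_W = 9.76   # from the abstract


def implied_power_watts(tops: float, tops_per_watt: float) -> float:
    """Power implied by dividing throughput by energy efficiency."""
    return tops / tops_per_watt


def ops_per_chip_mvm(cores: int, rows: int, cols: int) -> int:
    """Operations performed when every core executes one rows x cols MVM."""
    return cores * 2 * rows * cols


def implied_mvm_latency_ns(tops: float, ops: int) -> float:
    """Time per chip-wide MVM step implied by the peak throughput."""
    return ops / (tops * 1e12) * 1e9


power_w = implied_power_watts(THROUGHPUT_TOPS, EFFICIENCY_TOPS_PER_W)
chip_ops = ops_per_chip_mvm(CORES, ROWS, COLS)
latency_ns = implied_mvm_latency_ns(THROUGHPUT_TOPS, chip_ops)

print(f"implied power at peak throughput: {power_w:.2f} W")
print(f"operations per chip-wide MVM step: {chip_ops}")
print(f"implied time per MVM step: {latency_ns:.1f} ns")
```

Under these assumptions the quoted throughput and efficiency imply a power draw of roughly 6.5 W at peak throughput.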
Submitted 6 December, 2022;
originally announced December 2022.
-
A system design approach toward integrated cryogenic quantum control systems
Authors:
Mridula Prathapan,
Peter Mueller,
David Heim,
Maria Vittoria Oropallo,
Matthias Braendli,
Pier Andrea Francese,
Marcel Kossel,
Andrea Ruffino,
Cezar Zota,
Eunjung Cha,
Thomas Morf
Abstract:
In this paper, we provide a system-level perspective on the design of control electronics for large-scale quantum systems. Quantum computing systems with high-fidelity control and readout, coherent coupling, calibrated gates, and reconfigurable circuits with low error rates are expected to have superior quantum volumes. Cryogenic CMOS plays a crucial role in the realization of scalable quantum computers by minimizing feature size, lowering cost and power consumption, and implementing low-latency error correction. Our approach toward achieving scalable feedback-based control systems includes the design of memory-based arbitrary waveform generators (AWGs), wide-band radio-frequency analog-to-digital converters, an integrated amplifier chain, and state discriminators that can be synchronized with gate sequences. Digitally assisted designs, when implemented in an advanced CMOS node such as 7 nm, can reap the benefits of low power due to scaling. A qubit readout chain demands several amplification stages before the digitizer. We propose the co-integration of our in-house developed InP HEMT LNAs with CMOS LNA stages to achieve the required gain at the digitizer input with minimal area. Our approach using high impedance matching between the HEMT LNA and the cryogenic CMOS receiver can relax the design constraints of an inverter-based CMOS LNA, paving the way toward a fully integrated qubit readout chain. The qubit state discriminator consists of a digital signal processor that computes the qubit state from the digitizer output and a pre-determined threshold. The proposed system realizes feedback-based optimal control for error mitigation and reduction of the required data rate through the serial interface to room temperature electronics.
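The state discriminator described in this abstract, which derives a qubit state from the digitizer output and a pre-determined threshold, can be illustrated with a minimal sketch. Averaging the samples before thresholding is an assumption made here for illustration; the abstract does not say how the digital signal processor reduces the digitizer output. The function name and sample values are hypothetical.

```python
from statistics import fmean
from typing import Sequence


def discriminate(samples: Sequence[float], threshold: float) -> int:
    """Return 1 if the averaged readout exceeds the threshold, else 0.

    Averaging is an illustrative assumption, not the paper's stated method.
    """
    if not samples:
        raise ValueError("at least one digitizer sample is required")
    return 1 if fmean(samples) > threshold else 0


# Hypothetical digitizer traces (arbitrary units) and threshold.
low_trace = [0.10, 0.20, 0.15]
high_trace = [0.90, 0.80, 1.00]
THRESHOLD = 0.5

state_low = discriminate(low_trace, THRESHOLD)
state_high = discriminate(high_trace, THRESHOLD)
print(state_low, state_high)
```

A discriminator of this kind forwards a single bit per readout instead of the full sample trace, which is consistent with the reduction in serial-interface data rate that the abstract mentions.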
Submitted 3 November, 2022;
originally announced November 2022.
-
A cryogenic SRAM based arbitrary waveform generator in 14 nm for spin qubit control
Authors:
Mridula Prathapan,
Peter Mueller,
Christian Menolfi,
Matthias Braendli,
Marcel Kossel,
Pier Andrea Francese,
David Heim,
Maria Vittoria Oropallo,
Andrea Ruffino,
Cezar Zota,
Thomas Morf
Abstract:
Realization of qubit gate sequences requires coherent microwave control pulses with programmable amplitude, duration, spacing and phase. We propose an SRAM-based arbitrary waveform generator (AWG) for cryogenic control of spin qubits. In this work, we demonstrate the cryogenic operation of a fully programmable radio-frequency arbitrary waveform generator in 14 nm FinFET technology. The waveform sequence from a control processor can be stored in an SRAM array, which can be programmed in real time. The waveform pattern is converted to microwave pulses by a source-series-terminated digital-to-analog converter. The chip is operational at 4 K and is capable of generating an arbitrary envelope shape at the desired carrier frequency. Total power consumption of the AWG is 40-140 mW at 4 K, depending on the baud rate. A wide signal band of 1-17 GHz is measured at 4 K, while multiple-qubit control can be achieved using frequency division multiplexing at an average spurious-free dynamic range of 40 dB. This work paves the way to optimal qubit control and closed-loop feedback control, which is necessary to achieve low-latency error mitigation.
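The idea of playing a stored envelope at a chosen carrier frequency, and of combining several carriers by frequency division multiplexing, can be sketched in a few lines. This is a software illustration only, not the chip's signal path: the function names, the sample rate, and the carrier frequencies are hypothetical values chosen for the example.

```python
import math
from typing import List, Sequence


def play_waveform(envelope_memory: Sequence[float], carrier_hz: float,
                  sample_rate_hz: float, phase_rad: float = 0.0) -> List[float]:
    """Modulate a stored envelope onto a cosine carrier, sample by sample."""
    step = 2.0 * math.pi * carrier_hz / sample_rate_hz
    return [a * math.cos(step * n + phase_rad)
            for n, a in enumerate(envelope_memory)]


def multiplex(channels: Sequence[Sequence[float]]) -> List[float]:
    """Frequency division multiplexing: add equal-length channels sample-wise."""
    return [sum(samples) for samples in zip(*channels)]


# Hypothetical envelope table and rates (not the chip's actual parameters).
envelope = [0.0, 0.5, 1.0, 0.5]
SAMPLE_RATE_HZ = 40e9
tone_a = play_waveform(envelope, 5e9, SAMPLE_RATE_HZ)
tone_b = play_waveform(envelope, 7e9, SAMPLE_RATE_HZ)
combined = multiplex([tone_a, tone_b])
print(combined)
```

Each channel keeps its own carrier, so qubits addressed at different frequencies can share one output line in this simplified picture.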
Submitted 3 November, 2022;
originally announced November 2022.
-
TEMPus VoLA: the Timed Epstein Multi-pressure Vessel at Low Accelerations
Authors:
Holly L. Capelo,
Jonas Kühn,
Antoine Pommerol,
Daniele Piazza,
Mathias Brändli,
Romain Cerubini,
Bernhard Jost,
Jean-David Bodénan,
Thomas Planchet,
Stefano Spadaccia,
Rainer Schräpler,
Jürgen Blum,
Maria Schönbächler,
Lucio Mayer,
Nicolas Thomas
Abstract:
The field of planetary system formation relies extensively on our understanding of the aerodynamic interaction between gas and dust in protoplanetary disks. Of particular importance are the mechanisms triggering fluid instabilities and clumping of dust particles into aggregates, and their subsequent inclusion into planetesimals. We introduce the Timed Epstein Multi-pressure Vessel at Low Accelerations (TEMPus VoLA), which is an experimental apparatus for the study of particle dynamics and rarefied gas under micro-gravity conditions. This facility contains three experiments dedicated to studying aerodynamic processes: i) the development of pressure gradients due to collective particle-gas interaction, ii) the drag coefficients of dust aggregates with variable particle-gas velocity, and iii) the effect of dust on the profile of a shear flow and the resultant onset of turbulence. The approach is innovative with respect to previous experiments because we access an untouched parameter space in terms of dust particle packing fraction and Knudsen, Stokes, and Reynolds numbers. The mechanisms investigated are also relevant for our understanding of the emission of dust from active surfaces such as cometary nuclei, and new experimental data will help interpret previous datasets (Rosetta) and prepare future spacecraft observations (Comet Interceptor). We report on the performance of the experiments, which has been tested over the course of multiple flight campaigns. The project is now ready to benefit from additional flight campaigns to cover a wide parameter space. The outcome will be a comprehensive framework to test models and numerical recipes for studying collective dust particle aerodynamics under space-like conditions.
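The three dimensionless numbers named in this abstract have standard textbook definitions, sketched below. The choice of particle radius as the reference length for the Knudsen number, and the flow-based form of the Stokes number, are common conventions assumed here; the abstract does not specify which definitions the authors use. All input values are hypothetical.

```python
def knudsen_number(mean_free_path_m: float, particle_radius_m: float) -> float:
    """Kn = gas mean free path / particle radius (one common convention)."""
    return mean_free_path_m / particle_radius_m


def reynolds_number(gas_density_kg_m3: float, speed_m_s: float,
                    length_m: float, dynamic_viscosity_pa_s: float) -> float:
    """Re = rho * v * L / mu, the ratio of inertial to viscous forces."""
    return gas_density_kg_m3 * speed_m_s * length_m / dynamic_viscosity_pa_s


def stokes_number(stopping_time_s: float, flow_speed_m_s: float,
                  length_m: float) -> float:
    """St = particle stopping time / flow time scale (L / v)."""
    return stopping_time_s * flow_speed_m_s / length_m


# Hypothetical values for illustration only.
kn = knudsen_number(mean_free_path_m=1e-3, particle_radius_m=1e-4)
re = reynolds_number(gas_density_kg_m3=1.0, speed_m_s=2.0,
                     length_m=0.5, dynamic_viscosity_pa_s=0.1)
st = stokes_number(stopping_time_s=0.2, flow_speed_m_s=1.0, length_m=0.5)
print(kn, re, st)
```

In the usual usage, the Epstein drag regime referenced in the apparatus name applies when the gas mean free path is large compared with the particle size, that is, at large Knudsen number.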
Submitted 31 August, 2022;
originally announced September 2022.