Search | arXiv e-print repository

Training of Physical Neural Networks

Authors: Ali Momeni, Babak Rahmani, Benjamin Scellier, Logan G. Wright, Peter L. McMahon, Clara C. Wanjura, Yuhang Li, Anas Skalli, Natalia G. Berloff, Tatsuhiro Onodera, Ilker Oguz, Francesco Morichetti, Philipp del Hougne, Manuel Le Gallo, Abu Sebastian, Azalia Mirhoseini, Cheng Zhang, Danijela Marković, Daniel Brunner, Christophe Moser, Sylvain Gigan, Florian Marquardt, Aydogan Ozcan, Julie Grollier, Andrea J. Liu, et al. (3 additional authors not shown)

Abstract: Physical neural networks (PNNs) are a class of neural-like networks that leverage the properties of physical systems to perform computation. While PNNs are so far a niche research area with small-scale laboratory demonstrations, they are arguably one of the most underappreciated important opportunities in modern AI. Could we train AI models 1000x larger than current ones? Could we do this and also have them perform inference locally and privately on edge devices, such as smartphones or sensors? Research over the past few years has shown that the answer to all these questions is likely "yes, with enough research": PNNs could one day radically change what is possible and practical for AI systems. To do this will however require rethinking both how AI models work, and how they are trained - primarily by considering the problems through the constraints of the underlying hardware physics. To train PNNs at large scale, many methods including backpropagation-based and backpropagation-free approaches are now being explored. These methods have various trade-offs, and so far no method has been shown to scale to the same scale and performance as the backpropagation algorithm widely used in deep learning today. However, this is rapidly changing, and a diverse ecosystem of training techniques provides clues for how PNNs may one day be utilized to create both more efficient realizations of current-scale AI models, and to enable unprecedented-scale models.

Submitted 5 June, 2024; originally announced June 2024.

Comments: 29 pages, 4 figures

arXiv:2111.04961 [pdf]

Convolutional Neural Networks with Radio-Frequency Spintronic Nano-Devices

Authors: Nathan Leroux, Arnaud De Riz, Dédalo Sanz-Hernández, Danijela Marković, Alice Mizrahi, Julie Grollier

Abstract: Convolutional neural networks are state-of-the-art and ubiquitous in modern signal processing and machine vision. Nowadays, hardware solutions based on emerging nanodevices are designed to reduce the power consumption of these networks. Spintronics devices are promising for information processing because of the various neural and synaptic functionalities they offer. However, due to their low OFF/ON ratio, performing all the multiplications required for convolutions in a single step with a crossbar array of spintronic memories would cause sneak-path currents. Here we present an architecture where synaptic communications have a frequency selectivity that prevents crosstalk caused by sneak-path currents. We first demonstrate how a chain of spintronic resonators can function as synapses and make convolutions by sequentially rectifying radio-frequency signals encoding consecutive sets of inputs. We show that a parallel implementation is possible with multiple chains of spintronic resonators to avoid storing intermediate computational steps in memory. We propose two different spatial arrangements for these chains. For each of them, we explain how to tune many artificial synapses simultaneously, exploiting the synaptic weight sharing specific to convolutions. We show how information can be transmitted between convolutional layers by using spintronic oscillators as artificial microwave neurons.
Finally, we simulate a network of these radio-frequency resonators and spintronic oscillators to solve the MNIST handwritten digits dataset, and obtain results comparable to software convolutional neural networks. Since it can run convolutional neural networks fully in parallel in a single step with nano devices, the architecture proposed in this paper is promising for embedded applications requiring machine vision, such as autonomous driving.

Submitted 9 November, 2021; originally announced November 2021.

arXiv:2003.04711 [pdf]

Physics for Neuromorphic Computing

Authors: Danijela Markovic, Alice Mizrahi, Damien Querlioz, Julie Grollier

Abstract: Neuromorphic computing takes inspiration from the brain to create energy efficient hardware for information processing, capable of highly sophisticated tasks. In this article, we make the case that building this new hardware necessitates reinventing electronics. We show that research in physics and material science will be key to create artificial nano-neurons and synapses, to connect them together in huge numbers, to organize them in complex systems, and to compute with them efficiently. We describe how some researchers choose to take inspiration from artificial intelligence to move forward in this direction, whereas others prefer taking inspiration from neuroscience, and we highlight recent striking results obtained with these two approaches. Finally, we discuss the challenges and perspectives in neuromorphic physics, which include developing the algorithms and the hardware hand in hand, making significant advances with small toy systems, as well as building large scale networks.

Submitted 8 March, 2020; originally announced March 2020.

arXiv:1811.00309 [pdf, other]

doi 10.1063/1.5079305

Reservoir computing with the frequency, phase and amplitude of spin-torque nano-oscillators

Authors: Danijela Marković, Nathan Leroux, Mathieu Riou, Flavio Abreu Araujo, Jacob Torrejon, Damien Querlioz, Akio Fukushima, Shinji Yuasa, Juan Trastoy, Paolo Bortolotti, Julie Grollier

Abstract: Spin-torque nano-oscillators can emulate neurons at the nanoscale. Recent works show that the non-linearity of their oscillation amplitude can be leveraged to achieve waveform classification for an input signal encoded in the amplitude of the input voltage. Here we show that the frequency and the phase of the oscillator can also be used to recognize waveforms. For this purpose, we phase-lock the oscillator to the input waveform, which carries information in its modulated frequency. In this way we considerably decrease amplitude, phase and frequency noise. We show that this method allows classifying sine and square waveforms with an accuracy above 99% when decoding the output from the oscillator amplitude, phase or frequency. We find that recognition rates are directly related to the noise and non-linearity of each variable. These results prove that spin-torque nano-oscillators offer an interesting platform to implement different computing schemes leveraging their rich dynamical features.

Submitted 1 November, 2018; originally announced November 2018.

arXiv:1310.5527 [pdf, other]

doi 10.1016/j.physrep.2013.11.002

Power laws and Self-Organized Criticality in Theory and Nature

Authors: Dimitrije Markovic, Claudius Gros

Abstract: Power laws and distributions with heavy tails are common features of many experimentally studied complex systems, like the distribution of the sizes of earthquakes and solar flares, or the duration of neuronal avalanches in the brain. Previously, researchers surmised that a single general concept may act as a unifying underlying generative mechanism, with the theory of self organized criticality being a weighty contender. Consequently, a substantial amount of effort has gone into developing new and extended models and, hitherto, three classes of models have emerged. The first line of models is based on a separation between the time scales of drive and dissipation, and includes the original sandpile model and its extensions, like the dissipative earthquake model. Within this approach the steady state is close to criticality in terms of an absorbing phase transition. The second line of models is based on external drives and internal dynamics competing on similar time scales and includes the coherent noise model, which has a non-critical steady state characterized by heavy-tailed distributions. The third line of models proposes a non-critical self-organizing state, being guided by an optimization principle, such as the concept of highly optimized tolerance. We present a comparative overview regarding distinct modeling approaches together with a discussion of their potential relevance as underlying generative models for real-world phenomena.
The complexity of physical and biological scaling phenomena has been found to transcend the explanatory power of individual paradigmal concepts. The interaction between theoretical development and experimental observations has been very fruitful, leading to a series of novel concepts and insights.

Submitted 12 December, 2013; v1 submitted 21 October, 2013; originally announced October 2013.

Comments: Review article, Physics Reports, in press

Journal ref: Physics Reports 536, 41-74 (2014)

arXiv:0906.4905 [pdf, ps, other]

doi 10.1088/1367-2630/11/7/073002

Vertex routing models

Authors: Dimitrije Markovic, Claudius Gros

Abstract: A class of models describing the flow of information within networks via routing processes is proposed and investigated, concentrating on the effects of memory traces on the global properties. The long-term flow of information is governed by cyclic attractors, allowing to define a measure for the information centrality of a vertex given by the number of attractors passing through this vertex. We find the number of vertices having a non-zero information centrality to be extensive/sub-extensive for models with/without a memory trace in the thermodynamic limit. We evaluate the distribution of the number of cycles, of the cycle length and of the maximal basins of attraction, finding a complete scaling collapse in the thermodynamic limit for the latter. Possible implications of our results on the information flow in social networks are discussed.

Submitted 26 June, 2009; originally announced June 2009.

Comments: 12 pages, 6 figures

Showing 1–6 of 6 results for author: Markovic, D