Search | arXiv e-print repository

Training of Physical Neural Networks

Authors: Ali Momeni, Babak Rahmani, Benjamin Scellier, Logan G. Wright, Peter L. McMahon, Clara C. Wanjura, Yuhang Li, Anas Skalli, Natalia G. Berloff, Tatsuhiro Onodera, Ilker Oguz, Francesco Morichetti, Philipp del Hougne, Manuel Le Gallo, Abu Sebastian, Azalia Mirhoseini, Cheng Zhang, Danijela Marković, Daniel Brunner, Christophe Moser, Sylvain Gigan, Florian Marquardt, Aydogan Ozcan, Julie Grollier, Andrea J. Liu, et al. (3 additional authors not shown)

Abstract: Physical neural networks (PNNs) are a class of neural-like networks that leverage the properties of physical systems to perform computation. While PNNs are so far a niche research area with small-scale laboratory demonstrations, they are arguably one of the most underappreciated important opportunities in modern AI. Could we train AI models 1000x larger than current ones? Could we do this and also have them perform inference locally and privately on edge devices, such as smartphones or sensors? Research over the past few years has shown that the answer to all these questions is likely "yes, with enough research": PNNs could one day radically change what is possible and practical for AI systems. To do this will however require rethinking both how AI models work, and how they are trained - primarily by considering the problems through the constraints of the underlying hardware physics. To train PNNs at large scale, many methods including backpropagation-based and backpropagation-free approaches are now being explored. These methods have various trade-offs, and so far no method has been shown to scale to the same scale and performance as the backpropagation algorithm widely used in deep learning today. However, this is rapidly changing, and a diverse ecosystem of training techniques provides clues for how PNNs may one day be utilized to create both more efficient realizations of current-scale AI models, and to enable unprecedented-scale models.

Submitted 5 June, 2024; originally announced June 2024.

Comments: 29 pages, 4 figures

arXiv:2402.17750 [pdf, other]

Scaling on-chip photonic neural processors using arbitrarily programmable wave propagation

Authors: Tatsuhiro Onodera, Martin M. Stein, Benjamin A. Ash, Mandar M. Sohoni, Melissa Bosch, Ryotatsu Yanagimoto, Marc Jankowski, Timothy P. McKenna, Tianyu Wang, Gennady Shvets, Maxim R. Shcherbakov, Logan G. Wright, Peter L. McMahon

Abstract: On-chip photonic processors for neural networks have potential benefits in both speed and energy efficiency but have not yet reached the scale at which they can outperform electronic processors. The dominant paradigm for designing on-chip photonics is to make networks of relatively bulky discrete components connected by one-dimensional waveguides. A far more compact alternative is to avoid explicitly defining any components and instead sculpt the continuous substrate of the photonic processor to directly perform the computation using waves freely propagating in two dimensions. We propose and demonstrate a device whose refractive index as a function of space, $n(x,z)$, can be rapidly reprogrammed, allowing arbitrary control over the wave propagation in the device. Our device, a 2D-programmable waveguide, combines photoconductive gain with the electro-optic effect to achieve massively parallel modulation of the refractive index of a slab waveguide, with an index modulation depth of $10^{-3}$ and approximately $10^4$ programmable degrees of freedom. We used a prototype device with a functional area of $12\,\text{mm}^2$ to perform neural-network inference with up to 49-dimensional input vectors in a single pass, achieving 96% accuracy on vowel classification and 86% accuracy on $7 \times 7$-pixel MNIST handwritten-digit classification. This is a scale beyond that of previous photonic chips relying on discrete components, illustrating the benefit of the continuous-waves paradigm.
In principle, with large enough chip area, the reprogrammability of the device's refractive index distribution enables the reconfigurable realization of any passive, linear photonic circuit or device. This promises the development of more compact and versatile photonic systems for a wide range of applications, including optical processing, smart sensing, spectroscopy, and optical communications.

Submitted 27 February, 2024; originally announced February 2024.

arXiv:2401.06119 [pdf, other]

Highly multimode visible squeezed light with programmable spectral correlations through broadband up-conversion

Authors: Federico Presutti, Logan G. Wright, Shi-Yuan Ma, Tianyu Wang, Benjamin K. Malia, Tatsuhiro Onodera, Peter L. McMahon

Abstract: Multimode squeezed states of light have been proposed as a resource for achieving quantum advantage in computing and sensing. Recent experiments that demonstrate multimode Gaussian states to this end have most commonly opted for spatial or temporal modes, whereas a complete system based on frequency modes has yet to be realized. Instead, we show how to use the frequency modes simultaneously squeezed in a conventional, single-spatial-mode, optical parametric amplifier when pumped by ultrashort pulses. Specifically, we show how adiabatic frequency conversion can be used not only to convert the quantum state from infrared to visible wavelengths, but to concurrently manipulate the joint spectrum. This near unity-efficiency quantum frequency conversion, over a bandwidth >45 THz and, to our knowledge, the broadest to date, allows us to measure the state with an electron-multiplying CCD (EMCCD) camera-based spectrometer, at non-cryogenic temperatures. We demonstrate the squeezing of >400 frequency modes, with a mean of approximately 700 visible photons per shot. Our work shows how many-mode quantum states of light can be generated, manipulated, and measured with efficient use of hardware resources -- in our case, using one pulsed laser, two nonlinear crystals, and one camera. This ability to produce, with modest hardware resources, large multimode squeezed states with partial programmability motivates the use of frequency encoding for photonics-based quantum information processing.

Submitted 11 January, 2024; originally announced January 2024.

arXiv:2308.00088 [pdf, other]

doi 10.1038/s42254-023-00645-5

The physics of optical computing

Authors: Peter L. McMahon

Abstract: There has been a resurgence of interest in optical computing over the past decade, both in academia and in industry, with much of the excitement centered around special-purpose optical computers for neural-network processing. Optical computing has been a topic of periodic study for over 50 years, including for neural networks three decades ago, and a wide variety of optical-computing schemes and architectures have been proposed. In this paper we provide a systematic explanation of why and how optics might be able to give speed or energy-efficiency benefits over electronics for computing, enumerating 11 features of optics that can be harnessed when designing an optical computer. One often-mentioned motivation for optical computing -- that the speed of light $c$ is fast -- is not a key differentiating physical property of optics for computing; understanding where an advantage could come from is more subtle. We discuss how gaining an advantage over state-of-the-art electronic processors will likely only be achievable by careful design that harnesses more than one of the 11 features, while avoiding a number of pitfalls that we describe.

Submitted 31 July, 2023; originally announced August 2023.

Comments: 31 pages; 11 figures

Journal ref: Nature Reviews Physics (2023)

arXiv:2307.15712 [pdf, other]

Quantum-noise-limited optical neural networks operating at a few quanta per activation

Authors: Shi-Yuan Ma, Tianyu Wang, Jérémie Laydevant, Logan G. Wright, Peter L. McMahon

Abstract: Analog physical neural networks, which hold promise for improved energy efficiency and speed compared to digital electronic neural networks, are nevertheless typically operated in a relatively high-power regime so that the signal-to-noise ratio (SNR) is large (>10). What happens if an analog system is instead operated in an ultra-low-power regime, in which the behavior of the system becomes highly stochastic and the noise is no longer a small perturbation on the signal? In this paper, we study this question in the setting of optical neural networks operated in the limit where some layers use only a single photon to cause a neuron activation. Neuron activations in this limit are dominated by quantum noise from the fundamentally probabilistic nature of single-photon detection of weak optical signals. We show that it is possible to train stochastic optical neural networks to perform deterministic image-classification tasks with high accuracy in spite of the extremely high noise (SNR ~ 1) by using a training procedure that directly models the stochastic behavior of photodetection. We experimentally demonstrated MNIST classification with a test accuracy of 98% using an optical neural network with a hidden layer operating in the single-photon regime; the optical energy used to perform the classification corresponds to 0.008 photons per multiply-accumulate (MAC) operation, which is equivalent to 0.003 attojoules of optical energy per MAC.
Our experiment used >40x fewer photons per inference than previous state-of-the-art low-optical-energy demonstrations, to achieve the same accuracy of >90%. Our work shows that some extremely stochastic analog systems, including those operating in the limit where quantum noise dominates, can nevertheless be used as layers in neural networks that deterministically perform classification tasks with high accuracy if they are appropriately trained.

Submitted 28 July, 2023; originally announced July 2023.

Comments: 55 pages, 27 figures

arXiv:2302.10360 [pdf, other]

Optical Transformers

Authors: Maxwell G. Anderson, Shi-Yuan Ma, Tianyu Wang, Logan G. Wright, Peter L. McMahon

Abstract: The rapidly increasing size of deep-learning models has caused renewed and growing interest in alternatives to digital computers to dramatically reduce the energy cost of running state-of-the-art neural networks. Optical matrix-vector multipliers are best suited to performing computations with very large operands, which suggests that large Transformer models could be a good target for optical computing. To test this idea, we performed small-scale optical experiments with a prototype accelerator to demonstrate that Transformer operations can run on optical hardware despite noise and errors. Using simulations, validated by our experiments, we then explored the energy efficiency of optical implementations of Transformers and identified scaling laws for model performance with respect to optical energy usage. We found that the optical energy per multiply-accumulate (MAC) scales as $\frac{1}{d}$ where $d$ is the Transformer width, an asymptotic advantage over digital systems. We conclude that with well-engineered, large-scale optical hardware, it may be possible to achieve a $100 \times$ energy-efficiency advantage for running some of the largest current Transformer models, and that if both the models and the optical hardware are scaled to the quadrillion-parameter regime, optical computers could have a $>8,000\times$ energy-efficiency advantage over state-of-the-art digital-electronic processors that achieve 300 fJ/MAC.
We analyzed how these results motivate and inform the construction of future optical accelerators along with optics-amenable deep-learning approaches. With assumptions about future improvements to electronics and Transformer quantization techniques (5$\times$ cheaper memory access, double the digital--analog conversion efficiency, and 4-bit precision), we estimated that optical computers' advantage against current 300-fJ/MAC digital processors could grow to $>100,000\times$.
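
As a rough reading aid for the figures quoted in this abstract, the sketch below converts each stated advantage factor into the energy per MAC it would imply, using the 300 fJ/MAC digital baseline from the abstract. The function name is hypothetical, and the calculation assumes the advantage factor is a plain ratio of energies per MAC, which the abstract does not spell out.

```python
# Digital baseline quoted in the abstract (fJ per multiply-accumulate).
DIGITAL_FJ_PER_MAC = 300.0


def implied_optical_fj_per_mac(advantage: float,
                               baseline_fj: float = DIGITAL_FJ_PER_MAC) -> float:
    """Energy per MAC implied by an advantage factor.

    Assumption (not stated in the abstract): the advantage is simply
    baseline energy divided by optical-system energy.
    """
    return baseline_fj / advantage


# Advantage factors quoted in the abstract: 100x, >8,000x, >100,000x.
for factor in (100, 8_000, 100_000):
    energy = implied_optical_fj_per_mac(factor)
    print(f"{factor:>7}x advantage -> at most {energy:.4f} fJ/MAC")
```

Because the two larger factors are quoted as lower bounds, the energies printed for them are upper bounds under this assumption.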

Submitted 20 February, 2023; originally announced February 2023.

Comments: 27 pages, 13 figures

Journal ref: Transactions on Machine Learning Research, 03/2024, https://openreview.net/forum?id=Xxw0edFFQC

arXiv:2301.06727 [pdf]

doi 10.1088/2399-1984/ad299a

Roadmap for Unconventional Computing with Nanotechnology

Authors: Giovanni Finocchio, Jean Anne C. Incorvia, Joseph S. Friedman, Qu Yang, Anna Giordano, Julie Grollier, Hyunsoo Yang, Florin Ciubotaru, Andrii Chumak, Azad J. Naeemi, Sorin D. Cotofana, Riccardo Tomasello, Christos Panagopoulos, Mario Carpentieri, Peng Lin, Gang Pan, J. Joshua Yang, Aida Todri-Sanial, Gabriele Boschetto, Kremena Makasheva, Vinod K. Sangwan, Amit Ranjan Trivedi, Mark C. Hersam, Kerem Y. Camsari, Peter L. McMahon, et al. (26 additional authors not shown)

Abstract: In the "Beyond Moore's Law" era, with increasing edge intelligence, domain-specific computing embracing unconventional approaches will become increasingly prevalent. At the same time, adopting a variety of nanotechnologies will offer benefits in energy cost, computational speed, reduced footprint, cyber resilience, and processing power. The time is ripe for a roadmap for unconventional computing with nanotechnologies to guide future research, and this collection aims to fill that need. The authors provide a comprehensive roadmap for neuromorphic computing using electron spins, memristive devices, two-dimensional nanomaterials, nanomagnets, and various dynamical systems. They also address other paradigms such as Ising machines, Bayesian inference engines, probabilistic computing with p-bits, processing in memory, quantum memories and algorithms, computing with skyrmions and spin waves, and brain-inspired computing for incremental learning and problem-solving in severely resource-constrained environments. These approaches have advantages over traditional Boolean computing based on von Neumann architecture. As the computational requirements for artificial intelligence grow 50 times faster than Moore's Law for electronics, more unconventional approaches to computing and signal processing will appear on the horizon, and this roadmap will help identify future needs and challenges.
In a very fertile field, experts in the field aim to present some of the dominant and most promising technologies for unconventional computing that will be around for some time to come. Within a holistic approach, the goal is to provide pathways for solidifying the field and guiding future impactful discoveries.

Submitted 27 February, 2024; v1 submitted 17 January, 2023; originally announced January 2023.

Comments: 80 pages accepted in Nano Futures

Journal ref: Nano Futures (2024)

arXiv:2208.05088 [pdf, other]

doi 10.1038/s41567-023-02075-7

Programmable large-scale simulation of bosonic transport in optical synthetic frequency lattices

Authors: Alen Senanian, Logan G. Wright, Peter F. Wade, Hannah K. Doyle, Peter L. McMahon

Abstract: Photonic simulators using synthetic frequency dimensions have enabled flexible experimental analogues of condensed-matter systems, realizing phenomena that are impractical to observe in real-space systems. However, to date such photonic simulators have been limited to small systems suffering from finite-size effects. Here, we present an analog simulator capable of simulating large 2D and 3D lattices, as well as lattices with non-planar connectivity, including a tree lattice that serves as a toy model in quantum gravity. Our demonstration is enabled by the broad bandwidth achievable in photonics, allowing our simulator to realize lattices with over 100,000 lattice sites. We explore these large lattices in a wide range of previously inaccessible regimes by using a novel method to excite arbitrary states. Our work establishes the scalability and flexibility of programmable simulators based on synthetic frequency dimensions in the optical domain. We anticipate that future extensions of this platform will leverage advances in high-bandwidth optoelectronics to support simulations of dynamic, non-equilibrium phases at the scale of millions of lattice sites, and Kerr-frequency-comb technology to simulate models with higher-order interactions, ultimately in regimes and at scales inaccessible to both digital computers and realizable materials.

Submitted 9 August, 2022; originally announced August 2022.

arXiv:2207.14293 [pdf, other]

doi 10.1038/s41566-023-01170-8

Image sensing with multilayer, nonlinear optical neural networks

Authors: Tianyu Wang, Mandar M. Sohoni, Logan G. Wright, Martin M. Stein, Shi-Yuan Ma, Tatsuhiro Onodera, Maxwell G. Anderson, Peter L. McMahon

Abstract: Optical imaging is commonly used for both scientific and technological applications across industry and academia. In image sensing, a measurement, such as of an object's position, is performed by computational analysis of a digitized image. An emerging image-sensing paradigm breaks this delineation between data collection and analysis by designing optical components to perform not imaging, but encoding. By optically encoding images into a compressed, low-dimensional latent space suitable for efficient post-analysis, these image sensors can operate with fewer pixels and fewer photons, allowing higher-throughput, lower-latency operation. Optical neural networks (ONNs) offer a platform for processing data in the analog, optical domain. ONN-based sensors have however been limited to linear processing, but nonlinearity is a prerequisite for depth, and multilayer NNs significantly outperform shallow NNs on many tasks. Here, we realize a multilayer ONN pre-processor for image sensing, using a commercial image intensifier as a parallel optoelectronic, optical-to-optical nonlinear activation function. We demonstrate that the nonlinear ONN pre-processor can achieve compression ratios of up to 800:1 while still enabling high accuracy across several representative computer-vision tasks, including machine-vision benchmarks, flow-cytometry image classification, and identification of objects in real scenes. In all cases we find that the ONN's nonlinearity and depth allowed it to outperform a purely linear ONN encoder.
Although our experiments are specialized to ONN sensors for incoherent-light images, alternative ONN platforms should facilitate a range of ONN sensors. These ONN sensors may surpass conventional sensors by pre-processing optical information in spatial, temporal, and/or spectral dimensions, potentially with coherent and quantum qualities, all natively in the optical domain.

Submitted 27 July, 2022; originally announced July 2022.

Journal ref: Nat. Photon. 18, 1-8 (2023)

arXiv:2204.00276 [pdf, ps, other]

Ising machines as hardware solvers of combinatorial optimization problems

Authors: Naeimeh Mohseni, Peter L. McMahon, Tim Byrnes

Abstract: Ising machines are hardware solvers which aim to find the absolute or approximate ground states of the Ising model. The Ising model is of fundamental computational interest because it is possible to formulate any problem in the complexity class NP as an Ising problem with only polynomial overhead. A scalable Ising machine that outperforms existing standard digital computers could have a huge impact for practical applications for a wide variety of optimization problems. In this review, we survey the current status of various approaches to constructing Ising machines and explain their underlying operational principles. The types of Ising machines considered here include classical thermal annealers based on technologies such as spintronics, optics, memristors, and digital hardware accelerators; dynamical-systems solvers implemented with optics and electronics; and superconducting-circuit quantum annealers. We compare and contrast their performance using standard metrics such as the ground-state success probability and time-to-solution, give their scaling relations with problem size, and discuss their strengths and weaknesses.

Submitted 1 April, 2022; originally announced April 2022.

Comments: To appear in Nature Reviews Physics. This is a preprint version and does not include changes made during the editing and production process at the journal

arXiv:2111.13799 [pdf, other]

doi 10.1364/OPTICA.447782

Onset of non-Gaussian quantum physics in pulsed squeezing with mesoscopic fields

Authors: Ryotatsu Yanagimoto, Edwin Ng, Atsushi Yamamura, Tatsuhiro Onodera, Logan G. Wright, Marc Jankowski, M. M. Fejer, Peter L. McMahon, Hideo Mabuchi

Abstract: We study the emergence of non-Gaussian quantum features in pulsed squeezed light generation with a mesoscopic number (i.e., dozens to hundreds) of pump photons. Due to the strong optical nonlinearities necessarily involved in this regime, squeezing occurs alongside significant pump depletion, compromising the predictions made by conventional semiclassical models for squeezing. Furthermore, nonlinear interactions among multiple frequency modes render the system dynamics exponentially intractable in naïve quantum models, requiring a more sophisticated modeling framework. To this end, we construct a nonlinear Gaussian approximation to the squeezing dynamics, defining a "Gaussian interaction frame" (GIF) in which non-Gaussian quantum dynamics can be isolated and concisely described using a few dominant (i.e., principal) supermodes. Numerical simulations of our model reveal non-Gaussian distortions of squeezing in the mesoscopic regime, largely associated with signal-pump entanglement. We argue that the state of the art in nonlinear nanophotonics is quickly approaching this regime, providing an all-optical platform for experimental studies of the semiclassical-to-quantum transition in a rich paradigm of coherent, multimode nonlinear dynamics. Mesoscopic pulsed squeezing thus provides an intriguing case study of the rapid rise in dynamic complexity associated with semiclassical-to-quantum crossover, which we view as a correlate of the emergence of new information-processing capacities in the quantum regime.

Submitted 26 November, 2021; originally announced November 2021.

Comments: The first two authors contributed equally to this work; 16 pages, 7 figures

Journal ref: Optica 9, 379 (2022)

arXiv:2104.13467 [pdf, other]

doi 10.1038/s41467-021-27774-8

An optical neural network using less than 1 photon per multiplication

Authors: Tianyu Wang, Shi-Yuan Ma, Logan G. Wright, Tatsuhiro Onodera, Brian Richard, Peter L. McMahon

Abstract: Deep learning has rapidly become a widespread tool in both scientific and commercial endeavors. Milestones of deep learning exceeding human performance have been achieved for a growing number of tasks over the past several years, across areas as diverse as game-playing, natural-language translation, and medical-image analysis. However, continued progress is increasingly hampered by the high energy costs associated with training and running deep neural networks on electronic processors. Optical neural networks have attracted attention as an alternative physical platform for deep learning, as it has been theoretically predicted that they can fundamentally achieve higher energy efficiency than neural networks deployed on conventional digital computers. Here, we experimentally demonstrate an optical neural network achieving 99% accuracy on handwritten-digit classification using ~3.2 detected photons per weight multiplication and ~90% accuracy using ~0.64 photons (~$2.4 \times 10^{-19}$ J of optical energy) per weight multiplication. This performance was achieved using a custom free-space optical processor that executes matrix-vector multiplications in a massively parallel fashion, with up to ~0.5 million scalar (weight) multiplications performed at the same time. Using commercially available optical components and standard neural-network training methods, we demonstrated that optical neural networks can operate near the standard quantum limit with extremely low optical powers and still achieve high accuracy.
Our results provide a proof-of-principle for low-optical-power operation, and with careful system design including the surrounding electronics used for data storage and control, open up a path to realizing optical processors that require only $10^{-16}$ J total energy per scalar multiplication -- which is orders of magnitude more efficient than current digital processors.
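
The photon counts and the optical energy quoted in this abstract can be related through the photon energy $E = hc/\lambda$. The sketch below is a back-of-envelope check only: it derives the per-photon energy implied by the two rounded figures in the abstract (~0.64 photons and ~$2.4 \times 10^{-19}$ J) and the wavelength that energy would correspond to. The abstract does not state a wavelength, so the result is an inference from rounded numbers, and all variable names are illustrative.

```python
# Exact SI values of the physical constants.
PLANCK_J_S = 6.62607015e-34
LIGHT_SPEED_M_S = 299_792_458.0

# Rounded figures quoted in the abstract.
photons_per_mult = 0.64
energy_per_mult_j = 2.4e-19

# Per-photon energy implied by those two figures.
energy_per_photon_j = energy_per_mult_j / photons_per_mult

# Wavelength implied by E = h * c / wavelength (an inference, not a stated value).
implied_wavelength_nm = PLANCK_J_S * LIGHT_SPEED_M_S / energy_per_photon_j * 1e9

# The same per-photon energy applied to the ~3.2-photon operating point.
energy_at_3p2_photons_j = 3.2 * energy_per_photon_j

print(f"energy per photon: {energy_per_photon_j:.3e} J")
print(f"implied wavelength: {implied_wavelength_nm:.0f} nm")
print(f"energy at 3.2 photons per multiplication: {energy_at_3p2_photons_j:.2e} J")
```

Under these assumptions the implied wavelength falls in the visible range, and the 3.2-photon operating point corresponds to roughly five times the optical energy of the 0.64-photon one.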

Submitted 27 April, 2021; originally announced April 2021.

Comments: 42 pages, 21 figures

Journal ref: Nature Communications 13, 123 (2022)

arXiv:2104.13386 [pdf, other]

doi 10.1038/s41586-021-04223-6

Deep physical neural networks enabled by a backpropagation algorithm for arbitrary physical systems

Authors: Logan G. Wright, Tatsuhiro Onodera, Martin M. Stein, Tianyu Wang, Darren T. Schachter, Zoey Hu, Peter L. McMahon

Abstract: Deep neural networks have become a pervasive tool in science and engineering. However, modern deep neural networks' growing energy requirements now increasingly limit their scaling and broader use. We propose a radical alternative for implementing deep neural network models: Physical Neural Networks. We introduce a hybrid physical-digital algorithm called Physics-Aware Training to efficiently train sequences of controllable physical systems to act as deep neural networks. This method automatically trains the functionality of any sequence of real physical systems, directly, using backpropagation, the same technique used for modern deep neural networks. To illustrate their generality, we demonstrate physical neural networks with three diverse physical systems: optical, mechanical, and electrical. Physical neural networks may facilitate unconventional machine learning hardware that is orders of magnitude faster and more energy efficient than conventional electronic processors.

Submitted 27 April, 2021; originally announced April 2021.

Journal ref: Nature 601, 549-555 (2022)

arXiv:2103.05629 [pdf, other]

doi 10.1103/PhysRevResearch.4.013009

Efficient sampling of ground and low-energy Ising spin configurations with a coherent Ising machine

Authors: Edwin Ng, Tatsuhiro Onodera, Satoshi Kako, Peter L. McMahon, Hideo Mabuchi, Yoshihisa Yamamoto

Abstract: We show that the nonlinear stochastic dynamics of a measurement-feedback-based coherent Ising machine (MFB-CIM) in the presence of quantum noise can be exploited to sample degenerate ground and low-energy spin configurations of the Ising model. We formulate a general discrete-time Gaussian-state model of the MFB-CIM which faithfully captures the nonlinear dynamics present at and above system threshold. This model overcomes the limitations of both mean-field models, which neglect quantum noise, and continuous-time models, which assume long photon lifetimes. Numerical simulations of our model show that when the MFB-CIM is operated in a quantum-noise-dominated regime with short photon lifetimes (i.e., low cavity finesse), homodyne monitoring of the system can efficiently produce samples of low-energy Ising spin configurations, requiring many fewer roundtrips to sample than suggested by established high-finesse, continuous-time models. We find that sampling performance is robust to, or even improved by, turning off or altogether reversing the sign of the parametric drive, but performance is critically reduced in the absence of optical nonlinearity. For the class of MAX-CUT problems with binary-signed edge weights, the number of roundtrips sufficient to fully sample all spin configurations up to the first-excited Ising energy, including all degeneracies, scales as $1.08^N$.
At a problem size of $N = 100$ with a few dozen (median of 20) such desired configurations per instance, we have found median sufficient sampling times of $6\times10^6$ roundtrips; in an experimental implementation of an MFB-CIM with a 10 GHz repetition rate, this corresponds to a wall-clock sampling time of 60 ms.

Submitted 27 January, 2022; v1 submitted 9 March, 2021; originally announced March 2021.

Comments: The first two authors contributed equally to this work. 24 pages, 9 figures

Journal ref: Phys. Rev. Res. 4, 013009 (2022)

arXiv:1912.11408

doi 10.1103/PhysRevLett.124.240503

Engineering a Kerr-based Deterministic Cubic Phase Gate via Gaussian Operations

Authors: Ryotatsu Yanagimoto, Tatsuhiro Onodera, Edwin Ng, Logan G. Wright, Peter L. McMahon, Hideo Mabuchi

Abstract: We propose a deterministic, measurement-free implementation of a cubic phase gate for continuous-variable quantum information processing. In our scheme, the applications of displacement and squeezing operations allow us to engineer the effective evolution of the quantum state propagating through an optical Kerr nonlinearity. Under appropriate conditions, we show that the input state evolves according to a cubic phase Hamiltonian, and we find that the cubic phase gate error decreases inverse-quartically with the amount of quadrature squeezing, even in the presence of linear loss. We also show how our scheme can be adapted to deterministically generate a nonclassical approximate cubic phase state with high fidelity using a ratio of native nonlinearity to linear loss of only $10^{-4}$, indicating that our approach may be experimentally viable in the near term even on all-optical platforms, e.g., using quantum solitons in pulsed nonlinear nanophotonics.

Submitted 24 December, 2019; originally announced December 2019.

Comments: 10 pages, 7 figures. The first two authors contributed equally to this work

Journal ref: Phys. Rev. Lett. 124, 240503 (2020)

arXiv:1908.01364

The Capacity of Quantum Neural Networks

Authors: Logan G. Wright, Peter L. McMahon

Abstract: A key open question in quantum computation is what advantages quantum neural networks (QNNs) may have over classical neural networks (NNs), and in what situations these advantages may transpire. Here we address this question by studying the memory capacity $C$ of QNNs, which is a metric of the expressive power of a QNN that we have adapted from classical NN theory. We present a capacity inequality showing that the capacity of a QNN is bounded by the information $W$ that can be trained into its parameters: $C \leq W$. One consequence of this bound is that QNNs that are parameterized classically do not show an advantage in capacity over classical NNs having an equal number of parameters. However, QNNs that are parametrized with quantum states could have exponentially larger capacities. We illustrate our theoretical results with numerical experiments by simulating a particular QNN based on a Gaussian Boson Sampler. We also study the influence of sampling due to wavefunction collapse during operation of the QNN, and provide an analytical expression connecting the capacity to the number of times the quantum system is measured.

Submitted 4 August, 2019; originally announced August 2019.

arXiv:1811.10583

doi 10.1103/PhysRevA.105.033508

Nonlinear Quantum Behavior of Ultrashort-Pulse Optical Parametric Oscillators

Authors: Tatsuhiro Onodera, Edwin Ng, Chris Gustin, Niels Lörch, Atsushi Yamamura, Ryan Hamerly, Peter L. McMahon, Alireza Marandi, Hideo Mabuchi

Abstract: The quantum features of ultrashort-pulse optical parametric oscillators (OPOs) are investigated theoretically in the nonlinear regime near and above threshold. Viewing the pulsed OPO as a multimode open quantum system, we rigorously derive a general input-output model that features nonlinear coupling among many cavity (i.e., system) signal modes and a broadband single-pass (i.e., reservoir) pump field. Under appropriate assumptions, our model produces a Lindblad master equation with multimode nonlinear Lindblad operators describing two-photon dissipation and a multimode four-wave-mixing Hamiltonian describing a broadband, dispersive optical cascade, which we show is required to preserve causality. To simplify the multimode complexity of the model, we employ a supermode decomposition to perform numerical simulations in the regime where the pulsed supermodes experience strong single-photon nonlinearity. We find that the quantum nonlinear dynamics induces pump depletion as well as corrections to the below-threshold squeezing spectrum predicted by linearized models. We also observe the formation of non-Gaussian states with Wigner-function negativity and show that the multimode interactions with the pump, both dissipative and dispersive, can act as effective decoherence channels. Finally, we briefly discuss some experimental considerations for potentially observing such quantum nonlinear phenomena with ultrashort-pulse OPOs on nonlinear nanophotonic platforms.

Submitted 18 April, 2022; v1 submitted 26 November, 2018; originally announced November 2018.

Comments: The first three authors contributed equally to this work. 26 pages, 9 figures

Journal ref: Phys. Rev. A 105, 033508 (2022)

arXiv:1805.05217

doi 10.1126/sciadv.aau0823

Experimental investigation of performance differences between Coherent Ising Machines and a quantum annealer

Authors: Ryan Hamerly, Takahiro Inagaki, Peter L. McMahon, Davide Venturelli, Alireza Marandi, Tatsuhiro Onodera, Edwin Ng, Carsten Langrock, Kensuke Inaba, Toshimori Honjo, Koji Enbutsu, Takeshi Umeki, Ryoichi Kasahara, Shoko Utsunomiya, Satoshi Kako, Ken-ichi Kawarabayashi, Robert L. Byer, Martin M. Fejer, Hideo Mabuchi, Dirk Englund, Eleanor Rieffel, Hiroki Takesue, Yoshihisa Yamamoto

Abstract: Physical annealing systems provide heuristic approaches to solving NP-hard Ising optimization problems. Here, we study the performance of two types of annealing machines--a commercially available quantum annealer built by D-Wave Systems, and measurement-feedback coherent Ising machines (CIMs) based on optical parametric oscillator networks--on two classes of problems, the Sherrington-Kirkpatrick (SK) model and MAX-CUT. The D-Wave quantum annealer outperforms the CIMs on MAX-CUT on regular graphs of degree 3. On denser problems, however, we observe an exponential penalty for the quantum annealer ($\exp(-\alpha_\textrm{DW} N^2)$) relative to CIMs ($\exp(-\alpha_\textrm{CIM} N)$) for fixed anneal times, on both the SK model and on 50%-edge-density MAX-CUT, where the coefficients $\alpha_\textrm{CIM}$ and $\alpha_\textrm{DW}$ are problem-class-dependent. On instances with over $50$ vertices, a several-orders-of-magnitude time-to-solution difference exists between CIMs and the D-Wave annealer. An optimal-annealing-time analysis is also consistent with a significant projected performance difference. The difference in performance between the sparsely connected D-Wave machine and the measurement-feedback facilitated all-to-all connectivity of the CIMs provides strong experimental support for efforts to increase the connectivity of quantum annealers.

Submitted 24 May, 2019; v1 submitted 14 May, 2018; originally announced May 2018.

Comments: 12 pages, 5 figures, 1 table (main text); 14 pages, 12 figures, 2 tables (supplementary)

Journal ref: Sci. Adv. 5:eaau0823 (2019)

arXiv:1501.03535

doi 10.1007/978-3-319-19231-4_14

Towards Quantum Repeaters with Solid-State Qubits: Spin-Photon Entanglement Generation using Self-Assembled Quantum Dots

Authors: Peter L. McMahon, Kristiaan De Greve

Abstract: In this chapter we review the use of spins in optically-active InAs quantum dots as the key physical building block for constructing a quantum repeater, with a particular focus on recent results demonstrating entanglement between a quantum memory (electron spin qubit) and a flying qubit (polarization- or frequency-encoded photonic qubit). This is a first step towards demonstrating entanglement between distant quantum memories (realized with quantum dots), which in turn is a milestone in the roadmap for building a functional quantum repeater. We also place this experimental work in context by providing an overview of quantum repeaters, their potential uses, and the challenges in implementing them.

Submitted 14 January, 2015; originally announced January 2015.

Comments: 51 pages. Expanded version of a chapter to appear in "Engineering the Atom-Photon Interaction" (Springer-Verlag, 2015; eds. A. Predojevic and M. W. Mitchell)

arXiv:1209.6404

doi 10.1364/OE.20.027510

Downconversion quantum interface for a single quantum dot spin and 1550-nm single-photon channel

Authors: Jason S. Pelc, Leo Yu, Kristiaan De Greve, Peter L. McMahon, Chandra M. Natarajan, Vahid Esfandyarpour, Sebastian Maier, Christian Schneider, Martin Kamp, Sven Höfling, Robert H. Hadfield, Alfred Forchel, Yoshihisa Yamamoto, M. M. Fejer

Abstract: Long-distance quantum communication networks require appropriate interfaces between matter qubit-based nodes and low-loss photonic quantum channels. We implement a downconversion quantum interface, where the single photons emitted from a semiconductor quantum dot at 910 nm are downconverted to 1560 nm using a fiber-coupled periodically poled lithium niobate waveguide and a 2.2-$\mu$m pulsed pump laser. The single-photon character of the quantum dot emission is preserved during the downconversion process: we measure a cross-correlation $g^{(2)}(\tau = 0) = 0.17$ using resonant excitation of the quantum dot. We show that the downconversion interface is fully compatible with coherent optical control of the quantum dot electron spin through the observation of Rabi oscillations in the downconverted photon counts. These results represent a critical step towards a long-distance hybrid quantum network in which subsystems operating at different wavelengths are connected through quantum frequency conversion devices and 1.5-$\mu$m quantum channels.

Submitted 2 May, 2013; v1 submitted 27 September, 2012; originally announced September 2012.

Journal ref: Optics Express, Vol. 20, Issue 25, pp. 27510-27519 (2012)

Showing 1–20 of 20 results for author: McMahon, P L