-
Quantum Equilibrium Propagation for efficient training of quantum systems based on Onsager reciprocity
Authors:
Clara C. Wanjura,
Florian Marquardt
Abstract:
The widespread adoption of machine learning and artificial intelligence in all branches of science and technology has created a need for energy-efficient, alternative hardware platforms. While such neuromorphic approaches have been proposed and realised for a wide range of platforms, physically extracting the gradients required for training remains challenging as generic approaches only exist in c…
▽ More
The widespread adoption of machine learning and artificial intelligence in all branches of science and technology has created a need for energy-efficient, alternative hardware platforms. While such neuromorphic approaches have been proposed and realised for a wide range of platforms, physically extracting the gradients required for training remains challenging as generic approaches only exist in certain cases. Equilibrium propagation (EP) is such a procedure that has been introduced and applied to classical energy-based models which relax to an equilibrium. Here, we show a direct connection between EP and Onsager reciprocity and exploit this to derive a quantum version of EP. This can be used to optimize loss functions that depend on the expectation values of observables of an arbitrary quantum system. Specifically, we illustrate this new concept with supervised and unsupervised learning examples in which the input or the solvable task is of quantum mechanical nature, e.g., the recognition of quantum many-body ground states, quantum phase exploration, sensing and phase boundary exploration. We propose that in the future quantum EP may be used to solve tasks such as quantum phase discovery with a quantum simulator even for Hamiltonians which are numerically hard to simulate or even partially unknown. Our scheme is relevant for a variety of quantum simulation platforms such as ion chains, superconducting qubit arrays, neutral atom Rydberg tweezer arrays and strongly interacting atoms in optical lattices.
△ Less
Submitted 10 June, 2024;
originally announced June 2024.
-
Training of Physical Neural Networks
Authors:
Ali Momeni,
Babak Rahmani,
Benjamin Scellier,
Logan G. Wright,
Peter L. McMahon,
Clara C. Wanjura,
Yuhang Li,
Anas Skalli,
Natalia G. Berloff,
Tatsuhiro Onodera,
Ilker Oguz,
Francesco Morichetti,
Philipp del Hougne,
Manuel Le Gallo,
Abu Sebastian,
Azalia Mirhoseini,
Cheng Zhang,
Danijela Marković,
Daniel Brunner,
Christophe Moser,
Sylvain Gigan,
Florian Marquardt,
Aydogan Ozcan,
Julie Grollier,
Andrea J. Liu
, et al. (3 additional authors not shown)
Abstract:
Physical neural networks (PNNs) are a class of neural-like networks that leverage the properties of physical systems to perform computation. While PNNs are so far a niche research area with small-scale laboratory demonstrations, they are arguably one of the most underappreciated important opportunities in modern AI. Could we train AI models 1000x larger than current ones? Could we do this and also…
▽ More
Physical neural networks (PNNs) are a class of neural-like networks that leverage the properties of physical systems to perform computation. While PNNs are so far a niche research area with small-scale laboratory demonstrations, they are arguably one of the most underappreciated important opportunities in modern AI. Could we train AI models 1000x larger than current ones? Could we do this and also have them perform inference locally and privately on edge devices, such as smartphones or sensors? Research over the past few years has shown that the answer to all these questions is likely "yes, with enough research": PNNs could one day radically change what is possible and practical for AI systems. To do this will however require rethinking both how AI models work, and how they are trained - primarily by considering the problems through the constraints of the underlying hardware physics. To train PNNs at large scale, many methods including backpropagation-based and backpropagation-free approaches are now being explored. These methods have various trade-offs, and so far no method has been shown to scale to the same scale and performance as the backpropagation algorithm widely used in deep learning today. However, this is rapidly changing, and a diverse ecosystem of training techniques provides clues for how PNNs may one day be utilized to create both more efficient realizations of current-scale AI models, and to enable unprecedented-scale models.
△ Less
Submitted 5 June, 2024;
originally announced June 2024.
-
Tackling Decision Processes with Non-Cumulative Objectives using Reinforcement Learning
Authors:
Maximilian Nägele,
Jan Olle,
Thomas Fösel,
Remmy Zen,
Florian Marquardt
Abstract:
Markov decision processes (MDPs) are used to model a wide variety of applications ranging from game playing over robotics to finance. Their optimal policy typically maximizes the expected sum of rewards given at each step of the decision process. However, a large class of problems does not fit straightforwardly into this framework: Non-cumulative Markov decision processes (NCMDPs), where instead o…
▽ More
Markov decision processes (MDPs) are used to model a wide variety of applications ranging from game playing over robotics to finance. Their optimal policy typically maximizes the expected sum of rewards given at each step of the decision process. However, a large class of problems does not fit straightforwardly into this framework: Non-cumulative Markov decision processes (NCMDPs), where instead of the expected sum of rewards, the expected value of an arbitrary function of the rewards is maximized. Example functions include the maximum of the rewards or their mean divided by their standard deviation. In this work, we introduce a general map** of NCMDPs to standard MDPs. This allows all techniques developed to find optimal policies for MDPs, such as reinforcement learning or dynamic programming, to be directly applied to the larger class of NCMDPs. Focusing on reinforcement learning, we show applications in a diverse set of tasks, including classical control, portfolio optimization in finance, and discrete optimization problems. Given our approach, we can improve both final performance and training time compared to relying on standard MDPs.
△ Less
Submitted 22 May, 2024;
originally announced May 2024.
-
Training Coupled Phase Oscillators as a Neuromorphic Platform using Equilibrium Propagation
Authors:
Qingshan Wang,
Clara C. Wanjura,
Florian Marquardt
Abstract:
Given the rapidly growing scale and resource requirements of machine learning applications, the idea of building more efficient learning machines much closer to the laws of physics is an attractive proposition. One central question for identifying promising candidates for such neuromorphic platforms is whether not only inference but also training can exploit the physical dynamics. In this work, we…
▽ More
Given the rapidly growing scale and resource requirements of machine learning applications, the idea of building more efficient learning machines much closer to the laws of physics is an attractive proposition. One central question for identifying promising candidates for such neuromorphic platforms is whether not only inference but also training can exploit the physical dynamics. In this work, we show that it is possible to successfully train a system of coupled phase oscillators - one of the most widely investigated nonlinear dynamical systems with a multitude of physical implementations, comprising laser arrays, coupled mechanical limit cycles, superfluids, and exciton-polaritons. To this end, we apply the approach of equilibrium propagation, which permits to extract training gradients via a physical realization of backpropagation, based only on local interactions. The complex energy landscape of the XY/ Kuramoto model leads to multistability, and we show how to address this challenge. Our study identifies coupled phase oscillators as a new general-purpose neuromorphic platform and opens the door towards future experimental implementations.
△ Less
Submitted 13 February, 2024;
originally announced February 2024.
-
Optimizing ZX-Diagrams with Deep Reinforcement Learning
Authors:
Maximilian Nägele,
Florian Marquardt
Abstract:
ZX-diagrams are a powerful graphical language for the description of quantum processes with applications in fundamental quantum mechanics, quantum circuit optimization, tensor network simulation, and many more. The utility of ZX-diagrams relies on a set of local transformation rules that can be applied to them without changing the underlying quantum process they describe. These rules can be exploi…
▽ More
ZX-diagrams are a powerful graphical language for the description of quantum processes with applications in fundamental quantum mechanics, quantum circuit optimization, tensor network simulation, and many more. The utility of ZX-diagrams relies on a set of local transformation rules that can be applied to them without changing the underlying quantum process they describe. These rules can be exploited to optimize the structure of ZX-diagrams for a range of applications. However, finding an optimal sequence of transformation rules is generally an open problem. In this work, we bring together ZX-diagrams with reinforcement learning, a machine learning technique designed to discover an optimal sequence of actions in a decision-making problem and show that a trained reinforcement learning agent can significantly outperform other optimization techniques like a greedy strategy or simulated annealing. The use of graph neural networks to encode the policy of the agent enables generalization to diagrams much bigger than seen during the training phase.
△ Less
Submitted 26 April, 2024; v1 submitted 30 November, 2023;
originally announced November 2023.
-
Fully Non-Linear Neuromorphic Computing with Linear Wave Scattering
Authors:
Clara C. Wanjura,
Florian Marquardt
Abstract:
The increasing complexity of neural networks and the energy consumption associated with training and inference create a need for alternative neuromorphic approaches, e.g. using optics. Current proposals and implementations rely on physical non-linearities or opto-electronic conversion to realise the required non-linear activation function. However, there are significant challenges with these appro…
▽ More
The increasing complexity of neural networks and the energy consumption associated with training and inference create a need for alternative neuromorphic approaches, e.g. using optics. Current proposals and implementations rely on physical non-linearities or opto-electronic conversion to realise the required non-linear activation function. However, there are significant challenges with these approaches related to power levels, control, energy-efficiency, and delays. Here, we present a scheme for a neuromorphic system that relies on linear wave scattering and yet achieves non-linear processing with a high expressivity. The key idea is to inject the input via physical parameters that affect the scattering processes. Moreover, we show that gradients needed for training can be directly measured in scattering experiments. We predict classification accuracies on par with results obtained by standard artificial neural networks. Our proposal can be readily implemented with existing state-of-the-art, scalable platforms, e.g. in optics, microwave and electrical circuits, and we propose an integrated-photonics implementation based on racetrack resonators that achieves high connectivity with a minimal number of waveguide crossings.
△ Less
Submitted 30 August, 2023;
originally announced August 2023.
-
Deep Bayesian Experimental Design for Quantum Many-Body Systems
Authors:
Leopoldo Sarra,
Florian Marquardt
Abstract:
Bayesian experimental design is a technique that allows to efficiently select measurements to characterize a physical system by maximizing the expected information gain. Recent developments in deep neural networks and normalizing flows allow for a more efficient approximation of the posterior and thus the extension of this technique to complex high-dimensional situations. In this paper, we show ho…
▽ More
Bayesian experimental design is a technique that allows to efficiently select measurements to characterize a physical system by maximizing the expected information gain. Recent developments in deep neural networks and normalizing flows allow for a more efficient approximation of the posterior and thus the extension of this technique to complex high-dimensional situations. In this paper, we show how this approach holds promise for adaptive measurement strategies to characterize present-day quantum technology platforms. In particular, we focus on arrays of coupled cavities and qubit arrays. Both represent model systems of high relevance for modern applications, like quantum simulations and computing, and both have been realized in platforms where measurement and control can be exploited to characterize and counteract unavoidable disorder. Thus, they represent ideal targets for applications of Bayesian experimental design.
△ Less
Submitted 26 June, 2023;
originally announced June 2023.
-
Investigation of inverse design of multilayer thin-films with conditional invertible Neural Networks
Authors:
Alexander Luce,
Ali Mahdavi,
Heribert Wankerl,
Florian Marquardt
Abstract:
The task of designing optical multilayer thin-films regarding a given target is currently solved using gradient-based optimization in conjunction with methods that can introduce additional thin-film layers. Recently, Deep Learning and Reinforcement Learning have been been introduced to the task of designing thin-films with great success, however a trained network is usually only able to become pro…
▽ More
The task of designing optical multilayer thin-films regarding a given target is currently solved using gradient-based optimization in conjunction with methods that can introduce additional thin-film layers. Recently, Deep Learning and Reinforcement Learning have been been introduced to the task of designing thin-films with great success, however a trained network is usually only able to become proficient for a single target and must be retrained if the optical targets are varied. In this work, we apply conditional Invertible Neural Networks (cINN) to inversely designing multilayer thin-films given an optical target. Since the cINN learns the energy landscape of all thin-film configurations within the training dataset, we show that cINNs can generate a stochastic ensemble of proposals for thin-film configurations that that are reasonably close to the desired target depending only on random variables. By refining the proposed configurations further by a local optimization, we show that the generated thin-films reach the target with significantly greater precision than comparable state-of-the art approaches. Furthermore, we tested the generative capabilities on samples which are outside the training data distribution and found that the cINN was able to predict thin-films for out-of-distribution targets, too. The results suggest that in order to improve the generative design of thin-films, it is instructive to use established and new machine learning methods in conjunction in order to obtain the most favorable results.
△ Less
Submitted 10 October, 2022;
originally announced October 2022.
-
Artificial Intelligence and Machine Learning for Quantum Technologies
Authors:
Mario Krenn,
Jonas Landgraf,
Thomas Foesel,
Florian Marquardt
Abstract:
In recent years, the dramatic progress in machine learning has begun to impact many areas of science and technology significantly. In the present perspective article, we explore how quantum technologies are benefiting from this revolution. We showcase in illustrative examples how scientists in the past few years have started to use machine learning and more broadly methods of artificial intelligen…
▽ More
In recent years, the dramatic progress in machine learning has begun to impact many areas of science and technology significantly. In the present perspective article, we explore how quantum technologies are benefiting from this revolution. We showcase in illustrative examples how scientists in the past few years have started to use machine learning and more broadly methods of artificial intelligence to analyze quantum measurements, estimate the parameters of quantum devices, discover new quantum experimental setups, protocols, and feedback strategies, and generally improve aspects of quantum computing, quantum communication, and quantum simulation. We highlight open challenges and future possibilities and conclude with some speculative visions for the next decade.
△ Less
Submitted 7 August, 2022;
originally announced August 2022.
-
TMM-Fast: A Transfer Matrix Computation Package for Multilayer Thin-Film Optimization
Authors:
Alexander Luce,
Ali Mahdavi,
Florian Marquardt,
Heribert Wankerl
Abstract:
Achieving the desired optical response from a multilayer thin-film structure over a broad range of wavelengths and angles of incidence can be challenging. An advanced thin-film structure can consist of multiple materials with different thicknesses and numerous layers. Design and optimization of complex thin-film structures with multiple variables is a computationally heavy problem that is still un…
▽ More
Achieving the desired optical response from a multilayer thin-film structure over a broad range of wavelengths and angles of incidence can be challenging. An advanced thin-film structure can consist of multiple materials with different thicknesses and numerous layers. Design and optimization of complex thin-film structures with multiple variables is a computationally heavy problem that is still under active research. To enable fast and easy experimentation with new optimization techniques, we propose the Python package TMM-Fast which enables parallelized computation of reflection and transmission of light at different angles of incidence and wavelengths through the multilayer thin-film. By decreasing computational time, generating datasets for machine learning becomes feasible and evolutionary optimization can be used effectively. Additionally, the sub-package TMM-Torch allows to directly compute analytical gradients for local optimization by using PyTorch Autograd functionality. Finally, an OpenAi Gym environment is presented which allows the user to train reinforcement learning agents on the problem of finding multilayer thin-film configurations.
△ Less
Submitted 24 November, 2021;
originally announced November 2021.
-
Self-learning Machines based on Hamiltonian Echo Backpropagation
Authors:
Victor Lopez-Pastor,
Florian Marquardt
Abstract:
A physical self-learning machine can be defined as a nonlinear dynamical system that can be trained on data (similar to artificial neural networks), but where the update of the internal degrees of freedom that serve as learnable parameters happens autonomously. In this way, neither external processing and feedback nor knowledge of (and control of) these internal degrees of freedom is required. We…
▽ More
A physical self-learning machine can be defined as a nonlinear dynamical system that can be trained on data (similar to artificial neural networks), but where the update of the internal degrees of freedom that serve as learnable parameters happens autonomously. In this way, neither external processing and feedback nor knowledge of (and control of) these internal degrees of freedom is required. We introduce a general scheme for self-learning in any time-reversible Hamiltonian system. We illustrate the training of such a self-learning machine numerically for the case of coupled nonlinear wave fields.
△ Less
Submitted 7 February, 2023; v1 submitted 8 March, 2021;
originally announced March 2021.
-
Renormalized Mutual Information for Artificial Scientific Discovery
Authors:
Leopoldo Sarra,
Andrea Aiello,
Florian Marquardt
Abstract:
We derive a well-defined renormalized version of mutual information that allows to estimate the dependence between continuous random variables in the important case when one is deterministically dependent on the other. This is the situation relevant for feature extraction, where the goal is to produce a low-dimensional effective description of a high-dimensional system. Our approach enables the di…
▽ More
We derive a well-defined renormalized version of mutual information that allows to estimate the dependence between continuous random variables in the important case when one is deterministically dependent on the other. This is the situation relevant for feature extraction, where the goal is to produce a low-dimensional effective description of a high-dimensional system. Our approach enables the discovery of collective variables in physical systems, thus adding to the toolbox of artificial scientific discovery, while also aiding the analysis of information flow in artificial neural networks.
△ Less
Submitted 5 March, 2021; v1 submitted 4 May, 2020;
originally announced May 2020.