-
Bayesian Inference with Deep Weakly Nonlinear Networks
Authors:
Boris Hanin,
Alexander Zlokapa
Abstract:
We show at a physics level of rigor that Bayesian inference with a fully connected neural network and a shaped nonlinearity of the form $φ(t) = t + ψt^3/L$ is (perturbatively) solvable in the regime where the number of training datapoints $P$, the input dimension $N_0$, the network layer widths $N$, and the network depth $L$ are simultaneously large. Our results hold with weak assumptions on the data; the main constraint is that $P < N_0$. We provide techniques to compute the model evidence and posterior to arbitrary order in $1/N$ and at arbitrary temperature. We report the following results from the first-order computation:
1. When the width $N$ is much larger than the depth $L$ and training set size $P$, neural network Bayesian inference coincides with Bayesian inference using a kernel. The value of $ψ$ determines the curvature of a sphere, hyperbola, or plane into which the training data is implicitly embedded under the feature map.
2. When $LP/N$ is a small constant, neural network Bayesian inference departs from the kernel regime. At zero temperature, neural network Bayesian inference is equivalent to Bayesian inference using a data-dependent kernel, and $LP/N$ serves as an effective depth that controls the extent of feature learning.
3. In the restricted case of deep linear networks ($ψ=0$) and noisy data, we show a simple data model for which evidence and generalization error are optimal at zero temperature. As $LP/N$ increases, both evidence and generalization further improve, demonstrating the benefit of depth in benign overfitting.
Submitted 26 May, 2024;
originally announced May 2024.
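The shaped nonlinearity in the abstract above, $φ(t) = t + ψt^3/L$, can be written down directly. The following is a minimal sketch for illustration only: the function name `shaped_phi` and the sample values are assumptions, not taken from the paper. It shows that the cubic correction added by each layer shrinks as the depth $L$ grows, which is the sense in which the network is weakly nonlinear.

```python
def shaped_phi(t: float, psi: float, depth: int) -> float:
    """Shaped nonlinearity phi(t) = t + psi * t**3 / depth (hypothetical helper)."""
    if depth <= 0:
        raise ValueError("depth must be positive")
    return t + psi * t**3 / depth


# psi = 0 recovers the deep linear network case mentioned in the abstract.
linear = shaped_phi(2.0, psi=0.0, depth=10)

# The per-layer deviation from the identity scales as 1/depth.
dev_shallow = shaped_phi(2.0, psi=1.0, depth=10) - 2.0
dev_deep = shaped_phi(2.0, psi=1.0, depth=100) - 2.0
```

With these sample inputs, increasing the depth tenfold reduces the per-layer deviation from the identity map by the same factor.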
-
Long-range wormhole teleportation
Authors:
Joseph D. Lykken,
Daniel Jafferis,
Alexander Zlokapa,
David K. Kolchmeyer,
Samantha I. Davis,
Hartmut Neven,
Maria Spiropulu
Abstract:
We extend the protocol of Gao and Jafferis arXiv:1911.07416 to allow wormhole teleportation between two entangled copies of the Sachdev-Ye-Kitaev (SYK) model communicating only through a classical channel. We demonstrate in finite $N$ simulations that the protocol exhibits the characteristic holographic features of wormhole teleportation discussed and summarized in Jafferis et al. https://www.nature.com/articles/s41586-022-05424-3 . We review and exhibit in detail how these holographic features relate to size winding which, as first shown by Brown et al. arXiv:1911.06314 and Nezami et al. arXiv:2102.01064, encodes a dual description of wormhole teleportation.
Submitted 13 May, 2024;
originally announced May 2024.
-
Hamiltonian simulation for low-energy states with optimal time dependence
Authors:
Alexander Zlokapa,
Rolando D. Somma
Abstract:
We consider the task of simulating time evolution under a Hamiltonian $H$ within its low-energy subspace. Assuming access to a block-encoding of $H'=(H-E)/λ$ for some $E \in \mathbb R$, the goal is to implement an $ε$-approximation to $e^{-itH}$ when the initial state is confined to the subspace corresponding to eigenvalues $[-1, -1+Δ/λ]$ of $H'$. We present a quantum algorithm that uses $O(t\sqrt{λΓ} + \sqrt{λ/Γ}\log(1/ε))$ queries to the block-encoding for any $Γ$ such that $Δ\leq Γ\leq λ$. When $\log(1/ε) = o(tλ)$ and $Δ/λ= o(1)$, this result improves over generic methods with query complexity $Ω(tλ)$. Our quantum algorithm leverages spectral gap amplification and the quantum singular value transform. Using standard access models for $H$, we show that the ability to efficiently block-encode $H'$ is equivalent to $H$ being what we refer to as a "gap-amplifiable" Hamiltonian. This includes physically relevant examples such as frustration-free systems, and it encompasses all previously considered settings of low-energy simulation algorithms. We also provide lower bounds for low-energy simulation. In the worst case, we show that the low-energy condition cannot be used to improve the runtime of Hamiltonian simulation. For gap-amplifiable Hamiltonians, we prove that our algorithm is tight in the query model with respect to $t$, $Δ$, and $λ$. In the practically relevant regime where $\log (1/ε) = o(tΔ)$ and $Δ/λ= o(1)$, we also prove a matching lower bound in gate complexity (up to log factors). To establish the query lower bounds, we consider $\mathrm{PARITY}\circ\mathrm{OR}$ and degree bounds on trigonometric polynomials. To establish the lower bound on gate complexity, we use a circuit-to-Hamiltonian reduction acting on a low-energy state.
Submitted 4 April, 2024;
originally announced April 2024.
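To get a feel for the scaling quoted in the abstract above, the sketch below evaluates the expression $t\sqrt{λΓ} + \sqrt{λ/Γ}\log(1/ε)$ next to the generic $tλ$ scaling. It is only an illustration: constant factors hidden by the $O(\cdot)$ and $Ω(\cdot)$ notation are ignored, the function names are hypothetical, and the sample parameter values are assumptions rather than numbers from the paper.

```python
import math


def low_energy_queries(t: float, lam: float, delta: float,
                       gamma: float, eps: float) -> float:
    """Evaluate t*sqrt(lam*gamma) + sqrt(lam/gamma)*log(1/eps), constants dropped."""
    # The abstract states the bound for any gamma with delta <= gamma <= lam.
    if not (0 < delta <= gamma <= lam):
        raise ValueError("require 0 < delta <= gamma <= lam")
    if not (0 < eps < 1):
        raise ValueError("require 0 < eps < 1")
    return t * math.sqrt(lam * gamma) + math.sqrt(lam / gamma) * math.log(1.0 / eps)


def generic_queries(t: float, lam: float) -> float:
    """Generic t*lam scaling, constants dropped."""
    return t * lam


# Hypothetical sample values with delta/lam small and log(1/eps) small next to t*lam.
t, lam, delta, eps = 100.0, 100.0, 1.0, 1e-3
low = low_energy_queries(t, lam, delta, gamma=delta, eps=eps)
generic = generic_queries(t, lam)
```

In this sample regime the low-energy expression is much smaller than $tλ$. Choosing $Γ = Δ$ here is just one admissible value; the abstract allows any $Γ$ between $Δ$ and $λ$.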
-
Comment on "Comment on "Traversable wormhole dynamics on a quantum processor" "
Authors:
Daniel Jafferis,
Alexander Zlokapa,
Joseph D. Lykken,
David K. Kolchmeyer,
Samantha I. Davis,
Nikolai Lauk,
Hartmut Neven,
Maria Spiropulu
Abstract:
We observe that the comment of [1, arXiv:2302.07897] is consistent with [2] on key points: i) the microscopic mechanism of the experimentally observed teleportation is size winding and ii) the system thermalizes and scrambles at the time of teleportation. These properties are consistent with a gravitational interpretation of the teleportation dynamics, as opposed to the late-time dynamics. The objections of [1] concern counterfactual scenarios outside of the experimentally implemented protocol.
Submitted 27 March, 2023;
originally announced March 2023.
-
Bayesian Interpolation with Deep Linear Networks
Authors:
Boris Hanin,
Alexander Zlokapa
Abstract:
Characterizing how neural network depth, width, and dataset size jointly impact model quality is a central problem in deep learning theory. We give here a complete solution in the special case of linear networks with output dimension one trained using zero noise Bayesian inference with Gaussian weight priors and mean squared error as a negative log-likelihood. For any training dataset, network depth, and hidden layer widths, we find non-asymptotic expressions for the predictive posterior and Bayesian model evidence in terms of Meijer-G functions, a class of meromorphic special functions of a single complex variable. Through novel asymptotic expansions of these Meijer-G functions, a rich new picture of the joint role of depth, width, and dataset size emerges. We show that linear networks make provably optimal predictions at infinite depth: the posterior of infinitely deep linear networks with data-agnostic priors is the same as that of shallow networks with evidence-maximizing data-dependent priors. This yields a principled reason to prefer deeper networks when priors are forced to be data-agnostic. Moreover, we show that with data-agnostic priors, Bayesian model evidence in wide linear networks is maximized at infinite depth, elucidating the salutary role of increased depth for model selection. Underpinning our results is a novel emergent notion of effective depth, given by the number of hidden layers times the number of data points divided by the network width; this determines the structure of the posterior in the large-data limit.
Submitted 14 May, 2023; v1 submitted 29 December, 2022;
originally announced December 2022.
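The effective depth described in the abstract above (number of hidden layers times number of data points, divided by network width) is a single ratio. A minimal sketch follows, with a hypothetical function name and made-up sample sizes.

```python
def effective_depth(hidden_layers: int, num_data: int, width: int) -> float:
    """Hidden layers times data points, divided by width (hypothetical helper)."""
    if width <= 0:
        raise ValueError("width must be positive")
    return hidden_layers * num_data / width


# Two hypothetical configurations with the same effective depth.
a = effective_depth(hidden_layers=10, num_data=100, width=1000)
b = effective_depth(hidden_layers=20, num_data=100, width=2000)
```

Doubling both the number of hidden layers and the width leaves the ratio unchanged, so by this measure the two configurations have the same effective depth.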
-
Fault-Tolerant Neural Networks from Biological Error Correction Codes
Authors:
Alexander Zlokapa,
Andrew K. Tan,
John M. Martyn,
Ila R. Fiete,
Max Tegmark,
Isaac L. Chuang
Abstract:
It has been an open question in deep learning if fault-tolerant computation is possible: can arbitrarily reliable computation be achieved using only unreliable neurons? In the grid cells of the mammalian cortex, analog error correction codes have been observed to protect states against neural spiking noise, but their role in information processing is unclear. Here, we use these biological error correction codes to develop a universal fault-tolerant neural network that achieves reliable computation if the faultiness of each neuron lies below a sharp threshold; remarkably, we find that noisy biological neurons fall below this threshold. The discovery of a phase transition from faulty to fault-tolerant neural computation suggests a mechanism for reliable computation in the cortex and opens a path towards understanding noisy analog systems relevant to artificial intelligence and neuromorphic computing.
Submitted 9 February, 2024; v1 submitted 25 February, 2022;
originally announced February 2022.
-
A quantum algorithm for training wide and deep classical neural networks
Authors:
Alexander Zlokapa,
Hartmut Neven,
Seth Lloyd
Abstract:
Given the success of deep learning in classical machine learning, quantum algorithms for traditional neural network architectures may provide one of the most promising settings for quantum machine learning. Considering a fully-connected feedforward neural network, we show that conditions amenable to classical trainability via gradient descent coincide with those necessary for efficiently solving quantum linear systems. We propose a quantum algorithm to approximately train a wide and deep neural network up to $O(1/n)$ error for a training set of size $n$ by performing sparse matrix inversion in $O(\log n)$ time. To achieve an end-to-end exponential speedup over gradient descent, the data distribution must permit efficient state preparation and readout. We numerically demonstrate that the MNIST image dataset satisfies such conditions; moreover, the quantum algorithm matches the accuracy of the fully-connected network. Beyond the proven architecture, we provide empirical evidence for $O(\log n)$ training of a convolutional neural network with pooling.
Submitted 19 July, 2021;
originally announced July 2021.
-
Entangling Quantum Generative Adversarial Networks
Authors:
Murphy Yuezhen Niu,
Alexander Zlokapa,
Michael Broughton,
Sergio Boixo,
Masoud Mohseni,
Vadim Smelyanskyi,
Hartmut Neven
Abstract:
Generative adversarial networks (GANs) are one of the most widely adopted semisupervised and unsupervised machine learning methods for high-definition image, video, and audio generation. In this work, we propose a new type of architecture for quantum generative adversarial networks (entangling quantum GAN, EQ-GAN) that overcomes some limitations of previously proposed quantum GANs. Leveraging the entangling power of quantum circuits, EQ-GAN guarantees the convergence to a Nash equilibrium under minimax optimization of the discriminator and generator circuits by performing entangling operations between both the generator output and true quantum data. We show that EQ-GAN has additional robustness against coherent errors and demonstrate the effectiveness of EQ-GAN experimentally in a Google Sycamore superconducting quantum processor. By adversarially learning efficient representations of quantum states, we prepare an approximate quantum random access memory (QRAM) and demonstrate its use in applications including the training of quantum neural networks.
Submitted 23 May, 2021; v1 submitted 30 April, 2021;
originally announced May 2021.
-
A deep learning model for noise prediction on near-term quantum devices
Authors:
Alexander Zlokapa,
Alexandru Gheorghiu
Abstract:
We present an approach for a deep-learning compiler of quantum circuits, designed to reduce the output noise of circuits run on a specific device. We train a convolutional neural network on experimental data from a quantum device to learn a hardware-specific noise model. A compiler then uses the trained network as a noise predictor and inserts sequences of gates in circuits so as to minimize expected noise. We tested this approach on the IBM 5-qubit devices and observed a reduction in output noise of 12.3% (95% CI [11.5%, 13.0%]) compared to the circuits obtained by the Qiskit compiler. Moreover, the trained noise model is hardware-specific: applying a noise model trained on one device to another device yields a noise reduction of only 5.2% (95% CI [4.9%, 5.6%]). These results suggest that device-specific compilers using machine learning may yield higher fidelity operations and provide insights for the design of noise models.
Submitted 21 May, 2020;
originally announced May 2020.
-
Boundaries of quantum supremacy via random circuit sampling
Authors:
Alexander Zlokapa,
Sergio Boixo,
Daniel Lidar
Abstract:
Google's recent quantum supremacy experiment heralded a transition point where quantum computing performed a computational task, random circuit sampling, that is beyond the practical reach of modern supercomputers. We examine the constraints of the observed quantum runtime advantage in an extrapolation to circuits with a larger number of qubits and gates. Due to the exponential decrease of the experimental fidelity with the number of qubits and gates, we demonstrate for current fidelities a theoretical classical runtime advantage for circuits deeper than a few hundred gates, while quantum runtimes for cross-entropy benchmarking limit the region of a quantum advantage to a few hundred qubits. However, the quantum runtime advantage boundary in circuit width and depth grows exponentially with respect to reduced error rates, and our work highlights the importance of continued progress along this line. Extrapolations of measured error rates suggest that the limiting circuit size for which a computationally feasible quantum runtime advantage in cross-entropy benchmarking can be achieved approximately coincides with expectations for early implementations of the surface code and other quantum error correction methods. Thus the boundaries of quantum supremacy via random circuit sampling may fortuitously coincide with the advent of scalable, error corrected quantum computing in the near term.
Submitted 9 October, 2020; v1 submitted 5 May, 2020;
originally announced May 2020.
-
Graph Neural Networks for Particle Reconstruction in High Energy Physics detectors
Authors:
Xiangyang Ju,
Steven Farrell,
Paolo Calafiura,
Daniel Murnane,
Prabhat,
Lindsey Gray,
Thomas Klijnsma,
Kevin Pedro,
Giuseppe Cerati,
Jim Kowalkowski,
Gabriel Perdue,
Panagiotis Spentzouris,
Nhan Tran,
Jean-Roch Vlimant,
Alexander Zlokapa,
Joosep Pata,
Maria Spiropulu,
Sitong An,
Adam Aurisano,
V Hewes,
Aristeidis Tsaris,
Kazuhiro Terao,
Tracy Usher
Abstract:
Pattern recognition problems in high energy physics are notably different from traditional machine learning applications in computer vision. Reconstruction algorithms identify and measure the kinematic properties of particles produced in high energy collisions and recorded with complex detector systems. Two critical applications are the reconstruction of charged particle trajectories in tracking detectors and the reconstruction of particle showers in calorimeters. These two problems have unique challenges and characteristics, but both have high dimensionality, high degree of sparsity, and complex geometric layouts. Graph Neural Networks (GNNs) are a relatively new class of deep learning architectures which can deal with such data effectively, allowing scientists to incorporate domain knowledge in a graph structure and learn powerful representations leveraging that structure to identify patterns of interest. In this work we demonstrate the applicability of GNNs to these two diverse particle reconstruction problems.
Submitted 3 June, 2020; v1 submitted 25 March, 2020;
originally announced March 2020.
-
TensorFlow Quantum: A Software Framework for Quantum Machine Learning
Authors:
Michael Broughton,
Guillaume Verdon,
Trevor McCourt,
Antonio J. Martinez,
Jae Hyeon Yoo,
Sergei V. Isakov,
Philip Massey,
Ramin Halavati,
Murphy Yuezhen Niu,
Alexander Zlokapa,
Evan Peters,
Owen Lockwood,
Andrea Skolik,
Sofiene Jerbi,
Vedran Dunjko,
Martin Leib,
Michael Streif,
David Von Dollen,
Hongxiang Chen,
Shuxiang Cao,
Roeland Wiersema,
Hsin-Yuan Huang,
Jarrod R. McClean,
Ryan Babbush,
Sergio Boixo
, et al. (4 additional authors not shown)
Abstract:
We introduce TensorFlow Quantum (TFQ), an open source library for the rapid prototyping of hybrid quantum-classical models for classical or quantum data. This framework offers high-level abstractions for the design and training of both discriminative and generative quantum models under TensorFlow and supports high-performance quantum circuit simulators. We provide an overview of the software architecture and building blocks through several examples and review the theory of hybrid quantum-classical neural networks. We illustrate TFQ functionalities via several basic applications including supervised learning for quantum classification, quantum control, simulating noisy quantum circuits, and quantum approximate optimization. Moreover, we demonstrate how one can apply TFQ to tackle advanced quantum learning tasks including meta-learning, layerwise learning, Hamiltonian learning, sampling thermal states, variational quantum eigensolvers, classification of quantum phase transitions, generative adversarial networks, and reinforcement learning. We hope this framework provides the necessary tools for the quantum computing and machine learning research communities to explore models of both natural and artificial quantum systems, and ultimately discover new quantum algorithms which could potentially yield a quantum advantage.
Submitted 26 August, 2021; v1 submitted 5 March, 2020;
originally announced March 2020.
-
Quantum adiabatic machine learning with zooming
Authors:
Alexander Zlokapa,
Alex Mott,
Joshua Job,
Jean-Roch Vlimant,
Daniel Lidar,
Maria Spiropulu
Abstract:
Recent work has shown that quantum annealing for machine learning, referred to as QAML, can perform comparably to state-of-the-art machine learning methods with a specific application to Higgs boson classification. We propose QAML-Z, a novel algorithm that iteratively zooms in on a region of the energy surface by mapping the problem to a continuous space and sequentially applying quantum annealing to an augmented set of weak classifiers. Results on a programmable quantum annealer show that QAML-Z matches classical deep neural network performance at small training set sizes and reduces the performance margin between QAML and classical deep neural networks by almost 50% at large training set sizes, as measured by area under the ROC curve. The significant improvement of quantum annealing algorithms for machine learning and the use of a discrete quantum algorithm on a continuous optimization problem both opens a new class of problems that can be solved by quantum annealers and suggests the approach in performance of near-term quantum machine learning towards classical benchmarks.
Submitted 23 October, 2020; v1 submitted 13 August, 2019;
originally announced August 2019.
-
Charged particle tracking with quantum annealing-inspired optimization
Authors:
Alexander Zlokapa,
Abhishek Anand,
Jean-Roch Vlimant,
Javier M. Duarte,
Joshua Job,
Daniel Lidar,
Maria Spiropulu
Abstract:
At the High Luminosity Large Hadron Collider (HL-LHC), traditional track reconstruction techniques that are critical for analysis are expected to face challenges due to scaling with track density. Quantum annealing has shown promise in its ability to solve combinatorial optimization problems amidst an ongoing effort to establish evidence of a quantum speedup. As a step towards exploiting such potential speedup, we investigate a track reconstruction approach by adapting the existing geometric Denby-Peterson (Hopfield) network method to the quantum annealing framework and to HL-LHC conditions. Furthermore, we develop additional techniques to embed the problem onto existing and near-term quantum annealing hardware. Results using simulated annealing and quantum annealing with the D-Wave 2X system on the TrackML dataset are presented, demonstrating the successful application of a quantum annealing-inspired algorithm to the track reconstruction challenge. We find that combinatorial optimization problems can effectively reconstruct tracks, suggesting possible applications for fast hardware-specific implementations at the LHC while leaving open the possibility of a quantum speedup for tracking.
Submitted 12 August, 2019;
originally announced August 2019.