
Bayesian Inference with Deep Weakly Nonlinear Networks

Abstract: We show at a physics level of rigor that Bayesian inference with a fully connected neural network and a shaped nonlinearity of the form $φ(t) = t + ψt^3/L$ is (perturbatively) solvable in the regime where the number of training datapoints $P$, the input dimension $N_0$, the network layer widths $N$, and the network depth $L$ are simultaneously large. Our results hold with weak assumptions on the data; the main constraint is that $P < N_0$. We provide techniques to compute the model evidence and posterior to arbitrary order in $1/N$ and at arbitrary temperature. We report the following results from the first-order computation: 1. When the width $N$ is much larger than the depth $L$ and training set size $P$, neural network Bayesian inference coincides with Bayesian inference using a kernel. The value of $ψ$ determines the curvature of a sphere, hyperbola, or plane into which the training data is implicitly embedded under the feature map. 2. When $LP/N$ is a small constant, neural network Bayesian inference departs from the kernel regime. At zero temperature, neural network Bayesian inference is equivalent to Bayesian inference using a data-dependent kernel, and $LP/N$ serves as an effective depth that controls the extent of feature learning. 3. In the restricted case of deep linear networks ($ψ=0$) and noisy data, we show a simple data model for which evidence and generalization error are optimal at zero temperature. As $LP/N$ increases, both evidence and generalization further improve, demonstrating the benefit of depth in benign overfitting.

Submitted 26 May, 2024; originally announced May 2024.
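
The shaped nonlinearity $φ(t) = t + ψt^3/L$ quoted in the abstract above can be evaluated directly. The following is a minimal sketch, not code from the paper: the function name `shaped_phi` and the sample values of `t`, `psi`, and `depth` are illustrative assumptions. It only shows that the cubic correction shrinks as the depth $L$ grows, and that $ψ=0$ gives the linear case.

```python
def shaped_phi(t: float, psi: float, depth: int) -> float:
    """Evaluate phi(t) = t + psi * t**3 / L, with L passed as `depth`.

    The formula is taken from the abstract; the function name and
    argument names are hypothetical.
    """
    return t + psi * t**3 / depth


if __name__ == "__main__":
    # Illustrative values only: the correction term psi * t**3 / L
    # becomes smaller as the depth increases.
    for depth in (1, 10, 100):
        print(depth, shaped_phi(2.0, 0.5, depth))
```

Setting `psi=0.0` returns `t` unchanged, which corresponds to the deep linear network case the abstract singles out.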

arXiv:2405.07876

doi 10.7907/0b9qh-jt654

Long-range wormhole teleportation

Authors: Joseph D. Lykken, Daniel Jafferis, Alexander Zlokapa, David K. Kolchmeyer, Samantha I. Davis, Hartmut Neven, Maria Spiropulu

Abstract: We extend the protocol of Gao and Jafferis arXiv:1911.07416 to allow wormhole teleportation between two entangled copies of the Sachdev-Ye-Kitaev (SYK) model communicating only through a classical channel. We demonstrate in finite $N$ simulations that the protocol exhibits the characteristic holographic features of wormhole teleportation discussed and summarized in Jafferis et al. https://www.nature.com/articles/s41586-022-05424-3. We review and exhibit in detail how these holographic features relate to size winding which, as first shown by Brown et al. arXiv:1911.06314 and Nezami et al. arXiv:2102.01064, encodes a dual description of wormhole teleportation.

Submitted 13 May, 2024; originally announced May 2024.

Comments: 52 pages, 24 figures, for submission to PRX Quantum

Report number: FERMILAB-PUB-24-0184-ETD, CaltechAUTHORS:0b9qh-jt654

arXiv:2404.03644

Hamiltonian simulation for low-energy states with optimal time dependence

Authors: Alexander Zlokapa, Rolando D. Somma

Abstract: We consider the task of simulating time evolution under a Hamiltonian $H$ within its low-energy subspace. Assuming access to a block-encoding of $H'=(H-E)/λ$ for some $E \in \mathbb R$, the goal is to implement an $ε$-approximation to $e^{-itH}$ when the initial state is confined to the subspace corresponding to eigenvalues $[-1, -1+Δ/λ]$ of $H'$. We present a quantum algorithm that uses $O(t\sqrt{λΓ} + \sqrt{λ/Γ}\log(1/ε))$ queries to the block-encoding for any $Γ$ such that $Δ\leq Γ\leq λ$. When $\log(1/ε) = o(tλ)$ and $Δ/λ= o(1)$, this result improves over generic methods with query complexity $Ω(tλ)$. Our quantum algorithm leverages spectral gap amplification and the quantum singular value transform. Using standard access models for $H$, we show that the ability to efficiently block-encode $H'$ is equivalent to $H$ being what we refer to as a "gap-amplifiable" Hamiltonian. This includes physically relevant examples such as frustration-free systems, and it encompasses all previously considered settings of low-energy simulation algorithms. We also provide lower bounds for low-energy simulation. In the worst case, we show that the low-energy condition cannot be used to improve the runtime of Hamiltonian simulation. For gap-amplifiable Hamiltonians, we prove that our algorithm is tight in the query model with respect to $t$, $Δ$, and $λ$. In the practically relevant regime where $\log (1/ε) = o(tΔ)$ and $Δ/λ= o(1)$, we also prove a matching lower bound in gate complexity (up to log factors). To establish the query lower bounds, we consider $\mathrm{PARITY}\circ\mathrm{OR}$ and degree bounds on trigonometric polynomials. To establish the lower bound on gate complexity, we use a circuit-to-Hamiltonian reduction acting on a low-energy state.

Submitted 4 April, 2024; originally announced April 2024.

Comments: 58 pages. Abstract shortened to fit within the arXiv limit
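
The query bound $O(t\sqrt{λΓ} + \sqrt{λ/Γ}\log(1/ε))$ in the abstract above holds for any $Γ$ with $Δ\leq Γ\leq λ$, so one may ask which $Γ$ makes the expression smallest. The sketch below is an illustration under stated assumptions, not the paper's procedure: it drops the constants hidden in the $O(\cdot)$, the function names `query_bound` and `best_gamma` are hypothetical, and the numeric inputs are made up. Minimizing $t\sqrt{λΓ} + \sqrt{λ/Γ}\log(1/ε)$ over $Γ>0$ by setting the derivative to zero gives $Γ = \log(1/ε)/t$; because the expression is convex in $\sqrt{Γ}$, clamping that value to $[Δ, λ]$ gives the constrained minimizer.

```python
import math


def query_bound(t: float, lam: float, gamma: float, eps: float) -> float:
    """Expression inside the O(.) from the abstract, with constants dropped."""
    return t * math.sqrt(lam * gamma) + math.sqrt(lam / gamma) * math.log(1 / eps)


def best_gamma(t: float, lam: float, delta: float, eps: float) -> float:
    """Gamma in [delta, lam] minimizing query_bound.

    Assumes t > 0, 0 < eps < 1 and 0 < delta <= lam. The unconstrained
    stationary point is log(1/eps) / t; clamping is valid because the
    expression is convex in sqrt(gamma).
    """
    unconstrained = math.log(1 / eps) / t
    return min(max(unconstrained, delta), lam)


if __name__ == "__main__":
    # Illustrative parameters only (not taken from the paper).
    t, lam, delta, eps = 100.0, 1.0, 0.01, 1e-3
    gamma = best_gamma(t, lam, delta, eps)
    print("gamma:", gamma)
    print("low-energy expression:", query_bound(t, lam, gamma, eps))
    print("generic t * lambda:", t * lam)
```

When the stationary point lies inside $[Δ, λ]$, the expression evaluates to $2\sqrt{tλ\log(1/ε)}$, which is below $tλ$ whenever $\log(1/ε)$ is small compared with $tλ$; this is consistent with the regime the abstract describes.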

arXiv:2303.15423

Comment on "Comment on "Traversable wormhole dynamics on a quantum processor""

Authors: Daniel Jafferis, Alexander Zlokapa, Joseph D. Lykken, David K. Kolchmeyer, Samantha I. Davis, Nikolai Lauk, Hartmut Neven, Maria Spiropulu

Abstract: We observe that the comment of [1, arXiv:2302.07897] is consistent with [2] on key points: i) the microscopic mechanism of the experimentally observed teleportation is size winding and ii) the system thermalizes and scrambles at the time of teleportation. These properties are consistent with a gravitational interpretation of the teleportation dynamics, as opposed to the late-time dynamics. The objections of [1] concern counterfactual scenarios outside of the experimentally implemented protocol.

Submitted 27 March, 2023; originally announced March 2023.

Comments: 5 pages, 4 figures

arXiv:2212.14457

Bayesian Interpolation with Deep Linear Networks

Authors: Boris Hanin, Alexander Zlokapa

Abstract: Characterizing how neural network depth, width, and dataset size jointly impact model quality is a central problem in deep learning theory. We give here a complete solution in the special case of linear networks with output dimension one trained using zero noise Bayesian inference with Gaussian weight priors and mean squared error as a negative log-likelihood. For any training dataset, network depth, and hidden layer widths, we find non-asymptotic expressions for the predictive posterior and Bayesian model evidence in terms of Meijer-G functions, a class of meromorphic special functions of a single complex variable. Through novel asymptotic expansions of these Meijer-G functions, a rich new picture of the joint role of depth, width, and dataset size emerges. We show that linear networks make provably optimal predictions at infinite depth: the posterior of infinitely deep linear networks with data-agnostic priors is the same as that of shallow networks with evidence-maximizing data-dependent priors. This yields a principled reason to prefer deeper networks when priors are forced to be data-agnostic. Moreover, we show that with data-agnostic priors, Bayesian model evidence in wide linear networks is maximized at infinite depth, elucidating the salutary role of increased depth for model selection. Underpinning our results is a novel emergent notion of effective depth, given by the number of hidden layers times the number of data points divided by the network width; this determines the structure of the posterior in the large-data limit.

Submitted 14 May, 2023; v1 submitted 29 December, 2022; originally announced December 2022.
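
The effective depth defined at the end of the abstract above (number of hidden layers times number of data points, divided by network width) is simple arithmetic. The sketch below only illustrates that definition; the function name `effective_depth` and the example configurations are assumptions, and the abstract's claim about the posterior concerns the large-data limit, which this snippet does not model.

```python
def effective_depth(hidden_layers: int, num_datapoints: int, width: int) -> float:
    """Effective depth = hidden layers * data points / width (per the abstract)."""
    return hidden_layers * num_datapoints / width


if __name__ == "__main__":
    # Two hypothetical configurations that share the same effective depth.
    print(effective_depth(hidden_layers=10, num_datapoints=100, width=1000))
    print(effective_depth(hidden_layers=20, num_datapoints=100, width=2000))
```

Doubling both the number of hidden layers and the width leaves the ratio unchanged, which is the sense in which the three quantities enter jointly.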

arXiv:2202.12887

Fault-Tolerant Neural Networks from Biological Error Correction Codes

Authors: Alexander Zlokapa, Andrew K. Tan, John M. Martyn, Ila R. Fiete, Max Tegmark, Isaac L. Chuang

Abstract: It has been an open question in deep learning if fault-tolerant computation is possible: can arbitrarily reliable computation be achieved using only unreliable neurons? In the grid cells of the mammalian cortex, analog error correction codes have been observed to protect states against neural spiking noise, but their role in information processing is unclear. Here, we use these biological error correction codes to develop a universal fault-tolerant neural network that achieves reliable computation if the faultiness of each neuron lies below a sharp threshold; remarkably, we find that noisy biological neurons fall below this threshold. The discovery of a phase transition from faulty to fault-tolerant neural computation suggests a mechanism for reliable computation in the cortex and opens a path towards understanding noisy analog systems relevant to artificial intelligence and neuromorphic computing.

Submitted 9 February, 2024; v1 submitted 25 February, 2022; originally announced February 2022.

Report number: MIT-CTP/5395

arXiv:2107.09200

A quantum algorithm for training wide and deep classical neural networks

Authors: Alexander Zlokapa, Hartmut Neven, Seth Lloyd

Abstract: Given the success of deep learning in classical machine learning, quantum algorithms for traditional neural network architectures may provide one of the most promising settings for quantum machine learning. Considering a fully-connected feedforward neural network, we show that conditions amenable to classical trainability via gradient descent coincide with those necessary for efficiently solving quantum linear systems. We propose a quantum algorithm to approximately train a wide and deep neural network up to $O(1/n)$ error for a training set of size $n$ by performing sparse matrix inversion in $O(\log n)$ time. To achieve an end-to-end exponential speedup over gradient descent, the data distribution must permit efficient state preparation and readout. We numerically demonstrate that the MNIST image dataset satisfies such conditions; moreover, the quantum algorithm matches the accuracy of the fully-connected network. Beyond the proven architecture, we provide empirical evidence for $O(\log n)$ training of a convolutional neural network with pooling.

Submitted 19 July, 2021; originally announced July 2021.

Comments: 10 pages + 13 page appendix, 10 figures; code available at https://github.com/quantummind/quantum-deep-neural-network

arXiv:2105.00080

Entangling Quantum Generative Adversarial Networks

Authors: Murphy Yuezhen Niu, Alexander Zlokapa, Michael Broughton, Sergio Boixo, Masoud Mohseni, Vadim Smelyanskyi, Hartmut Neven

Abstract: Generative adversarial networks (GANs) are one of the most widely adopted semisupervised and unsupervised machine learning methods for high-definition image, video, and audio generation. In this work, we propose a new type of architecture for quantum generative adversarial networks (entangling quantum GAN, EQ-GAN) that overcomes some limitations of previously proposed quantum GANs. Leveraging the entangling power of quantum circuits, EQ-GAN guarantees the convergence to a Nash equilibrium under minimax optimization of the discriminator and generator circuits by performing entangling operations between both the generator output and true quantum data. We show that EQ-GAN has additional robustness against coherent errors and demonstrate the effectiveness of EQ-GAN experimentally in a Google Sycamore superconducting quantum processor. By adversarially learning efficient representations of quantum states, we prepare an approximate quantum random access memory (QRAM) and demonstrate its use in applications including the training of quantum neural networks.

Submitted 23 May, 2021; v1 submitted 30 April, 2021; originally announced May 2021.

arXiv:2005.10811

A deep learning model for noise prediction on near-term quantum devices

Authors: Alexander Zlokapa, Alexandru Gheorghiu

Abstract: We present an approach for a deep-learning compiler of quantum circuits, designed to reduce the output noise of circuits run on a specific device. We train a convolutional neural network on experimental data from a quantum device to learn a hardware-specific noise model. A compiler then uses the trained network as a noise predictor and inserts sequences of gates in circuits so as to minimize expected noise. We tested this approach on the IBM 5-qubit devices and observed a reduction in output noise of 12.3% (95% CI [11.5%, 13.0%]) compared to the circuits obtained by the Qiskit compiler. Moreover, the trained noise model is hardware-specific: applying a noise model trained on one device to another device yields a noise reduction of only 5.2% (95% CI [4.9%, 5.6%]). These results suggest that device-specific compilers using machine learning may yield higher fidelity operations and provide insights for the design of noise models.

Submitted 21 May, 2020; originally announced May 2020.

Comments: 5 pages, 4 figures, 1 table. Comments welcome

arXiv:2005.02464

doi 10.1038/s41534-023-00703-x

Boundaries of quantum supremacy via random circuit sampling

Authors: Alexander Zlokapa, Sergio Boixo, Daniel Lidar

Abstract: Google's recent quantum supremacy experiment heralded a transition point where quantum computing performed a computational task, random circuit sampling, that is beyond the practical reach of modern supercomputers. We examine the constraints of the observed quantum runtime advantage in an extrapolation to circuits with a larger number of qubits and gates. Due to the exponential decrease of the experimental fidelity with the number of qubits and gates, we demonstrate for current fidelities a theoretical classical runtime advantage for circuits deeper than a few hundred gates, while quantum runtimes for cross-entropy benchmarking limit the region of a quantum advantage to a few hundred qubits. However, the quantum runtime advantage boundary in circuit width and depth grows exponentially with respect to reduced error rates, and our work highlights the importance of continued progress along this line. Extrapolations of measured error rates suggest that the limiting circuit size for which a computationally feasible quantum runtime advantage in cross-entropy benchmarking can be achieved approximately coincides with expectations for early implementations of the surface code and other quantum error correction methods. Thus the boundaries of quantum supremacy via random circuit sampling may fortuitously coincide with the advent of scalable, error corrected quantum computing in the near term.

Submitted 9 October, 2020; v1 submitted 5 May, 2020; originally announced May 2020.

Comments: 9 pages, 3 figures

Journal ref: npj Quantum Information, 9, 36 (2023)

arXiv:2003.11603

Graph Neural Networks for Particle Reconstruction in High Energy Physics detectors

Authors: Xiangyang Ju, Steven Farrell, Paolo Calafiura, Daniel Murnane, Prabhat, Lindsey Gray, Thomas Klijnsma, Kevin Pedro, Giuseppe Cerati, Jim Kowalkowski, Gabriel Perdue, Panagiotis Spentzouris, Nhan Tran, Jean-Roch Vlimant, Alexander Zlokapa, Joosep Pata, Maria Spiropulu, Sitong An, Adam Aurisano, V Hewes, Aristeidis Tsaris, Kazuhiro Terao, Tracy Usher

Abstract: Pattern recognition problems in high energy physics are notably different from traditional machine learning applications in computer vision. Reconstruction algorithms identify and measure the kinematic properties of particles produced in high energy collisions and recorded with complex detector systems. Two critical applications are the reconstruction of charged particle trajectories in tracking detectors and the reconstruction of particle showers in calorimeters. These two problems have unique challenges and characteristics, but both have high dimensionality, high degree of sparsity, and complex geometric layouts. Graph Neural Networks (GNNs) are a relatively new class of deep learning architectures which can deal with such data effectively, allowing scientists to incorporate domain knowledge in a graph structure and learn powerful representations leveraging that structure to identify patterns of interest. In this work we demonstrate the applicability of GNNs to these two diverse particle reconstruction problems.

Submitted 3 June, 2020; v1 submitted 25 March, 2020; originally announced March 2020.

Comments: Presented at NeurIPS 2019 Workshop "Machine Learning and the Physical Sciences"

arXiv:2003.02989

TensorFlow Quantum: A Software Framework for Quantum Machine Learning

Authors: Michael Broughton, Guillaume Verdon, Trevor McCourt, Antonio J. Martinez, Jae Hyeon Yoo, Sergei V. Isakov, Philip Massey, Ramin Halavati, Murphy Yuezhen Niu, Alexander Zlokapa, Evan Peters, Owen Lockwood, Andrea Skolik, Sofiene Jerbi, Vedran Dunjko, Martin Leib, Michael Streif, David Von Dollen, Hongxiang Chen, Shuxiang Cao, Roeland Wiersema, Hsin-Yuan Huang, Jarrod R. McClean, Ryan Babbush, Sergio Boixo, et al. (4 additional authors not shown)

Abstract: We introduce TensorFlow Quantum (TFQ), an open source library for the rapid prototyping of hybrid quantum-classical models for classical or quantum data. This framework offers high-level abstractions for the design and training of both discriminative and generative quantum models under TensorFlow and supports high-performance quantum circuit simulators. We provide an overview of the software architecture and building blocks through several examples and review the theory of hybrid quantum-classical neural networks. We illustrate TFQ functionalities via several basic applications including supervised learning for quantum classification, quantum control, simulating noisy quantum circuits, and quantum approximate optimization. Moreover, we demonstrate how one can apply TFQ to tackle advanced quantum learning tasks including meta-learning, layerwise learning, Hamiltonian learning, sampling thermal states, variational quantum eigensolvers, classification of quantum phase transitions, generative adversarial networks, and reinforcement learning. We hope this framework provides the necessary tools for the quantum computing and machine learning research communities to explore models of both natural and artificial quantum systems, and ultimately discover new quantum algorithms which could potentially yield a quantum advantage.

Submitted 26 August, 2021; v1 submitted 5 March, 2020; originally announced March 2020.

Comments: 56 pages, 34 figures, many updates throughout the manuscript, several new sections are added

arXiv:1908.04480

doi 10.1103/PhysRevA.102.062405

Quantum adiabatic machine learning with zooming

Authors: Alexander Zlokapa, Alex Mott, Joshua Job, Jean-Roch Vlimant, Daniel Lidar, Maria Spiropulu

Abstract: Recent work has shown that quantum annealing for machine learning, referred to as QAML, can perform comparably to state-of-the-art machine learning methods with a specific application to Higgs boson classification. We propose QAML-Z, a novel algorithm that iteratively zooms in on a region of the energy surface by mapping the problem to a continuous space and sequentially applying quantum annealing to an augmented set of weak classifiers. Results on a programmable quantum annealer show that QAML-Z matches classical deep neural network performance at small training set sizes and reduces the performance margin between QAML and classical deep neural networks by almost 50% at large training set sizes, as measured by area under the ROC curve. The significant improvement of quantum annealing algorithms for machine learning and the use of a discrete quantum algorithm on a continuous optimization problem both opens a new class of problems that can be solved by quantum annealers and suggests the approach in performance of near-term quantum machine learning towards classical benchmarks.

Submitted 23 October, 2020; v1 submitted 13 August, 2019; originally announced August 2019.

Comments: 9 pages, 5 figures

Journal ref: Phys. Rev. A 102, 062405 (2020)

arXiv:1908.04475

doi 10.1007/s42484-021-00054-w

Charged particle tracking with quantum annealing-inspired optimization

Authors: Alexander Zlokapa, Abhishek Anand, Jean-Roch Vlimant, Javier M. Duarte, Joshua Job, Daniel Lidar, Maria Spiropulu

Abstract: At the High Luminosity Large Hadron Collider (HL-LHC), traditional track reconstruction techniques that are critical for analysis are expected to face challenges due to scaling with track density. Quantum annealing has shown promise in its ability to solve combinatorial optimization problems amidst an ongoing effort to establish evidence of a quantum speedup. As a step towards exploiting such potential speedup, we investigate a track reconstruction approach by adapting the existing geometric Denby-Peterson (Hopfield) network method to the quantum annealing framework and to HL-LHC conditions. Furthermore, we develop additional techniques to embed the problem onto existing and near-term quantum annealing hardware. Results using simulated annealing and quantum annealing with the D-Wave 2X system on the TrackML dataset are presented, demonstrating the successful application of a quantum annealing-inspired algorithm to the track reconstruction challenge. We find that combinatorial optimization problems can effectively reconstruct tracks, suggesting possible applications for fast hardware-specific implementations at the LHC while leaving open the possibility of a quantum speedup for tracking.

Submitted 12 August, 2019; originally announced August 2019.

Comments: 18 pages, 21 figures

Journal ref: Quantum Mach. Intell. 3, 27 (2021)

Showing 1–14 of 14 results for author: Zlokapa, A