Search | arXiv e-print repository

arXiv:2204.01935 [pdf, other]

Nonlocal optimization of binary neural networks

Authors: Amir Khoshaman, Giuseppe Castiglione, Christopher Srinivasa

Abstract: We explore training Binary Neural Networks (BNNs) as a discrete variable inference problem over a factor graph. We study the behaviour of this conversion in an under-parameterized BNN setting and propose stochastic versions of Belief Propagation (BP) and Survey Propagation (SP) message passing algorithms to overcome the intractability of their current formulation. Compared to traditional gradient… ▽ More We explore training Binary Neural Networks (BNNs) as a discrete variable inference problem over a factor graph. We study the behaviour of this conversion in an under-parameterized BNN setting and propose stochastic versions of Belief Propagation (BP) and Survey Propagation (SP) message passing algorithms to overcome the intractability of their current formulation. Compared to traditional gradient methods for BNNs, our results indicate that both stochastic BP and SP find better configurations of the parameters in the BNN. △ Less

Submitted 4 April, 2022; originally announced April 2022.

Comments: arXiv admin note: text overlap with arXiv:2203.16668

arXiv:1912.02119 [pdf, other]

A Path Towards Quantum Advantage in Training Deep Generative Models with Quantum Annealers

Authors: Walter Vinci, Lorenzo Buffoni, Hossein Sadeghi, Amir Khoshaman, Evgeny Andriyash, Mohammad H. Amin

Abstract: The development of quantum-classical hybrid (QCH) algorithms is critical to achieve state-of-the-art computational models. A QCH variational autoencoder (QVAE) was introduced in Ref. [1] by some of the authors of this paper. QVAE consists of a classical auto-encoding structure realized by traditional deep neural networks to perform inference to, and generation from, a discrete latent space. The la… ▽ More The development of quantum-classical hybrid (QCH) algorithms is critical to achieve state-of-the-art computational models. A QCH variational autoencoder (QVAE) was introduced in Ref. [1] by some of the authors of this paper. QVAE consists of a classical auto-encoding structure realized by traditional deep neural networks to perform inference to, and generation from, a discrete latent space. The latent generative process is formalized as thermal sampling from either a quantum or classical Boltzmann machine (QBM or BM). This setup allows quantum-assisted training of deep generative models by physically simulating the generative process with quantum annealers. In this paper, we have successfully employed D-Wave quantum annealers as Boltzmann samplers to perform quantum-assisted, end-to-end training of QVAE. The hybrid structure of QVAE allows us to deploy current-generation quantum annealers in QCH generative models to achieve competitive performance on datasets such as MNIST. The results presented in this paper suggest that commercially available quantum annealers can be deployed, in conjunction with well-crafted classical deep neutral networks, to achieve competitive results in unsupervised and semisupervised tasks on large-scale datasets. We also provide evidence that our setup is able to exploit large latent-space (Q)BMs, which develop slowly mixing modes. This expressive latent space results in slow and inefficient classical sampling, and paves the way to achieve quantum advantage with quantum annealing in realistic sampling applications. △ Less

Submitted 4 December, 2019; originally announced December 2019.

Comments: 20 pages, 14 figures

arXiv:1805.07349 [pdf, other]

GumBolt: Extending Gumbel trick to Boltzmann priors

Authors: Amir H. Khoshaman, Mohammad H. Amin

Abstract: Boltzmann machines (BMs) are appealing candidates for powerful priors in variational autoencoders (VAEs), as they are capable of capturing nontrivial and multi-modal distributions over discrete variables. However, non-differentiability of the discrete units prohibits using the reparameterization trick, essential for low-noise back propagation. The Gumbel trick resolves this problem in a consistent… ▽ More Boltzmann machines (BMs) are appealing candidates for powerful priors in variational autoencoders (VAEs), as they are capable of capturing nontrivial and multi-modal distributions over discrete variables. However, non-differentiability of the discrete units prohibits using the reparameterization trick, essential for low-noise back propagation. The Gumbel trick resolves this problem in a consistent way by relaxing the variables and distributions, but it is incompatible with BM priors. Here, we propose the GumBolt, a model that extends the Gumbel trick to BM priors in VAEs. GumBolt is significantly simpler than the recently proposed methods with BM prior and outperforms them by a considerable margin. It achieves state-of-the-art performance on permutation invariant MNIST and OMNIGLOT datasets in the scope of models with only discrete latent variables. Moreover, the performance can be further improved by allowing multi-sampled (importance-weighted) estimation of log-likelihood in training, which was not possible with previous models. △ Less

Submitted 1 March, 2019; v1 submitted 18 May, 2018; originally announced May 2018.

Comments: 10 pages, 2 Figures, 2 Tables

arXiv:1802.05779 [pdf, other]

doi 10.1088/2058-9565/aada1f

Quantum Variational Autoencoder

Authors: Amir Khoshaman, Walter Vinci, Brandon Denis, Evgeny Andriyash, Hossein Sadeghi, Mohammad H. Amin

Abstract: Variational autoencoders (VAEs) are powerful generative models with the salient ability to perform inference. Here, we introduce a quantum variational autoencoder (QVAE): a VAE whose latent generative process is implemented as a quantum Boltzmann machine (QBM). We show that our model can be trained end-to-end by maximizing a well-defined loss-function: a 'quantum' lower-bound to a variational appr… ▽ More Variational autoencoders (VAEs) are powerful generative models with the salient ability to perform inference. Here, we introduce a quantum variational autoencoder (QVAE): a VAE whose latent generative process is implemented as a quantum Boltzmann machine (QBM). We show that our model can be trained end-to-end by maximizing a well-defined loss-function: a 'quantum' lower-bound to a variational approximation of the log-likelihood. We use quantum Monte Carlo (QMC) simulations to train and evaluate the performance of QVAEs. To achieve the best performance, we first create a VAE platform with discrete latent space generated by a restricted Boltzmann machine (RBM). Our model achieves state-of-the-art performance on the MNIST dataset when compared against similar approaches that only involve discrete variables in the generative process. We consider QVAEs with a smaller number of latent units to be able to perform QMC simulations, which are computationally expensive. We show that QVAEs can be trained effectively in regimes where quantum effects are relevant despite training via the quantum bound. Our findings open the way to the use of quantum computers to train QVAEs to achieve competitive performance for generative models. Placing a QBM in the latent space of a VAE leverages the full potential of current and next-generation quantum computers as sampling devices. △ Less

Submitted 12 January, 2019; v1 submitted 15 February, 2018; originally announced February 2018.

Comments: v2: published version. 13 pages, 3 figures, 2 tables

Journal ref: Quantum Sci. Technol. 4 (2019) 014001

arXiv:1802.04920 [pdf, other]

DVAE++: Discrete Variational Autoencoders with Overlap** Transformations

Authors: Arash Vahdat, William G. Macready, Zhengbing Bian, Amir Khoshaman, Evgeny Andriyash

Abstract: Training of discrete latent variable models remains challenging because passing gradient information through discrete units is difficult. We propose a new class of smoothing transformations based on a mixture of two overlap** distributions, and show that the proposed transformation can be used for training binary latent models with either directed or undirected priors. We derive a new variationa… ▽ More Training of discrete latent variable models remains challenging because passing gradient information through discrete units is difficult. We propose a new class of smoothing transformations based on a mixture of two overlap** distributions, and show that the proposed transformation can be used for training binary latent models with either directed or undirected priors. We derive a new variational bound to efficiently train with Boltzmann machine priors. Using this bound, we develop DVAE++, a generative model with a global discrete prior and a hierarchy of convolutional continuous variables. Experiments on several benchmarks show that overlap** transformations outperform other recent continuous relaxations of discrete latent variables including Gumbel-Softmax (Maddison et al., 2016; Jang et al., 2016), and discrete variational autoencoders (Rolfe 2016). △ Less

Submitted 25 May, 2018; v1 submitted 13 February, 2018; originally announced February 2018.

Comments: Published as a conference paper at International Conference on Machine Learning (ICML), 2018

Showing 1–5 of 5 results for author: Khoshaman, A