-
Training a quantum annealing based restricted Boltzmann machine on cybersecurity data
Authors:
Vivek Dixit,
Raja Selvarajan,
Tamer Aldwairi,
Yaroslav Koshka,
Mark A. Novotny,
Travis S. Humble,
Muhammad A. Alam,
Sabre Kais
Abstract:
We present a real-world application that uses a quantum computer. Specifically, we train a restricted Boltzmann machine (RBM) using quantum annealing (QA) for cybersecurity applications. QA is implemented on the D-Wave 2000Q. RBMs are trained on the ISCX data, a benchmark dataset for cybersecurity. For comparison, RBMs are also trained using contrastive divergence (CD), a commonly used method for RBM training. Our analysis of the ISCX data shows that the dataset is imbalanced. We present two schemes to balance the training dataset before feeding it to a classifier. The first scheme is based on undersampling the benign instances. The imbalanced training dataset is divided into five sub-datasets that are trained separately, and a majority vote is then taken to obtain the result. Our results show that the majority vote raises the classification accuracy from 90.24% to 95.68% in the case of CD. In the case of QA, the classification accuracy increases from 74.14% to 80.04%. In the second scheme, an RBM is used to generate synthetic data to balance the training dataset. We show that both QA-trained and CD-trained RBMs can be used to generate useful synthetic data. The balanced training data is used to evaluate several classifiers. Among the classifiers investigated, K-Nearest Neighbor (KNN) and Neural Network (NN) perform better than the others; both reach an accuracy of 93%. Our results provide a proof of concept that a QA-based RBM can be trained on a 64-bit binary dataset. This illustrative example suggests that many practical classification problems could be migrated to QA-based techniques. Further, we show that synthetic data generated from an RBM can be used to balance the original dataset.
Submitted 16 April, 2021; v1 submitted 24 November, 2020;
originally announced November 2020.
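The first balancing scheme in the abstract above (undersample the benign class into five sub-datasets, train on each separately, then take a majority vote) might be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how the sub-datasets are built, so the sketch assumes each one pairs a disjoint slice of benign rows with all attack rows. The `NearestCentroidClassifier` is a hypothetical stand-in for the RBM-based classifier, and the 64-bit data in `demo` is synthetic, not ISCX.

```python
import numpy as np


def make_balanced_subsets(X, y, n_subsets=5, seed=0):
    """Assumed scheme: shuffle benign rows (label 0), split them into
    n_subsets disjoint chunks, and pair each chunk with all attack rows (label 1)."""
    rng = np.random.default_rng(seed)
    benign = rng.permutation(np.flatnonzero(y == 0))
    attack = np.flatnonzero(y == 1)
    subsets = []
    for chunk in np.array_split(benign, n_subsets):
        idx = np.concatenate([chunk, attack])
        subsets.append((X[idx], y[idx]))
    return subsets


class NearestCentroidClassifier:
    """Hypothetical stand-in for the per-subset classifier (not an RBM)."""

    def fit(self, X, y):
        # One centroid per class, stored in label order (0, 1).
        self.centroids_ = np.stack([X[y == c].mean(axis=0) for c in (0, 1)])
        return self

    def predict(self, X):
        # Squared Euclidean distance from every row to each centroid.
        d = ((X[:, None, :] - self.centroids_[None, :, :]) ** 2).sum(axis=2)
        return d.argmin(axis=1)


def majority_vote(models, X):
    """Label a row 1 when more than half of the models predict 1."""
    votes = np.stack([m.predict(X) for m in models])  # (n_models, n_rows)
    return (2 * votes.sum(axis=0) > len(models)).astype(int)


def demo(seed=0):
    """Train on imbalanced synthetic 64-bit data and return test accuracy."""
    rng = np.random.default_rng(seed)

    def draw(n_benign, n_attack):
        # Synthetic bits: benign rows are mostly 0s, attack rows mostly 1s.
        X = np.vstack([rng.random((n_benign, 64)) < 0.3,
                       rng.random((n_attack, 64)) < 0.7]).astype(float)
        y = np.concatenate([np.zeros(n_benign, int), np.ones(n_attack, int)])
        return X, y

    X_train, y_train = draw(500, 100)   # 5:1 imbalance
    X_test, y_test = draw(200, 200)
    models = [NearestCentroidClassifier().fit(Xs, ys)
              for Xs, ys in make_balanced_subsets(X_train, y_train)]
    return float((majority_vote(models, X_test) == y_test).mean())
```

Using an odd number of sub-datasets, as in the abstract's five, avoids tied votes in the binary case.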
-
Comparison of D-Wave Quantum Annealing and Classical Simulated Annealing for Local Minima Determination
Authors:
Yaroslav Koshka,
M. A. Novotny
Abstract:
Restricted Boltzmann Machines trained with different numbers of iterations were used to provide a diverse set of energy functions, each containing many local valleys (LVs) with different energies, widths, escape barrier heights, etc. They were used to verify the previously reported possibility of using the D-Wave quantum annealer (QA) to find potentially important LVs in the energy functions of Ising spin glasses that may be missed by classical searches. For the classical search, extensive simulated annealing (SA) was conducted to find as many LVs as possible regardless of the computational cost. SA was run long enough to ensure that the number of SA-found LVs approached and eventually significantly exceeded the number of LVs found by a single call submitted to the D-Wave. Even after a prohibitively long SA search, as many as 30-50% of the D-Wave-found LVs were still not found by SA. To establish whether LVs found only by the D-Wave represent potentially important regions of the configuration space, they were compared to those found by both techniques. While the LVs found by the D-Wave but missed by SA predominantly had higher energies and lower escape barriers, a significant fraction had intermediate values of the energy and barrier height. With respect to most other important LV parameters, the LVs found only by the D-Wave were distributed over a wide range of parameter values. It was established that, whether the LVs are large or small, shallow or deep, wide or narrow, those found only by the D-Wave are distinguished by a basin of attraction (BoA) that is a few times smaller. Apparently, the size of the BoA is not important, or at least is less important, for QA search than for classical search, allowing QA to easily find many potentially important (e.g., wide and deep) LVs missed by even prohibitively lengthy classical searches.
Submitted 8 November, 2019;
originally announced November 2019.
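The notions of a local minimum and its basin of attraction used in the abstract above can be illustrated with a small sketch. The abstract does not give the paper's exact definitions or search procedure, so everything here is an assumption made for illustration: the energy convention E(s) = -1/2 sᵀJs - hᵀs with symmetric, zero-diagonal `J`, the use of steepest single-spin-flip descent to assign a state to a minimum, and the estimate of basin size as the fraction of uniformly random starts that descend to that minimum.

```python
from collections import Counter

import numpy as np


def ising_energy(s, J, h):
    """E(s) = -1/2 s^T J s - h^T s, with J symmetric and zero on the diagonal."""
    return -0.5 * (s @ J @ s) - h @ s


def descend(s, J, h, tol=1e-12):
    """Steepest single-spin-flip descent to a local minimum (assumed procedure)."""
    s = s.copy()
    while True:
        # Energy change from flipping spin i: dE_i = 2 * s_i * (sum_j J_ij s_j + h_i).
        dE = 2.0 * s * (J @ s + h)
        i = int(np.argmin(dE))
        if dE[i] >= -tol:      # no flip lowers the energy: local minimum reached
            return s
        s[i] = -s[i]


def estimate_basins(J, h, n_starts=200, seed=0):
    """Map each local minimum found to the fraction of random starts reaching it."""
    rng = np.random.default_rng(seed)
    counts = Counter()
    for _ in range(n_starts):
        s0 = rng.choice([-1.0, 1.0], size=len(h))
        minimum = tuple(int(v) for v in descend(s0, J, h))
        counts[minimum] += 1
    return {m: c / n_starts for m, c in counts.items()}
```

Under this estimate, a minimum with a small basin is rarely reached from random starts, which is one way to picture why a descent-based classical search could miss it.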
-
Towards Sampling from Nondirected Probabilistic Graphical models using a D-Wave Quantum Annealer
Authors:
Yaroslav Koshka,
M. A. Novotny
Abstract:
A D-Wave quantum annealer (QA) having a 2048-qubit lattice, with no missing qubits or couplings, allowed embedding of a complete graph of a Restricted Boltzmann Machine (RBM). The OptDigits handwritten-digit data set, with 8x7 pixels of visible units, was used to train the RBM using classical Contrastive Divergence. Embedding the classically trained RBM into the D-Wave lattice was used to demonstrate that the QA offers a high-efficiency alternative to classical Markov Chain Monte Carlo (MCMC) for reconstructing missing labels of the test images, as well as a generative model. At any training iteration, the D-Wave-based classification had a classification error more than two times lower than that of MCMC. The main goal of this study was to investigate the quality of the sample from the RBM model distribution and to compare it with a classical MCMC sample. For the OptDigits dataset, the states in the D-Wave sample belonged to about two times more local valleys than those in the MCMC sample. All the lowest-energy (highest joint probability) local minima in the MCMC sample were also found by the D-Wave. The D-Wave missed many of the higher-energy local valleys, while finding many "new" local valleys consistently missed by MCMC. It was established that the "new" local valleys that the D-Wave finds are important for the model distribution in terms of the energy of the corresponding local minima, the width of the local valleys, and the height of the escape barrier.
Submitted 30 April, 2019;
originally announced May 2019.
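The abstract above equates the lowest-energy states with those of highest joint probability. For a standard binary RBM this follows from the Boltzmann form p(v, h) ∝ exp(-E(v, h)), which the sketch below checks by brute-force enumeration of a tiny model. The energy convention E(v, h) = -a·v - b·h - vᵀWh with units in {0, 1} is the common textbook one and is an assumption here, as are the names `rbm_energy` and `joint_distribution`; the paper's own parameterization and model size are not given in the abstract.

```python
import itertools

import numpy as np


def rbm_energy(v, h, W, a, b):
    """Textbook binary RBM energy: E(v, h) = -a.v - b.h - v^T W h."""
    return -(a @ v) - (b @ h) - (v @ W @ h)


def joint_distribution(W, a, b):
    """Enumerate every (v, h) state of a tiny RBM with its energy and probability."""
    n_visible, n_hidden = W.shape
    states = [(np.array(v, dtype=float), np.array(h, dtype=float))
              for v in itertools.product([0, 1], repeat=n_visible)
              for h in itertools.product([0, 1], repeat=n_hidden)]
    energies = np.array([rbm_energy(v, h, W, a, b) for v, h in states])
    # Subtract the minimum energy before exponentiating for numerical stability.
    weights = np.exp(-(energies - energies.min()))
    return states, energies, weights / weights.sum()
```

Enumeration is only feasible for a handful of units; for a model of the size in the abstract, the distribution has to be sampled, which is where MCMC or the annealer comes in.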