-
Fully autonomous tuning of a spin qubit
Authors:
Jonas Schuff,
Miguel J. Carballido,
Madeleine Kotzagiannidis,
Juan Carlos Calvo,
Marco Caselli,
Jacob Rawling,
David L. Craig,
Barnaby van Straaten,
Brandon Severin,
Federico Fedele,
Simon Svab,
Pierre Chevalier Kwon,
Rafael S. Eggli,
Taras Patlatiuk,
Nathan Korda,
Dominik Zumbühl,
Natalia Ares
Abstract:
Spanning over two decades, the study of qubits in semiconductors for quantum computing has yielded significant breakthroughs. However, the development of large-scale semiconductor quantum circuits is still limited by challenges in efficiently tuning and operating these circuits. Identifying optimal operating conditions for these qubits is complex, involving the exploration of vast parameter spaces. This presents a real 'needle in the haystack' problem, which, until now, has resisted complete automation due to device variability and fabrication imperfections. In this study, we present the first fully autonomous tuning of a semiconductor qubit, from a grounded device to Rabi oscillations, a clear indication of successful qubit operation. We demonstrate this automation, achieved without human intervention, in a Ge/Si core/shell nanowire device. Our approach integrates deep learning, Bayesian optimization, and computer vision techniques. We expect this automation algorithm to apply to a wide range of semiconductor qubit devices, allowing for statistical studies of qubit quality metrics. As a demonstration of the potential of full automation, we characterise how the Rabi frequency and g-factor depend on barrier gate voltages for one of the qubits found by the algorithm. Twenty years after the initial demonstrations of spin qubit operation, this significant advancement is poised to finally catalyze the operation of large, previously unexplored quantum circuits.
Submitted 6 February, 2024;
originally announced February 2024.
-
The Automated Bias Triangle Feature Extraction Framework
Authors:
Madeleine Kotzagiannidis,
Jonas Schuff,
Nathan Korda
Abstract:
Bias triangles represent features in stability diagrams of Quantum Dot (QD) devices, whose occurrence and property analysis are crucial indicators for spin physics. Nevertheless, challenges associated with quality and availability of data as well as the subtlety of physical phenomena of interest have hindered an automatic and bespoke analysis framework, often still relying (in part) on human labelling and verification. We introduce a feature extraction framework for bias triangles, built from unsupervised, segmentation-based computer vision methods, which facilitates the direct identification and quantification of physical properties of the former. Thereby, the need for human input or large training datasets to inform supervised learning approaches is circumvented, while additionally enabling the automation of pixelwise shape and feature labeling. In particular, we demonstrate that Pauli Spin Blockade (PSB) detection can be conducted effectively, efficiently and without any training data as a direct result of this approach.
Submitted 5 December, 2023;
originally announced December 2023.
-
Training of Quantum Circuits on a Hybrid Quantum Computer
Authors:
D. Zhu,
N. M. Linke,
M. Benedetti,
K. A. Landsman,
N. H. Nguyen,
C. H. Alderete,
A. Perdomo-Ortiz,
N. Korda,
A. Garfoot,
C. Brecque,
L. Egan,
O. Perdomo,
C. Monroe
Abstract:
Generative modeling is a flavor of machine learning with applications ranging from computer vision to chemical design. It is expected to be one of the techniques most suited to take advantage of the additional resources provided by near-term quantum computers. We implement a data-driven quantum circuit training algorithm on the canonical Bars-and-Stripes data set using a quantum-classical hybrid machine. The training proceeds by running parameterized circuits on a trapped ion quantum computer, and feeding the results to a classical optimizer. We apply two separate strategies, Particle Swarm and Bayesian optimization, to this task. We show that the convergence of the quantum circuit to the target distribution depends critically on both the quantum hardware and the classical optimization strategy. Our study represents the first successful training of a high-dimensional universal quantum circuit, and highlights the promise and challenges associated with hybrid learning schemes.
Submitted 31 October, 2019; v1 submitted 20 December, 2018;
originally announced December 2018.
-
Distributed Clustering of Linear Bandits in Peer to Peer Networks
Authors:
Nathan Korda,
Balazs Szorenyi,
Shuai Li
Abstract:
We provide two distributed confidence ball algorithms for solving linear bandit problems in peer to peer networks with limited communication capabilities. For the first, we assume that all the peers are solving the same linear bandit problem, and prove that our algorithm achieves the optimal asymptotic regret rate of any centralised algorithm that can instantly communicate information between the peers. For the second, we assume that there are clusters of peers solving the same bandit problem within each cluster, and we prove that our algorithm discovers these clusters, while achieving the optimal asymptotic regret rate within each one. Through experiments on several real-world datasets, we demonstrate the performance of proposed algorithms compared to the state-of-the-art.
Submitted 7 June, 2016; v1 submitted 26 April, 2016;
originally announced April 2016.
-
On TD(0) with function approximation: Concentration bounds and a centered variant with exponential convergence
Authors:
Nathaniel Korda,
L. A. Prashanth
Abstract:
We provide non-asymptotic bounds for the well-known temporal difference learning algorithm TD(0) with linear function approximators. These include high-probability bounds as well as bounds in expectation. Our analysis suggests that a step-size inversely proportional to the number of iterations cannot guarantee optimal rate of convergence unless we assume (partial) knowledge of the stationary distribution for the Markov chain underlying the policy considered. We also provide bounds for the iterate averaged TD(0) variant, which gets rid of the step-size dependency while exhibiting the optimal rate of convergence. Furthermore, we propose a variant of TD(0) with linear approximators that incorporates a centering sequence, and establish that it exhibits an exponential rate of convergence in expectation. We demonstrate the usefulness of our bounds on two synthetic experimental settings.
Submitted 1 September, 2015; v1 submitted 12 November, 2014;
originally announced November 2014.
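For readers unfamiliar with the algorithm named in the entry above, TD(0) with linear function approximation and its iterate-averaged variant can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function name `td0_linear`, the step-size schedule `1 / (n + 1) ** 0.75`, and the two-state toy chain are all assumptions made for the example.

```python
import random


def td0_linear(features, rewards, transition, gamma, n_steps, seed=0):
    """TD(0) with linear value estimates V(s) = theta . phi(s).

    Returns the last iterate and the running (Polyak) average of iterates.
    All names and the step-size schedule are illustrative assumptions.
    """
    rng = random.Random(seed)
    n_states = len(features)
    dim = len(features[0])
    theta = [0.0] * dim
    theta_avg = [0.0] * dim
    state = 0
    for n in range(n_steps):
        # Sample the next state from the current row of the transition matrix.
        next_state = rng.choices(range(n_states), weights=transition[state])[0]
        phi, phi_next = features[state], features[next_state]
        value = sum(t * p for t, p in zip(theta, phi))
        value_next = sum(t * p for t, p in zip(theta, phi_next))
        # Temporal-difference error for the observed transition.
        delta = rewards[state] + gamma * value_next - value
        step = 1.0 / (n + 1) ** 0.75  # assumed schedule, decays slower than 1/n
        theta = [t + step * delta * p for t, p in zip(theta, phi)]
        # Incremental mean of all iterates seen so far.
        theta_avg = [a + (t - a) / (n + 1) for a, t in zip(theta_avg, theta)]
        state = next_state
    return theta, theta_avg


if __name__ == "__main__":
    # Toy chain: both states move to either state with probability 1/2.
    last, averaged = td0_linear(
        features=[[1.0, 0.0], [0.0, 1.0]],
        rewards=[1.0, 0.0],
        transition=[[0.5, 0.5], [0.5, 0.5]],
        gamma=0.5,
        n_steps=20000,
    )
    print(last, averaged)
```

On this toy chain the exact value function solves V = r + gamma P V, which gives V = (1.5, 0.5), so the averaged iterate can be checked against a known answer.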
-
Finite-Time Analysis of Kernelised Contextual Bandits
Authors:
Michal Valko,
Nathaniel Korda,
Remi Munos,
Ilias Flaounas,
Nelo Cristianini
Abstract:
We tackle the problem of online reward maximisation over a large finite set of actions described by their contexts. We focus on the case when the number of actions is too big to sample all of them even once. However, we assume that we have access to the similarities between actions' contexts and that the expected reward is an arbitrary linear function of the contexts' images in the related reproducing kernel Hilbert space (RKHS). We propose KernelUCB, a kernelised UCB algorithm, and give a cumulative regret bound through a frequentist analysis. For contextual bandits, the related algorithm GP-UCB turns out to be a special case of our algorithm, and our finite-time analysis improves the regret bound of GP-UCB for the agnostic case, both in terms of the kernel-dependent quantity and the RKHS norm of the reward function. Moreover, for the linear kernel, our regret bound matches the lower bound for contextual linear bandits.
Submitted 26 September, 2013;
originally announced September 2013.
-
Thompson Sampling for 1-Dimensional Exponential Family Bandits
Authors:
Nathaniel Korda,
Emilie Kaufmann,
Remi Munos
Abstract:
Thompson Sampling has been demonstrated in many complex bandit models; however, the theoretical guarantees available for the parametric multi-armed bandit are still limited to the Bernoulli case. Here we extend them by proving asymptotic optimality of the algorithm using the Jeffreys prior for 1-dimensional exponential family bandits. Our proof builds on previous work, but also makes extensive use of closed forms for Kullback-Leibler divergence and Fisher information (and thus Jeffreys prior) available in an exponential family. This allows us to give a finite time exponential concentration inequality for posterior distributions on exponential families that may be of interest in its own right. Moreover, our analysis covers some distributions for which no optimistic algorithm has yet been proposed, including heavy-tailed exponential families.
Submitted 12 July, 2013;
originally announced July 2013.
-
Fast gradient descent for drifting least squares regression, with application to bandits
Authors:
Nathaniel Korda,
Prashanth L. A.,
Rémi Munos
Abstract:
Online learning algorithms often need to recompute least squares regression estimates of parameters. We study improving the computational complexity of such algorithms by using stochastic gradient descent (SGD) type schemes in place of classic regression solvers. We show that SGD schemes efficiently track the true solutions of the regression problems, even in the presence of a drift. This finding, coupled with an $O(d)$ improvement in complexity, where $d$ is the dimension of the data, makes them attractive for implementation in big data settings. In the case when strong convexity in the regression problem is guaranteed, we provide bounds on the error both in expectation and high probability (the latter is often needed to provide theoretical guarantees for higher level algorithms), despite the drifting least squares solution. As an example of this case we prove that the regret performance of an SGD version of the PEGE linear bandit algorithm [Rusmevichientong and Tsitsiklis 2010] is worse than that of PEGE itself only by a factor of $O(\log^4 n)$. When strong convexity of the regression problem cannot be guaranteed, we investigate using an adaptive regularisation. We make an empirical study of an adaptively regularised, SGD version of LinUCB [Li et al. 2010] in a news article recommendation application, which uses the large scale news recommendation dataset from Yahoo! front page. These experiments show a large gain in computational complexity, with a consistently low tracking error and click-through-rate (CTR) performance that is $75\%$ close.
Submitted 20 November, 2014; v1 submitted 11 July, 2013;
originally announced July 2013.
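The idea in the entry above, replacing a full least squares solve with one cheap SGD step per new observation, can be illustrated with a short sketch. It is a simplified stand-in rather than the paper's exact scheme: the function name `sgd_least_squares`, the uniform resampling from the stored history, and the step size `c / sqrt(n + 1)` are assumptions chosen for the example. Each update costs $O(d)$ operations, whereas re-solving the normal equations costs more per step.

```python
import random


def sgd_least_squares(stream, dim, c=0.5, seed=0):
    """Track a least squares solution with one O(d) SGD step per data point.

    `stream` yields (x, y) pairs with x a list of length `dim`.
    Sampling scheme and step sizes are illustrative assumptions.
    """
    rng = random.Random(seed)
    theta = [0.0] * dim
    history = []
    for n, (x, y) in enumerate(stream):
        history.append((x, y))
        # Pick one stored sample uniformly at random from the data seen so far.
        xi, yi = history[rng.randrange(len(history))]
        residual = yi - sum(t * v for t, v in zip(theta, xi))
        step = c / (n + 1) ** 0.5  # assumed decaying step size
        # Gradient step on the squared error of the sampled point.
        theta = [t + step * residual * v for t, v in zip(theta, xi)]
    return theta


if __name__ == "__main__":
    data_rng = random.Random(1)
    true_theta = [1.0, -2.0]
    data = []
    for _ in range(5000):
        x = [data_rng.uniform(-1.0, 1.0), data_rng.uniform(-1.0, 1.0)]
        noise = data_rng.gauss(0.0, 0.1)
        y = sum(t * v for t, v in zip(true_theta, x)) + noise
        data.append((x, y))
    print(sgd_least_squares(data, dim=2))
```

In a bandit setting such as LinUCB, the estimate returned here would stand in for the regression solution that is otherwise recomputed at every round.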
-
Concentration bounds for temporal difference learning with linear function approximation: The case of batch data and uniform sampling
Authors:
L. A. Prashanth,
Nathaniel Korda,
Rémi Munos
Abstract:
We propose a stochastic approximation (SA) based method with randomization of samples for policy evaluation using the least squares temporal difference (LSTD) algorithm. Our proposed scheme is equivalent to running regular temporal difference learning with linear function approximation, albeit with samples picked uniformly from a given dataset. Our method results in an $O(d)$ improvement in complexity in comparison to LSTD, where $d$ is the dimension of the data. We provide non-asymptotic bounds for our proposed method, both in high probability and in expectation, under the assumption that the matrix underlying the LSTD solution is positive definite. The latter assumption can be easily satisfied for the pathwise LSTD variant proposed in [23]. Moreover, we also establish that using our method in place of LSTD does not impact the rate of convergence of the approximate value function to the true value function. These rate results coupled with the low computational complexity of our method make it attractive for implementation in big data settings, where $d$ is large. A similar low-complexity alternative for least squares regression is well-known as the stochastic gradient descent (SGD) algorithm. We provide finite-time bounds for SGD. We demonstrate the practicality of our method as an efficient alternative for pathwise LSTD empirically by combining it with the least squares policy iteration (LSPI) algorithm in a traffic signal control application. We also conduct another set of experiments that combines the SA based low-complexity variant for least squares regression with the LinUCB algorithm for contextual bandits, using the large scale news recommendation dataset from Yahoo.
Submitted 24 January, 2020; v1 submitted 11 June, 2013;
originally announced June 2013.
-
Thompson Sampling: An Asymptotically Optimal Finite Time Analysis
Authors:
Emilie Kaufmann,
Nathaniel Korda,
Rémi Munos
Abstract:
The question of the optimality of Thompson Sampling for solving the stochastic multi-armed bandit problem had been open since 1933. In this paper we answer it positively for the case of Bernoulli rewards by providing the first finite-time analysis that matches the asymptotic rate given in the Lai and Robbins lower bound for the cumulative regret. The proof is accompanied by a numerical comparison with other optimal policies, experiments that have been lacking in the literature until now for the Bernoulli case.
Submitted 19 July, 2012; v1 submitted 18 May, 2012;
originally announced May 2012.
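As a brief illustration of the algorithm analysed in the entry above, Thompson Sampling for Bernoulli rewards can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the authors' experimental code: it assumes a uniform Beta(1, 1) prior on each arm, and the function name `thompson_sampling_bernoulli` and the simulated arm means are hypothetical.

```python
import random


def thompson_sampling_bernoulli(true_means, horizon, seed=0):
    """Simulate Thompson Sampling on a Bernoulli bandit.

    Assumes a Beta(1, 1) prior per arm. Returns the number of pulls
    per arm and the cumulative (pseudo-)regret.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    successes = [0] * n_arms
    failures = [0] * n_arms
    best_mean = max(true_means)
    regret = 0.0
    for _ in range(horizon):
        # Draw one sample from each arm's Beta posterior.
        samples = [
            rng.betavariate(successes[a] + 1, failures[a] + 1)
            for a in range(n_arms)
        ]
        # Play the arm whose posterior sample is largest.
        arm = max(range(n_arms), key=samples.__getitem__)
        reward = 1 if rng.random() < true_means[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        regret += best_mean - true_means[arm]
    pulls = [s + f for s, f in zip(successes, failures)]
    return pulls, regret


if __name__ == "__main__":
    pulls, regret = thompson_sampling_bernoulli([0.2, 0.8], horizon=2000)
    print(pulls, regret)
```

The regret tracked here is the expected shortfall relative to always playing the best arm, which is the quantity bounded by the Lai and Robbins lower bound mentioned in the abstract.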