Large-scale quantum reservoir learning with an analog quantum computer

Authors: Milan Kornjača, Hong-Ye Hu, Chen Zhao, Jonathan Wurtz, Phillip Weinberg, Majd Hamdan, Andrii Zhdanov, Sergio H. Cantu, Hengyun Zhou, Rodrigo Araiza Bravo, Kevin Bagnall, James I. Basham, Joseph Campo, Adam Choukri, Robert DeAngelo, Paige Frederick, David Haines, Julian Hammett, Ning Hsu, Ming-Guang Hu, Florian Huber, Paul Niklas Jepsen, Ningyuan Jia, Thomas Karolyshyn, Minho Kwon, et al. (28 additional authors not shown)

Abstract: Quantum machine learning has gained considerable attention as quantum technology advances, presenting a promising approach for efficiently learning complex data patterns. Despite this promise, most contemporary quantum methods require significant resources for variational parameter optimization and face issues with vanishing gradients, leading to experiments that are either limited in scale or lack potential for quantum advantage. To address this, we develop a general-purpose, gradient-free, and scalable quantum reservoir learning algorithm that harnesses the quantum dynamics of neutral-atom analog quantum computers to process data. We experimentally implement the algorithm, achieving competitive performance across various categories of machine learning tasks, including binary and multi-class classification, as well as timeseries prediction. Effective and improving learning is observed with increasing system sizes of up to 108 qubits, demonstrating the largest quantum machine learning experiment to date. We further observe comparative quantum kernel advantage in learning tasks by constructing synthetic datasets based on the geometric differences between generated quantum and classical data kernels. Our findings demonstrate the potential of utilizing classically intractable quantum correlations for effective machine learning. We expect these results to stimulate further extensions to different quantum hardware and machine learning paradigms, including early fault-tolerant hardware and generative machine learning tasks.

Submitted 2 July, 2024; originally announced July 2024.

Comments: 10 + 14 pages, 4 + 7 figures

arXiv:2310.02971 [pdf, other]

Prompting and Adapter Tuning for Self-supervised Encoder-Decoder Speech Model

Authors: Kai-Wei Chang, Ming-Hsin Chen, Yun-** Lin, **g Neng Hsu, Paul Kuo-Ming Huang, Chien-yu Huang, Shang-Wen Li, Hung-yi Lee

Abstract: Prompting and adapter tuning have emerged as efficient alternatives to fine-tuning (FT) methods. However, existing studies on speech prompting focused on classification tasks and failed on more complex sequence generation tasks. Besides, adapter tuning is primarily applied with a focus on encoder-only self-supervised models. Our experiments show that prompting on Wav2Seq, a self-supervised encoder-decoder model, surpasses previous works in sequence generation tasks. It achieves a remarkable 53% relative improvement in word error rate for ASR and a 27% in F1 score for slot filling. Additionally, prompting competes with the FT method in the low-resource scenario. Moreover, we show the transferability of prompting and adapter tuning on Wav2Seq in cross-lingual ASR. When limited trainable parameters are involved, prompting and adapter tuning consistently outperform conventional FT across 7 languages. Notably, in the low-resource scenario, prompting consistently outperforms adapter tuning.

Submitted 14 November, 2023; v1 submitted 4 October, 2023; originally announced October 2023.

Comments: Accepted to IEEE ASRU 2023

arXiv:2112.03138 [pdf, ps, other]

doi 10.1063/5.0068538

Density dependence of the excitation gaps in an undoped Si/SiGe double-quantum-well heterostructure

Authors: D. Chen, S. Cai, N. -W. Hsu, S. -H. Huang, Y. Chuang, E. Nielsen, J. -Y. Li, C. W. Liu, T. M. Lu, D. Laroche

Abstract: We report low-temperature magneto-transport measurements of an undoped Si/SiGe asymmetric double quantum well heterostructure. The density in both layers is tuned independently utilizing a top and a bottom gate, allowing the investigation of quantum wells at both imbalanced and matched densities. Integer quantum Hall states at total filling factor $ν_{\text{T}} = 1$ and $ν_{\text{T}} = 2$ are observed in both density regimes, and the evolution of their excitation gaps is reported as a function of density. The $ν_{\text{T}} = 1$ gap evolution departs from the behavior generally observed for valley splitting in the single layer regime. Furthermore, by comparing the $ν_{\text{T}} = 2$ gap to the single particle tunneling energy, $Δ_{\text{SAS}}$, obtained from Schrödinger-Poisson (SP) simulations, evidence for the onset of spontaneous inter-layer coherence (SIC) is observed for a relative filling fraction imbalance smaller than ${\sim}50\%$.

Submitted 6 December, 2021; originally announced December 2021.

Comments: 6 pages, 3 figures, accepted in APL

Journal ref: Appl. Phys. Lett. 119, 223103 (2021)

arXiv:1505.02016 [pdf, other]

doi 10.1103/PhysRevE.92.062925

Power-law ansatz in complex systems: excessive loss of information

Authors: Sun-Ting Tsai, Chin-De Chang, Ching-Hao Chang, Meng-Xue Tsai, Nan-Jung Hsu, Tzay-Ming Hong

Abstract: The ubiquity of power-law relations in empirical data displays physicists' love of simple laws and uncovering common causes among seemingly unrelated phenomena. However, many reported power laws lack statistical support and mechanistic backings, not to mention discrepancies with real data are often explained away as corrections due to finite size or other variables. We propose a simple experiment and rigorous statistical procedures to look into these issues. Making use of the fact that the occurrence rate and pulse intensity of crumple sound obey power law with an exponent that varies with material, we simulate a complex system with two driving mechanisms by crumpling two different sheets together. The probability function of crumple sound is found to transit from two power-law terms to a {\it bona fide} power law as compaction increases. In addition to showing the vicinity of these two distributions in the phase space, this observation nicely demonstrates the effect of interactions to bring about a subtle change in macroscopic behavior and more information may be retrieved if the data are subject to sorting. Our analyses are based on the Akaike information criterion that is a direct measurement of information loss and emphasizes the need to strike a balance between model simplicity and goodness of fit. As a show of force, the Akaike information criterion also found the Gutenberg-Richter law for earthquakes and the scale-free model for brain functional network, 2-dimensional sand pile, and solar flare intensity to suffer excessive loss of information. They resemble more the crumpled-together ball at low compactions in that there appear to be two driving mechanisms that take turns occurring.

Submitted 4 January, 2016; v1 submitted 8 May, 2015; originally announced May 2015.

Comments: 10 pages, 7 figures, accepted to Phys. Rev. E

Journal ref: Phys. Rev. E 92, 062925 (2015)

arXiv:1310.0043 [pdf, other]

Gels under stress: the origins of delayed collapse

Authors: Lisa J. Teece, James M. Hart, Kerry Yen Ni Hsu, Stephen Gilligan, Malcolm A. Faers, Paul Bartlett

Abstract: Attractive colloidal particles can form a disordered elastic solid or gel when quenched into a two-phase region, if the volume fraction is sufficiently large. When the interactions are comparable to thermal energies the stress-bearing network within the gel restructures over time as individual particle bonds break and reform. Typically, under gravity such weak gels show a prolonged period of either no or very slow settling, followed by a sudden and rapid collapse - a phenomenon known as delayed collapse. The link between local bond breaking events and the macroscopic process of delayed collapse is not well understood. Here we summarize the main features of delayed collapse and discuss the microscopic processes which cause it. We present a plausible model which connects the kinetics of bond breaking to gel collapse and test the model by exploring the effect of an applied external force on the stability of a gel.

Submitted 12 March, 2014; v1 submitted 30 September, 2013; originally announced October 2013.

Comments: Accepted version: 10 pages, 7 figures

arXiv:math/0702762 [pdf, ps, other]

doi 10.1214/074921706000000923

Pile-up probabilities for the Laplace likelihood estimator of a non-invertible first order moving average

Authors: F. Jay Breidt, Richard A. Davis, Nan-Jung Hsu, Murray Rosenblatt

Abstract: The first-order moving average model or MA(1) is given by $X_t=Z_t-θ_0Z_{t-1}$, with independent and identically distributed $\{Z_t\}$. This is arguably the simplest time series model that one can write down. The MA(1) with unit root ($θ_0=1$) arises naturally in a variety of time series applications. For example, if an underlying time series consists of a linear trend plus white noise errors, then the differenced series is an MA(1) with unit root. In such cases, testing for a unit root of the differenced series is equivalent to testing the adequacy of the trend plus noise model. The unit root problem also arises naturally in a signal plus noise model in which the signal is modeled as a random walk. The differenced series follows a MA(1) model and has a unit root if and only if the random walk signal is in fact a constant. The asymptotic theory of various estimators based on Gaussian likelihood has been developed for the unit root case and nearly unit root case ($θ=1+β/n,β\le0$). Unlike standard $1/\sqrt{n}$-asymptotics, these estimation procedures have $1/n$-asymptotics and a so-called pile-up effect, in which P$(\hatθ=1)$ converges to a positive value. One explanation for this pile-up phenomenon is the lack of identifiability of $θ$ in the Gaussian case. That is, the Gaussian likelihood has the same value for the two sets of parameter values $(θ,σ^2)$ and $(1/θ,θ^2σ^2)$. It follows that $θ=1$ is always a critical point of the likelihood function. In contrast, for non-Gaussian noise, $θ$ is identifiable for all real values. Hence it is no longer clear whether or not the same pile-up phenomenon will persist in the non-Gaussian case. In this paper, we focus on limiting pile-up probabilities for estimates of $θ_0$ based on a Laplace likelihood. In some cases, these estimates can be viewed as Least Absolute Deviation (LAD) estimates. Simulation results illustrate the limit theory.

Submitted 26 February, 2007; originally announced February 2007.

Comments: Published at http://dx.doi.org/10.1214/074921706000000923 in the IMS Lecture Notes Monograph Series (http://www.imstat.org/publications/lecnotes.htm) by the Institute of Mathematical Statistics (http://www.imstat.org)

Report number: IMS-LNMS52-LNMS5201 MSC Class: 62M10 (Primary) 60F05 (Secondary)

Journal ref: IMS Lecture Notes Monograph Series 2006, Vol. 52, 1-19

Showing 1–6 of 6 results for author: Hsu, N