
Efficient Quantum Circuits for Machine Learning Activation Functions including Constant T-depth ReLU

Authors: Wei Zi, Siyi Wang, Hyunji Kim, Xiaoming Sun, Anupam Chattopadhyay, Patrick Rebentrost

Abstract: In recent years, Quantum Machine Learning (QML) has increasingly captured the interest of researchers. Among the components in this domain, activation functions hold a fundamental and indispensable role. Our research focuses on developing activation-function quantum circuits for integration into fault-tolerant quantum computing architectures, with an emphasis on minimizing $T$-depth. Specifically, we present novel implementations of ReLU and leaky ReLU activation functions, achieving constant $T$-depths of 4 and 8, respectively. Leveraging quantum lookup tables, we extend our exploration to other activation functions such as the sigmoid. This approach enables us to customize precision and $T$-depth by adjusting the number of qubits, making our results more adaptable to various application scenarios. This study represents a significant advancement towards enhancing the practicality and application of quantum machine learning.

Submitted 9 April, 2024; originally announced April 2024.

Comments: 13 pages

arXiv:2404.06052

Shallow Quantum Circuit Implementation of Symmetric Functions with Limited Ancillary Qubits

Authors: Wei Zi, Junhong Nie, Xiaoming Sun

Abstract: In quantum computation, optimizing depth and number of ancillary qubits in quantum circuits is crucial due to constraints imposed by current quantum devices. This paper presents an innovative approach to implementing arbitrary symmetric Boolean functions using poly-logarithmic depth quantum circuits with logarithmic number of ancillary qubits. Symmetric functions are those whose outputs rely solely on the Hamming weight of the inputs. These functions find applications across diverse domains, including quantum machine learning, arithmetic circuit synthesis, and quantum algorithm design (e.g., Grover's algorithm). Moreover, by fully leveraging the potential of qutrits (an additional energy level), the ancilla count can be further reduced to 1. The key technique involves a novel poly-logarithmic depth quantum circuit designed to compute Hamming weight without the need for ancillary qubits. The quantum circuit for Hamming weight is of independent interest because of its broad applications, such as quantum memory and quantum machine learning.

Submitted 9 April, 2024; originally announced April 2024.

Comments: 12 pages
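
The abstract above defines a symmetric Boolean function as one whose output depends only on the Hamming weight of its input. A minimal classical sketch of that definition in Python follows. It is not the paper's quantum circuit, and the helper names `hamming_weight`, `make_symmetric`, `majority3`, and `parity3` are illustrative assumptions.

```python
def hamming_weight(bits):
    """Number of 1s in a sequence of 0/1 values."""
    return sum(bits)


def make_symmetric(table):
    """Build a symmetric Boolean function on n = len(table) - 1 bits.

    table[w] is the output for every input of Hamming weight w, so the
    function is fully described by n + 1 values instead of 2**n.
    """
    n = len(table) - 1

    def f(bits):
        if len(bits) != n:
            raise ValueError(f"expected {n} bits, got {len(bits)}")
        return table[hamming_weight(bits)]

    return f


# Two familiar symmetric functions on 3 bits (hypothetical examples).
majority3 = make_symmetric([0, 0, 1, 1])  # 1 when at least two bits are set
parity3 = make_symmetric([0, 1, 0, 1])    # 1 when an odd number of bits are set
```

Because the output is read from a table indexed by weight, any reordering of the input bits gives the same result, which is the defining property of a symmetric function.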

arXiv:2402.05053

Quantum circuit for multi-qubit Toffoli gate with optimal resource

Authors: Junhong Nie, Wei Zi, Xiaoming Sun

Abstract: Resource consumption is an important issue in quantum information processing, particularly during the present NISQ era. In this paper, we investigate resource optimization of implementing multiple controlled operations, which are fundamental building blocks in the field of quantum computing and quantum simulation. We design new quantum circuits for the $n$-Toffoli gate and general multi-controlled unitary, which have only $O(\log n)$-depth and $O(n)$-size, and only require $1$ ancillary qubit. To achieve these results, we explore the potential of ancillary qubits and discover a method to create new conditional clean qubits from existing ancillary qubits. These techniques can also be utilized to construct an efficient quantum circuit for incrementor, leading to an implementation of multi-qubit Toffoli gate with a depth of $O(\log^2 n)$ and size of $O(n)$ without any ancillary qubits. Furthermore, we explore the power of ancillary qubits from the perspective of resource theory. We demonstrate that without the assistance of ancillary qubit, any quantum circuit implementation of multi-qubit Toffoli gate must employ exponential precision gates. This finding indicates a significant disparity in computational power of quantum circuits between using and not using ancillary qubits. Additionally, we discuss the comparison of the power of ancillary qubits and extra energy levels in quantum circuit design.

Submitted 7 February, 2024; originally announced February 2024.

Comments: 14 pages, 5 figures

arXiv:2303.12979

Optimal Synthesis of Multi-Controlled Qudit Gates

Authors: Wei Zi, Qian Li, Xiaoming Sun

Abstract: We propose a linear-size synthesis of the multi-controlled Toffoli gate on qudits with at most one borrowed ancilla. This one ancilla can even be saved when the qudit dimension is odd. Our synthesis leads to improvements in various quantum algorithms implemented on qudits. In particular, we obtain (i) a linear-size and one-clean-ancilla synthesis of multi-controlled qudit gates; (ii) an optimal-size and one-clean-ancilla synthesis of unitaries on qudits; (iii) a near-optimal-size and ancilla-free/one-borrowed-ancilla implementation of classical reversible functions as qudit gates.

Submitted 22 March, 2023; originally announced March 2023.

Comments: Accepted by DAC 2023

arXiv:2211.05187

Training a Vision Transformer from scratch in less than 24 hours with 1 GPU

Authors: Saghar Irandoust, Thibaut Durand, Yunduz Rakhmangulova, Wenjie Zi, Hossein Hajimirsadeghi

Abstract: Transformers have become central to recent advances in computer vision. However, training a vision Transformer (ViT) model from scratch can be resource intensive and time consuming. In this paper, we aim to explore approaches to reduce the training costs of ViT models. We introduce some algorithmic improvements to enable training a ViT model from scratch with limited hardware (1 GPU) and time (24 hours) resources. First, we propose an efficient approach to add locality to the ViT architecture. Second, we develop a new image size curriculum learning strategy, which reduces the number of patches extracted from each image at the beginning of the training. Finally, we propose a new variant of the popular ImageNet1k benchmark by adding hardware and time constraints. We evaluate our contributions on this benchmark, and show they can significantly improve performance given the proposed training budget. We will share the code at https://github.com/BorealisAI/efficient-vit-training.

Submitted 9 November, 2022; originally announced November 2022.

Comments: 7 pages, 2 figures, 1 table, published in "Has it Trained Yet? Workshop at the Conference on Neural Information Processing Systems (NeurIPS 2022)"

ACM Class: I.2.10

arXiv:2106.04559

Turing: an Accurate and Interpretable Multi-Hypothesis Cross-Domain Natural Language Database Interface

Authors: Peng Xu, Wenjie Zi, Hamidreza Shahidi, Ákos Kádár, Keyi Tang, Wei Yang, Jawad Ateeq, Harsh Barot, Meidan Alon, Yanshuai Cao

Abstract: A natural language database interface (NLDB) can democratize data-driven insights for non-technical users. However, existing Text-to-SQL semantic parsers cannot achieve high enough accuracy in the cross-database setting to allow good usability in practice. This work presents Turing, an NLDB system toward bridging this gap. The cross-domain semantic parser of Turing with our novel value prediction method achieves $75.1\%$ execution accuracy, and $78.3\%$ top-5 beam execution accuracy on the Spider validation set. To benefit from the higher beam accuracy, we design an interactive system where the SQL hypotheses in the beam are explained step-by-step in natural language, with their differences highlighted. The user can then compare and judge the hypotheses to select which one reflects their intention, if any. The English explanations of SQL queries in Turing are produced by our high-precision natural language generation system based on synchronous grammars.

Submitted 8 June, 2021; originally announced June 2021.

Comments: ACL 2021 demonstration track

arXiv:2101.05430

Efficient quantum circuit synthesis for SAT-oracle with limited ancillary qubit

Authors: Shuai Yang, Wei Zi, Bujiao Wu, Cheng Guo, Jialin Zhang, Xiaoming Sun

Abstract: How to implement quantum oracle with limited resources raises concerns these days. We design two ancilla-adjustable and efficient algorithms to synthesize SAT-oracle, the key component in solving SAT problems. The previous work takes $2m-1$ ancillary qubits and $O(m)$ elementary gates to synthesize an $m$-clause oracle. The first algorithm reduces the number of ancillary qubits to $2\sqrt{m}$, with at most an eightfold increase in circuit size. The number of ancillary qubits can be further reduced to 3 with a quadratic increase in circuit size. The second algorithm aims to reduce the circuit depth. By leveraging the second algorithm, the circuit depth can be reduced to $O(\log m)$ with $m$ ancillary qubits.

Submitted 9 June, 2022; v1 submitted 13 January, 2021; originally announced January 2021.

Comments: 4 pages, 2 figures, 1 table with Supplementary

ACM Class: B.7.1; F.4.1

arXiv:2012.15355

Optimizing Deeper Transformers on Small Datasets

Authors: Peng Xu, Dhruv Kumar, Wei Yang, Wenjie Zi, Keyi Tang, Chenyang Huang, Jackie Chi Kit Cheung, Simon J. D. Prince, Yanshuai Cao

Abstract: It is a common belief that training deep transformers from scratch requires large datasets. Consequently, for small datasets, people usually use shallow and simple additional layers on top of pre-trained models during fine-tuning. This work shows that this does not always need to be the case: with proper initialization and optimization, the benefits of very deep transformers can carry over to challenging tasks with small datasets, including Text-to-SQL semantic parsing and logical reading comprehension. In particular, we successfully train $48$ layers of transformers, comprising $24$ fine-tuned layers from pre-trained RoBERTa and $24$ relation-aware layers trained from scratch. With fewer training steps and no task-specific pre-training, we obtain the state-of-the-art performance on the challenging cross-domain Text-to-SQL parsing benchmark Spider. We achieve this by deriving a novel Data-dependent Transformer Fixed-update initialization scheme (DT-Fixup), inspired by the prior T-Fixup work. Further error analysis shows that increasing depth can help improve generalization on small datasets for hard cases that require reasoning and structural understanding.

Submitted 31 May, 2021; v1 submitted 30 December, 2020; originally announced December 2020.

Comments: Accepted at ACL 2021 main conference

arXiv:1907.05083

Cake Cutting on Graphs: A Discrete and Bounded Proportional Protocol

Authors: Xiaohui Bei, Xiaoming Sun, Hao Wu, Jialin Zhang, Zhijie Zhang, Wei Zi

Abstract: The classical cake cutting problem studies how to find fair allocations of a heterogeneous and divisible resource among multiple agents. Two of the most commonly studied fairness concepts in cake cutting are proportionality and envy-freeness. It is well known that a proportional allocation among $n$ agents can be found efficiently via simple protocols [16]. For envy-freeness, in a recent breakthrough, Aziz and Mackenzie [5] proposed a discrete and bounded envy-free protocol for any number of players. However, the protocol suffers from high multiple-exponential query complexity and it remains open to find simpler and more efficient envy-free protocols. In this paper we consider a variation of the cake cutting problem by assuming an underlying graph over the agents whose edges describe their acquaintance relationships, and agents evaluate their shares relatively to those of their neighbors. An allocation is called locally proportional if each agent thinks she receives at least the average value over her neighbors. Local proportionality generalizes proportionality and is in an interesting middle ground between proportionality and envy-freeness: its existence is guaranteed by that of an envy-free allocation, but no simple protocol is known to produce such a locally proportional allocation for general graphs. Previous works showed locally proportional protocols for special classes of graphs, and it is listed in both [1] and [8] as an open question to design simple locally proportional protocols for more general classes of graphs.
In this paper we completely resolved this open question by presenting a discrete and bounded locally proportional protocol for any given graphs. Our protocol has a query complexity of only single exponential, which is significantly smaller than the six towers of $n$ query complexity of the envy-free protocol given in [5].

Submitted 11 July, 2019; originally announced July 2019.
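
The abstract above calls an allocation locally proportional when each agent values her own piece at least as much as the average of her neighbors' pieces. The Python sketch below checks that condition for a finished allocation. It is not the paper's protocol. The function name `is_locally_proportional`, the `values` matrix layout, and the choice to average over neighbors only (excluding the agent herself) are assumptions made for illustration.

```python
def is_locally_proportional(values, neighbors):
    """Check local proportionality of a fixed allocation.

    values[i][j]  -- agent i's value for the piece given to agent j
    neighbors[i]  -- list of agents adjacent to i in the graph
                     (assumed not to contain i itself)
    """
    for i, nbrs in neighbors.items():
        if not nbrs:
            continue  # an isolated agent has nobody to compare against
        neighbor_total = sum(values[i][j] for j in nbrs)
        # own >= average  <=>  own * |N(i)| >= total; avoids division
        if values[i][i] * len(nbrs) < neighbor_total:
            return False
    return True


# Hypothetical example: three agents on a path 0 - 1 - 2,
# with values expressed as integer percentages of the whole cake.
path = {0: [1], 1: [0, 2], 2: [1]}
fair = [[50, 25, 25],
        [33, 34, 33],
        [25, 25, 50]]
unfair = [[25, 50, 25],   # agent 0 prefers her neighbor's piece
          [33, 34, 33],
          [25, 25, 50]]

print(is_locally_proportional(fair, path))
print(is_locally_proportional(unfair, path))
```

Multiplying the agent's own value by the neighbor count instead of dividing the total keeps the comparison exact for integer or rational inputs.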

Showing 1–9 of 9 results for author: Zi, W