Showing 1–50 of 107 results for author: Leng, J

  1. arXiv:2406.09198  [pdf, other]

    cs.CV

    CLIP-Driven Cloth-Agnostic Feature Learning for Cloth-Changing Person Re-Identification

    Authors: Shuang Li, Jiaxu Leng, Guozhang Li, Ji Gan, Haosheng Chen, Xinbo Gao

    Abstract: Contrastive Language-Image Pre-Training (CLIP) has shown impressive performance in short-term Person Re-Identification (ReID) due to its ability to extract high-level semantic features of pedestrians, yet its direct application to Cloth-Changing Person Re-Identification (CC-ReID) faces challenges due to CLIP's image encoder overly focusing on clothes clues. To address this, we propose a novel fram…

    Submitted 13 June, 2024; originally announced June 2024.

  2. arXiv:2404.14691  [pdf, other]

    cs.DC

    Towards Fast Setup and High Throughput of GPU Serverless Computing

    Authors: Han Zhao, Weihao Cui, Quan Chen, Shulai Zhang, Zijun Li, Jingwen Leng, Chao Li, Deze Zeng, Minyi Guo

    Abstract: Integrating GPUs into serverless computing platforms is crucial for improving efficiency. However, existing solutions for GPU-enabled serverless computing platforms face two significant problems due to coarse-grained GPU management: long setup time and low function throughput. To address these issues, we propose SAGE, a GPU serverless framework with fast setup and high throughput. First, based o…

    Submitted 22 April, 2024; originally announced April 2024.

  3. arXiv:2404.11852  [pdf, other]

    cs.AR cs.GR

    Cicero: Addressing Algorithmic and Architectural Bottlenecks in Neural Rendering by Radiance Warping and Memory Optimizations

    Authors: Yu Feng, Zihan Liu, Jingwen Leng, Minyi Guo, Yuhao Zhu

    Abstract: Neural Radiance Field (NeRF) is widely seen as an alternative to traditional physically-based rendering. However, NeRF has not yet seen its adoption in resource-limited mobile systems such as Virtual and Augmented Reality (VR/AR), because it is simply extremely slow. On a mobile Volta GPU, even the state-of-the-art NeRF models generally execute only at 0.8 FPS. We show that the main performance bo…

    Submitted 17 April, 2024; originally announced April 2024.

  4. arXiv:2404.07773  [pdf, other]

    cs.CV

    ConsistencyDet: A Robust Object Detector with a Denoising Paradigm of Consistency Model

    Authors: Lifan Jiang, Zhihui Wang, Changmiao Wang, Ming Li, Jiaxu Leng, Xindong Wu

    Abstract: Object detection, a quintessential task in the realm of perceptual computing, can be tackled using a generative methodology. In the present study, we introduce a novel framework designed to articulate object detection as a denoising diffusion process, which operates on the perturbed bounding boxes of annotated entities. This framework, termed ConsistencyDet, leverages an innovative denoising conce…

    Submitted 14 May, 2024; v1 submitted 11 April, 2024; originally announced April 2024.

  5. arXiv:2404.03432  [pdf, other]

    quant-ph

    Piecemeal Quantum Telescope with Superresolution

    Authors: Jian Leng, Yi-Xin Shen, Zhou-Kai Cao, Xiang-Bin Wang

    Abstract: Detecting remote objects with higher precision and resolution plays a crucial role in many scientific tasks, such as astronomical observation. Compared with classical telescopes, quantum telescopes can detect a more precise angle value for a single-star target. The precision of existing quantum telescopes is improved in the scale of square root of incident single photons. Here we propose the piecemeal…

    Submitted 4 April, 2024; originally announced April 2024.

    Comments: 6 figures

  6. arXiv:2402.17995  [pdf, ps, other]

    math.CO math.NT

    Improved Bounds for Szemerédi's Theorem

    Authors: James Leng, Ashwin Sah, Mehtaab Sawhney

    Abstract: Let $r_k(N)$ denote the size of the largest subset of $[N] = \{1,\ldots,N\}$ with no $k$-term arithmetic progression. We show that for $k\ge 5$, there exists $c_k>0$ such that \[r_k(N)\ll N\exp(-(\log\log N)^{c_k}).\] Our proof is a consequence of recent quasipolynomial bounds on the inverse theorem for the Gowers $U^k$-norm as well as the density increment strategy of Heath-Brown and Szemerédi as…

    Submitted 29 February, 2024; v1 submitted 27 February, 2024; originally announced February 2024.

    Comments: 13 pages

  7. arXiv:2402.17994  [pdf, ps, other]

    math.CO math.DS math.NT

    Quasipolynomial bounds on the inverse theorem for the Gowers $U^{s+1}[N]$-norm

    Authors: James Leng, Ashwin Sah, Mehtaab Sawhney

    Abstract: We prove quasipolynomial bounds on the inverse theorem for the Gowers $U^{s+1}[N]$-norm. The proof is modeled after work of Green, Tao, and Ziegler and uses as a crucial input recent work of the first author regarding the equidistribution of nilsequences. In a companion paper, this result will be used to improve the bounds on Szemerédi's theorem.

    Submitted 10 April, 2024; v1 submitted 27 February, 2024; originally announced February 2024.

    Comments: 100 pages

  8. arXiv:2402.10876  [pdf, other]

    cs.DC

    Accelerating Sparse DNNs Based on Tiled GEMM

    Authors: Cong Guo, Fengchen Xue, Jingwen Leng, Yuxian Qiu, Yue Guan, Weihao Cui, Quan Chen, Minyi Guo

    Abstract: Network pruning can reduce the computation cost of deep neural network (DNN) models. However, sparse models often produce randomly-distributed weights to maintain accuracy, leading to irregular computations. Consequently, unstructured sparse models cannot achieve meaningful speedup on commodity hardware built for dense matrix computations. Accelerators are usually modified or designed with structu…

    Submitted 16 February, 2024; originally announced February 2024.

    Comments: Accepted by IEEE Transactions on Computers. arXiv admin note: substantial text overlap with arXiv:2008.13006

  9. arXiv:2401.13472  [pdf, other]

    eess.IV cs.CV

    Segmenting Cardiac Muscle Z-disks with Deep Neural Networks

    Authors: Mihaela Croitor Ibrahim, Nishant Ravikumar, Alistair Curd, Joanna Leng, Oliver Umney, Michelle Peckham

    Abstract: Z-disks are complex structures that delineate repeating sarcomeres in striated muscle. They play significant roles in cardiomyocytes such as providing mechanical stability for the contracting sarcomere, cell signalling and autophagy. Changes in Z-disk architecture have been associated with impaired cardiac function. Hence, there is a strong need to create tools to segment Z-disks from microscopy i…

    Submitted 24 January, 2024; originally announced January 2024.

  10. arXiv:2401.08550  [pdf, other]

    quant-ph cs.CE math.NA

    Expanding Hardware-Efficiently Manipulable Hilbert Space via Hamiltonian Embedding

    Authors: Jiaqi Leng, Joseph Li, Yuxiang Peng, Xiaodi Wu

    Abstract: Many promising quantum applications depend on the efficient quantum simulation of an exponentially large sparse Hamiltonian, a task known as sparse Hamiltonian simulation, which is fundamentally important in quantum computation. Although several theoretically appealing quantum algorithms have been proposed for this task, they typically require a black-box query model of the sparse Hamiltonian, ren…

    Submitted 16 January, 2024; originally announced January 2024.

    Comments: 68 pages, 10 figures, an accompanying GitHub repository is at https://github.com/jiaqileng/hamiltonian-embedding

  11. arXiv:2401.08156  [pdf, other]

    cs.DC

    GMLake: Efficient and Transparent GPU Memory Defragmentation for Large-scale DNN Training with Virtual Memory Stitching

    Authors: Cong Guo, Rui Zhang, Jiale Xu, Jingwen Leng, Zihan Liu, Ziyu Huang, Minyi Guo, Hao Wu, Shouren Zhao, Junping Zhao, Ke Zhang

    Abstract: Large-scale deep neural networks (DNNs), such as large language models (LLMs), have revolutionized the artificial intelligence (AI) field and become increasingly popular. However, training or fine-tuning such models requires substantial computational power and resources, where the memory capacity of a single acceleration device like a GPU is one of the most important bottlenecks. Owing to the proh…

    Submitted 16 January, 2024; originally announced January 2024.

    Comments: Accepted by ASPLOS24

  12. arXiv:2312.10776  [pdf, ps, other]

    math.NT math.CO

    Improved bounds for five-term arithmetic progressions

    Authors: James Leng, Ashwin Sah, Mehtaab Sawhney

    Abstract: Let $r_5(N)$ be the largest cardinality of a set in $\{1,\ldots,N\}$ which does not contain $5$ elements in arithmetic progression. Then there exists a constant $c\in (0,1)$ such that \[r_5(N)\ll \frac{N}{\exp((\log\log N)^{c})}.\] Our work is a consequence of recent improved bounds on the $U^4$-inverse theorem of the first author and the fact that $3$-step nilsequences may be approximated by loca…

    Submitted 10 April, 2024; v1 submitted 17 December, 2023; originally announced December 2023.

    Comments: 35 pages, comments welcome!

  13. arXiv:2312.10772  [pdf, ps, other]

    math.NT math.CA math.CO math.DS

    Efficient Equidistribution of Nilsequences

    Authors: James Leng

    Abstract: We give improved bounds for the equidistribution of (multiparameter) nilsequences subject to any degree filtration. The bounds we obtain are single exponential in dimension, improving on double exponential bounds of Green and Tao. To obtain these bounds, we avoid "induction of dimension" which is ubiquitous throughout higher order Fourier analysis. These improved equidistribution results are a c…

    Submitted 27 February, 2024; v1 submitted 17 December, 2023; originally announced December 2023.

    Comments: 56 pages, comments welcome! v4. Reorganized content from arXiv:2306.13820 and updated citations

  14. arXiv:2312.01712  [pdf, other]

    cs.DC

    JUNO: Optimizing High-Dimensional Approximate Nearest Neighbour Search with Sparsity-Aware Algorithm and Ray-Tracing Core Mapping

    Authors: Zihan Liu, Wentao Ni, Jingwen Leng, Yu Feng, Cong Guo, Quan Chen, Chao Li, Minyi Guo, Yuhao Zhu

    Abstract: Approximate nearest neighbor (ANN) search is a widely applied technique in modern intelligent applications, such as recommendation systems and vector databases. Therefore, efficient and high-throughput execution of ANN search has become increasingly important. In this paper, we first characterize the state-of-the-art product quantization-based method of ANN search and identify a significant source…

    Submitted 4 December, 2023; originally announced December 2023.

  15. arXiv:2311.15145  [pdf, other]

    cs.CV

    Choosing Wisely and Learning Deeply: Selective Cross-Modality Distillation via CLIP for Domain Generalization

    Authors: Jixuan Leng, Yijiang Li, Haohan Wang

    Abstract: Domain Generalization (DG), a crucial research area, seeks to train models across multiple domains and test them on unseen ones. In this paper, we introduce a novel approach, namely, Selective Cross-Modality Distillation for Domain Generalization (SCMD). SCMD leverages the capabilities of large vision-language models, specifically CLIP, to train a more efficient model, ensuring it acquires robust…

    Submitted 21 April, 2024; v1 submitted 25 November, 2023; originally announced November 2023.

  16. arXiv:2311.08217  [pdf, other]

    cs.CV

    Peer is Your Pillar: A Data-unbalanced Conditional GANs for Few-shot Image Generation

    Authors: Ziqiang Li, Chaoyue Wang, Xue Rui, Chao Xue, Jiaxu Leng, Bin Li

    Abstract: Few-shot image generation aims to train generative models using a small number of training images. When there are few images available for training (e.g. 10 images), Learning From Scratch (LFS) methods often generate images that closely resemble the training data while Transfer Learning (TL) methods try to improve performance by leveraging prior knowledge from GANs pre-trained on large-scale datas…

    Submitted 14 November, 2023; originally announced November 2023.

    Comments: Under Review

  17. arXiv:2311.08134  [pdf, other]

    astro-ph.IM astro-ph.HE

    Applying hybrid clustering in pulsar candidate sifting with multi-modality for FAST survey

    Authors: Zi-Yi You, Yun-Rong Pan, Zhi Ma, Li Zhang, Shuo Xiao, Dan-Dan Zhang, Shi-Jun Dang, Ru-Shuang Zhao, Pei Wang, Ai-Jun Dong, Jia-Tao Jiang, Ji-Bing Leng, Wei-An Li, Si-Yao Li

    Abstract: Pulsar search is always the basis of pulsar navigation, gravitational wave detection and other research topics. Currently, the volume of pulsar candidates collected by the Five-hundred-meter Aperture Spherical radio Telescope (FAST) shows an explosive growth rate that has brought challenges for its pulsar candidate filtering system. Particularly, the multi-view heterogeneous data and class imbalance b…

    Submitted 14 November, 2023; originally announced November 2023.

  18. arXiv:2311.07102  [pdf, other]

    cs.CL

    Fovea Transformer: Efficient Long-Context Modeling with Structured Fine-to-Coarse Attention

    Authors: Ziwei He, Jian Yuan, Le Zhou, Jingwen Leng, Bo Jiang

    Abstract: The quadratic complexity of self-attention in Transformers has hindered the processing of long text. To alleviate this problem, previous works have proposed to sparsify the attention matrix, taking advantage of the observation that crucial information about a token can be derived from its neighbors. These methods typically combine one or another form of local attention and global attention. Such c…

    Submitted 11 January, 2024; v1 submitted 13 November, 2023; originally announced November 2023.

  19. arXiv:2311.03977  [pdf, ps, other]

    quant-ph cs.DS math.OC

    A quantum central path algorithm for linear optimization

    Authors: Brandon Augustino, Jiaqi Leng, Giacomo Nannicini, Tamás Terlaky, Xiaodi Wu

    Abstract: We propose a novel quantum algorithm for solving linear optimization problems by quantum-mechanical simulation of the central path. While interior point methods follow the central path with an iterative algorithm that works with successive linearizations of the perturbed KKT conditions, we perform a single simulation working directly with the nonlinear complementarity equations. Combining our appr…

    Submitted 7 November, 2023; originally announced November 2023.

  20. arXiv:2311.00811  [pdf, other]

    quant-ph cs.DS cs.LG math.OC

    A quantum-classical performance separation in nonconvex optimization

    Authors: Jiaqi Leng, Yufan Zheng, Xiaodi Wu

    Abstract: In this paper, we identify a family of nonconvex continuous optimization instances, each $d$-dimensional instance with $2^d$ local minima, to demonstrate a quantum-classical performance separation. Specifically, we prove that the recently proposed Quantum Hamiltonian Descent (QHD) algorithm [Leng et al., arXiv:2303.01471] is able to solve any $d$-dimensional instance from this family using…

    Submitted 1 November, 2023; originally announced November 2023.

    Comments: 32 pages, 7 figures. More details of the original Quantum Hamiltonian Descent (QHD) algorithm can be found at arXiv:2303.01471

  21. arXiv:2310.17952  [pdf, other]

    cs.CV

    Shape-centered Representation Learning for Visible-Infrared Person Re-identification

    Authors: Shuang Li, Jiaxu Leng, Ji Gan, Mengjingcheng Mo, Xinbo Gao

    Abstract: Current Visible-Infrared Person Re-Identification (VI-ReID) methods prioritize extracting distinguishing appearance features, ignoring the natural resistance of body shape against modality changes. Initially, we gauged the discriminative potential of shapes by a straightforward concatenation of shape and appearance features. However, two unresolved issues persist in the utilization of shape featur…

    Submitted 29 October, 2023; v1 submitted 27 October, 2023; originally announced October 2023.

  22. arXiv:2310.15725  [pdf, other]

    cs.CV

    Ranking-based Adaptive Query Generation for DETRs in Crowded Pedestrian Detection

    Authors: Feng Gao, Jiaxu Leng, Ji Gan, Xinbo Gao

    Abstract: DEtection TRansformer (DETR) and its variants (DETRs) have been successfully applied to crowded pedestrian detection, which achieved promising performance. However, we find that, in different degrees of crowded scenes, the number of DETRs' queries must be adjusted manually, otherwise, the performance would degrade to varying degrees. In this paper, we first analyze the two current query generation…

    Submitted 8 January, 2024; v1 submitted 24 October, 2023; originally announced October 2023.

    Comments: 10 pages, 6 figures

  23. arXiv:2308.08174  [pdf, other]

    cs.AR cs.LG

    Accelerating Generic Graph Neural Networks via Architecture, Compiler, Partition Method Co-Design

    Authors: Shuwen Lu, Zhihui Zhang, Cong Guo, Jingwen Leng, Yangjie Zhou, Minyi Guo

    Abstract: Graph neural networks (GNNs) have shown significant accuracy improvements in a variety of graph learning domains, sparking considerable research interest. To translate these accuracy improvements into practical applications, it is essential to develop high-performance and efficient hardware acceleration for GNN models. However, designing GNN accelerators faces two fundamental challenges: the high…

    Submitted 16 August, 2023; originally announced August 2023.

  24. arXiv:2307.09146  [pdf, other]

    cs.CV

    PRO-Face S: Privacy-preserving Reversible Obfuscation of Face Images via Secure Flow

    Authors: Lin Yuan, Kai Liang, Xiao Pu, Yan Zhang, Jiaxu Leng, Tao Wu, Nannan Wang, Xinbo Gao

    Abstract: This paper proposes a novel paradigm for facial privacy protection that unifies multiple characteristics including anonymity, diversity, reversibility and security within a single lightweight framework. We name it PRO-Face S, short for Privacy-preserving Reversible Obfuscation of Face images via Secure flow-based model. In the framework, an Invertible Neural Network (INN) is utilized to process th…

    Submitted 18 July, 2023; originally announced July 2023.

  25. arXiv:2306.13820  [pdf, ps, other]

    math.NT math.CA math.CO math.DS

    Efficient equidistribution of periodic nilsequences and applications

    Authors: James Leng

    Abstract: This is a companion paper to arXiv:2312.10772. We deduce an equidistribution theorem for periodic nilsequences and use this theorem to give two applications in arithmetic combinatorics. The first application is quasi-polynomial bounds for a certain complexity one polynomial progression, improving the iterated logarithm bound previously obtained. The second application is a proof of the quasi-polyno…

    Submitted 27 February, 2024; v1 submitted 23 June, 2023; originally announced June 2023.

    Comments: 50 pages, comments welcome! v5. Reorganized content from arXiv:2312.10772

  26. arXiv:2306.11043  [pdf, other]

    cs.DC cs.OS

    DFlow: Efficient Dataflow-based Invocation Workflow Execution for Function-as-a-Service

    Authors: Xiaoxiang Shi, Chao Li, Zijun Li, Zihan Liu, Dianmo Sheng, Quan Chen, Jingwen Leng, Minyi Guo

    Abstract: Serverless computing is becoming increasingly popular due to its ease of use and fine-grained billing. These features make it appealing for stateful applications or serverless workflows. However, current serverless workflow systems utilize a controlflow-based invocation pattern to invoke functions. In this execution pattern, the function invocation depends on the state of the function. A functio…

    Submitted 4 July, 2023; v1 submitted 19 June, 2023; originally announced June 2023.

    Comments: 22 pages, 13 figures

  27. arXiv:2306.10855  [pdf, other]

    quant-ph

    Quantum Advantage of Noisy Grover's Algorithm

    Authors: Jian Leng, Fan Yang, Xiang-Bin Wang

    Abstract: Quantum advantage is the core of quantum computing. Grover's search algorithm is the only quantum algorithm with proven advantage over any possible classical search algorithm. However, realizing this quantum advantage in practice is quite challenging since Grover's algorithm is very sensitive to noise. Here we present a noise-tolerant method that exponentially improves the noise threshold of Grover'…

    Submitted 19 June, 2023; originally announced June 2023.

    Comments: 6 figures

  28. arXiv:2306.08423  [pdf, other]

    cs.DC

    DistSim: A performance model of large-scale hybrid distributed DNN training

    Authors: Guandong Lu, Runzhe Chen, Yakai Wang, Yangjie Zhou, Rui Zhang, Zheng Hu, Yanming Miao, Zhifang Cai, Li Li, Jingwen Leng, Minyi Guo

    Abstract: With the ever-increasing computational demand of DNN training workloads, distributed training has been widely adopted. A combination of data, model and pipeline parallelism strategy, called hybrid parallelism distributed training, is imported to tackle the problem of deploying large-scale models. However, how to evaluate the hybrid strategy and the utilization of each device remains a challenge si…

    Submitted 14 June, 2023; originally announced June 2023.

  29. arXiv:2305.17408  [pdf, other]

    cs.DC cs.LG

    AdaptGear: Accelerating GNN Training via Adaptive Subgraph-Level Kernels on GPUs

    Authors: Yangjie Zhou, Yaoxu Song, Jingwen Leng, Zihan Liu, Weihao Cui, Zhendong Zhang, Cong Guo, Quan Chen, Li Li, Minyi Guo

    Abstract: Graph neural networks (GNNs) are powerful tools for exploring and learning from graph structures and features. As such, achieving high-performance execution for GNNs becomes crucially important. Prior works have proposed to explore the sparsity (i.e., low density) in the input graph to accelerate GNNs, which uses the full-graph-level or block-level sparsity format. We show that they fail to balanc…

    Submitted 27 May, 2023; originally announced May 2023.

  30. Fourier Transformer: Fast Long Range Modeling by Removing Sequence Redundancy with FFT Operator

    Authors: Ziwei He, Meng Yang, Minwei Feng, Jingcheng Yin, Xinbing Wang, Jingwen Leng, Zhouhan Lin

    Abstract: The transformer model is known to be computationally demanding, and prohibitively costly for long sequences, as the self-attention module uses a quadratic time and space complexity with respect to sequence length. Many researchers have focused on designing new forms of self-attention or introducing new parameters to overcome this limitation, however a large portion of them prohibits the model to i…

    Submitted 24 May, 2023; originally announced May 2023.

    Journal ref: Findings of the Association for Computational Linguistics: ACL 2023

  31. arXiv:2305.15018  [pdf, other]

    quant-ph

    Modifying $n$-qubit controlled-$ZX$ gate to be $n$-qubit Toffoli gate

    Authors: Jian Leng, Fan Yang, Xiang-Bin Wang

    Abstract: The decomposition for the controlled-$ZX$ gate in [Phys. Rev. A, 87, 062318 (2013)] has a shallow circuit depth $8n-20$ with no ancilla. Here we modify this decomposition to decompose the $n$-qubit Toffoli gate with only $2n-3$ additional single-qubit gates. The circuit depth is unchanged and no ancilla is needed. We explicitly show that the circuit after decomposition can be easily constructed in present…

    Submitted 24 May, 2023; originally announced May 2023.

    Comments: 6 pages, 9 figures

  32. arXiv:2305.12300  [pdf, other]

    quant-ph

    Improving D2p Grover's algorithm to reach performance upper bound under phase noise

    Authors: Jian Leng, Fan Yang, Xiang-Bin Wang

    Abstract: The original Grover's algorithm has a success probability to output a correct solution, while deterministic Grover's algorithms improve the success probability to 100%. However, the success probability of deterministic Grover's algorithm decreases in a noisy environment. Here we improve the deterministic two-parameter (D2p) Grover's algorithm to reach the upper bound for success probability under ph…

    Submitted 20 May, 2023; originally announced May 2023.

    Comments: 7 pages, 8 figures

  33. arXiv:2305.10801  [pdf, other]

    cs.CV

    Selecting Learnable Training Samples is All DETRs Need in Crowded Pedestrian Detection

    Authors: Feng Gao, Jiaxu Leng, Gan Ji, Xinbo Gao

    Abstract: DEtection TRansformer (DETR) and its variants (DETRs) achieved impressive performance in general object detection. However, in crowded pedestrian detection, the performance of DETRs is still unsatisfactory due to the inappropriate sample selection method which results in more false positives. To settle the issue, we propose a simple but effective sample selection method for DETRs, Sample Selection…

    Submitted 18 May, 2023; originally announced May 2023.

  34. arXiv:2304.14846  [pdf]

    cond-mat.mes-hall quant-ph

    Ultrafast and Electrically Tunable Rabi Frequency in a Germanium Hut Wire Hole Spin Qubit

    Authors: He Liu, Ke Wang, Fei Gao, Jin Leng, Yang Liu, Yu-Chen Zhou, Gang Cao, Ting Wang, Jianjun Zhang, Peihao Huang, Hai-Ou Li, Guo-Ping Guo

    Abstract: Hole spin qubits based on germanium (Ge) have strong tunable spin orbit interaction (SOI) and ultrafast qubit operation speed. Here we report that the Rabi frequency (f_Rabi) of a hole spin qubit in a Ge hut wire (HW) double quantum dot (DQD) is electrically tuned through the detuning energy and middle gate voltage (V_M). f_Rabi gradually decreases with increasing detuning energy; on the contrary,…

    Submitted 28 April, 2023; originally announced April 2023.

    Comments: 19 pages, 4 figures

    Journal ref: Nano Letters 23, 3810-3817 (2023)

  35. OliVe: Accelerating Large Language Models via Hardware-friendly Outlier-Victim Pair Quantization

    Authors: Cong Guo, Jiaming Tang, Weiming Hu, Jingwen Leng, Chen Zhang, Fan Yang, Yunxin Liu, Minyi Guo, Yuhao Zhu

    Abstract: Transformer-based large language models (LLMs) have achieved great success with the growing model size. LLMs' size grows by $240\times$ every two years, which outpaces the hardware progress and makes model inference increasingly costly. Model quantization is a promising approach to mitigate the widening gap between LLM size and hardware capacity. However, the existence of outliers, values with sig…

    Submitted 15 April, 2023; originally announced April 2023.

    Comments: ISCA 2023

  36. arXiv:2304.03352  [pdf, other]

    cs.AR

    ImaGen: A General Framework for Generating Memory- and Power-Efficient Image Processing Accelerators

    Authors: Nisarg Ujjainkar, Jingwen Leng, Yuhao Zhu

    Abstract: Image processing algorithms are prime targets for hardware acceleration as they are commonly used in resource- and power-limited applications. Today's image processing accelerator designs make rigid assumptions about the algorithm structures and/or on-chip memory resources. As a result, they either have narrow applicability or result in inefficient designs. This paper presents a compiler framewo…

    Submitted 6 April, 2023; originally announced April 2023.

  37. arXiv:2303.01471  [pdf, other]

    quant-ph cs.LG

    Quantum Hamiltonian Descent

    Authors: Jiaqi Leng, Ethan Hickman, Joseph Li, Xiaodi Wu

    Abstract: Gradient descent is a fundamental algorithm in both theory and practice for continuous optimization. Identifying its quantum counterpart would be appealing to both theoretical and practical quantum applications. A conventional approach to quantum speedups in optimization relies on the quantum acceleration of intermediate steps of classical algorithms, while keeping the overall algorithmic trajecto…

    Submitted 2 March, 2023; originally announced March 2023.

    Comments: 71 pages, 13 figures, an accompanying website is at https://jiaqileng.github.io/quantum-hamiltonian-descent/

  38. arXiv:2302.11708  [pdf, other]

    math.CA math.DS math.SP

    The fractal uncertainty principle via Dolgopyat's method in higher dimensions

    Authors: Aidan Backus, James Leng, Zhongkai Tao

    Abstract: We prove a fractal uncertainty principle with exponent $\frac{d}{2} - δ + \varepsilon$, $\varepsilon > 0$, for Ahlfors--David regular subsets of $\mathbb R^d$ with dimension $δ$ which satisfy a suitable "nonorthogonality condition". This generalizes the application of Dolgopyat's method by Dyatlov--Jin (arXiv:1702.03619) to prove the same result in the special case $d = 1$. As a corollary, we get a…

    Submitted 9 October, 2023; v1 submitted 22 February, 2023; originally announced February 2023.

    Comments: 33 pages, 5 figures, comments welcome. Contains corrections and improved graphics

    MSC Class: 28A80; 35B34; 81Q50

  39. arXiv:2212.09635  [pdf, other]

    math.NT math.CA math.CO

    Improved quadratic Gowers uniformity for the Möbius function

    Authors: James Leng

    Abstract: We demonstrate that $$\|μ\|_{U^3([N])} \ll_{A}^{\text{ineff}} \log^{-A}(N)$$ $$\|Λ - Λ_Q\|_{U^3([N])} \ll_{A}^{\text{ineff}} \log^{-A}(N)$$ for any $A > 0$ where $Λ_Q$ is an approximant to the von Mangoldt function and will be defined below, improving upon a bound of Tao-Teräväinen (2021). As a consequence, among other things, we have the following:…

    Submitted 20 March, 2023; v1 submitted 19 December, 2022; originally announced December 2022.

    Comments: 38 pages. Comments welcome! v3: Fixed sections 1.2 and 8.1

  40. arXiv:2211.10188  [pdf, other]

    cs.RO eess.SY

    Piecewise Affine Curvature model: a reduced-order model for soft robot-environment interaction beyond PCC

    Authors: Francesco Stella, Qinghua Guan, Jinsong Leng, Cosimo Della Santina, Josie Hughes

    Abstract: Soft robots are celebrated for their propensity to enable compliant and complex robot-environment interactions. Soft robotic manipulators, or slender continuum structure robots, have the potential to exploit these interactions to enable new exploration and manipulation capabilities and safe human-robot interactions. However, the interactions, or perturbations by external forces cause the soft struct…

    Submitted 18 November, 2022; originally announced November 2022.

    Comments: Submitted to IEEE RoboSoft 2023

  41. arXiv:2210.15972  [pdf, other]

    cs.CV

    Contextual Learning in Fourier Complex Field for VHR Remote Sensing Images

    Authors: Yan Zhang, Xiyuan Gao, Qingyan Duan, Jiaxu Leng, Xiao Pu, Xinbo Gao

    Abstract: Very high-resolution (VHR) remote sensing (RS) image classification is the fundamental task for RS image analysis and understanding. Recently, transformer-based models demonstrated outstanding potential for learning high-order contextual relationships from natural images with general resolution (224x224 pixels) and achieved remarkable results on general image classification tasks. However, the com…

    Submitted 28 October, 2022; originally announced October 2022.

  42. arXiv:2210.15812  [pdf, other]

    quant-ph cs.LG

    Differentiable Analog Quantum Computing for Optimization and Control

    Authors: Jiaqi Leng, Yuxiang Peng, Yi-Ling Qiao, Ming Lin, Xiaodi Wu

    Abstract: We formulate the first differentiable analog quantum computing framework with a specific parameterization design at the analog signal (pulse) level to better exploit near-term quantum devices via variational methods. We further propose a scalable approach to estimate the gradients of quantum dynamics using a forward pass with Monte Carlo sampling, which leads to a quantum stochastic gradient desce…

    Submitted 27 October, 2022; originally announced October 2022.

    Comments: Code available at https://github.com/YilingQiao/diffquantum

    Journal ref: In the Proceedings of the 36th Conference on Neural Information Processing Systems (NeurIPS 2022)

  43. arXiv:2209.10778  [pdf, other]

    cs.LG

    Nesting Forward Automatic Differentiation for Memory-Efficient Deep Neural Network Training

    Authors: Cong Guo, Yuxian Qiu, Jingwen Leng, Chen Zhang, Ying Cao, Quanlu Zhang, Yunxin Liu, Fan Yang, Minyi Guo

    Abstract: An activation function is an element-wise mathematical function and plays a crucial role in deep neural networks (DNN). Many novel and sophisticated activation functions have been proposed to improve the DNN accuracy but also consume massive memory in the training process with back-propagation. In this study, we propose the nested forward automatic differentiation (Forward-AD), specifically for th…

    Submitted 22 September, 2022; originally announced September 2022.

    Comments: 8 pages, ICCD 2022

  44. arXiv:2208.14286  [pdf, other]

    cs.LG

    ANT: Exploiting Adaptive Numerical Data Type for Low-bit Deep Neural Network Quantization

    Authors: Cong Guo, Chen Zhang, Jingwen Leng, Zihan Liu, Fan Yang, Yunxin Liu, Minyi Guo, Yuhao Zhu

    Abstract: Quantization is a technique to reduce the computation and memory cost of DNN models, which are getting increasingly large. Existing quantization solutions use fixed-point integer or floating-point types, which have limited benefits, as both require more bits to maintain the accuracy of original models. On the other hand, variable-length quantization uses low-bit quantization for normal values and…

    Submitted 30 August, 2022; originally announced August 2022.

    Comments: 20 pages, accepted by MICRO 2022

  45. arXiv:2208.11945  [pdf, other]

    cs.LG cs.CV

    Efficient Adaptive Activation Rounding for Post-Training Quantization

    Authors: Zhengyi Li, Cong Guo, Zhanda Zhu, Yangjie Zhou, Yuxian Qiu, Xiaotian Gao, Jingwen Leng, Minyi Guo

    Abstract: Post-training quantization attracts increasing attention due to its convenience in deploying quantized neural networks. Although rounding-to-nearest remains the prevailing method for DNN quantization, prior research has demonstrated its suboptimal nature when applied to weight quantization. They propose optimizing weight rounding schemes by leveraging output error rather than the traditional weigh…

    Submitted 23 August, 2023; v1 submitted 25 August, 2022; originally announced August 2022.

  46. arXiv:2206.14550  [pdf, other]

    cs.AR cs.AI cs.LG

    SALO: An Efficient Spatial Accelerator Enabling Hybrid Sparse Attention Mechanisms for Long Sequences

    Authors: Guan Shen, Jieru Zhao, Quan Chen, Jingwen Leng, Chao Li, Minyi Guo

    Abstract: The attention mechanisms of transformers effectively extract pertinent information from the input sequence. However, the quadratic complexity of self-attention w.r.t. the sequence length incurs heavy computational and memory burdens, especially for tasks with long sequences. Existing accelerators face performance degradation in these tasks. To this end, we propose SALO to enable hybrid sparse atten…

    Submitted 29 June, 2022; originally announced June 2022.

    Comments: Accepted by 59th DAC

  47. arXiv:2205.07324  [pdf, other]

    cs.CL

    Transkimmer: Transformer Learns to Layer-wise Skim

    Authors: Yue Guan, Zhengyi Li, Jingwen Leng, Zhouhan Lin, Minyi Guo

    Abstract: The Transformer architecture has become the de facto model for many machine learning tasks, from natural language processing to computer vision. As such, improving its computational efficiency becomes paramount. One of the major computational inefficiencies of Transformer-based models is that they spend an identical amount of computation throughout all layers. Prior works have proposed to augment the T…

    Submitted 15 May, 2022; originally announced May 2022.

    Comments: Published as a conference paper at ACL 2022

  48. arXiv:2205.05540  [pdf, other]

    math.NT math.CA math.CO

    A Quantitative Bound For Szemerédi's Theorem for a Complexity One Polynomial Progression over $\mathbb{Z}/N\mathbb{Z}$

    Authors: James Leng

    Abstract: Let $N$ be a large prime and $P, Q \in \mathbb{Z}[x]$ two linearly independent polynomials with $P(0) = Q(0) = 0$. We show that if a subset $A$ of $\mathbb{Z}/N\mathbb{Z}$ lacks a progression of the form $(x, x + P(y), x + Q(y), x + P(y) + Q(y))$, then $$|A| \le O\left(\frac{N}{\log_{(O(1))}(N)}\right)$$ where $\log_{C}(N)$ is an iterated logarithm of order $C$ (e.g., $\log_{2}(N) = \log\log(N)$).…

    Submitted 20 May, 2024; v1 submitted 11 May, 2022; originally announced May 2022.

    Comments: 33 pages. Journal version

  49. Quantum simulation of real-space dynamics

    Authors: Andrew M. Childs, Jiaqi Leng, Tongyang Li, Jin-Peng Liu, Chenyi Zhang

    Abstract: Quantum simulation is a prominent application of quantum computers. While there is extensive previous work on simulating finite-dimensional systems, less is known about quantum algorithms for real-space dynamics. We conduct a systematic study of such algorithms. In particular, we show that the dynamics of a $d$-dimensional Schrödinger equation with $η$ particles can be simulated with gate complexi…

    Submitted 7 November, 2022; v1 submitted 31 March, 2022; originally announced March 2022.

    Journal ref: Quantum 6, 860 (2022)

  50. arXiv:2203.14101   

    cs.LG cs.AI cs.CL

    A Roadmap for Big Model

    Authors: Sha Yuan, Hanyu Zhao, Shuai Zhao, Jiahong Leng, Yangxiao Liang, Xiaozhi Wang, Jifan Yu, Xin Lv, Zhou Shao, Jiaao He, Yankai Lin, Xu Han, Zhenghao Liu, Ning Ding, Yongming Rao, Yizhao Gao, Liang Zhang, Ming Ding, Cong Fang, Yisen Wang, Mingsheng Long, Jing Zhang, Yinpeng Dong, Tianyu Pang, Peng Cui, et al. (75 additional authors not shown)

    Abstract: With the rapid development of deep learning, training Big Models (BMs) for multiple downstream tasks becomes a popular paradigm. Researchers have achieved various outcomes in the construction of BMs and the BM application in many fields. At present, there is a lack of research work that sorts out the overall progress of BMs and guides the follow-up research. In this paper, we cover not only the BM…

    Submitted 20 April, 2022; v1 submitted 26 March, 2022; originally announced March 2022.

    Comments: This report has been withdrawn by the authors due to critical issues in Section 2.3.1 of Article 2