Hydrogen-free low-temperature silica for next generation integrated photonics

Authors: Zheru Qiu, Zihan Li, Rui Ning Wang, Xinru Ji, Marta Divall, Anat Siddharth, Tobias J. Kippenberg

Abstract: The advances in novel low-loss "on insulator" integrated photonics platforms beyond silicon, such as thin-film LiNbO3, LiTaO3, GaP and BaTiO3 have demonstrated major potential across a wide range of applications, due to their unique electro-optical or nonlinear optical properties. This has heralded novel devices, ranging from low-voltage and high-speed modulators to parametric amplifiers. For such photonic integrated circuits, a low-loss SiO2 cladding layer is a key element, serving as a passivation layer for the waveguides and enabling efficient fiber-to-chip coupling. However, numerous novel ferroelectric or III-V "on insulator" platforms have low tolerances for process temperature. This prohibits using high-temperature anneals to remove hydrogen, a common impurity that is inherent to ordinary chemical vapor deposited SiO2 and causes significant optical loss in the near infrared. Here, we satisfy the dichotomy of a low-loss wafer scale manufactured SiO2 cladding and low processing temperature. Inspired by the manufacturing of optical fibers, we introduce a hydrogen-free, low-loss SiO2 cladding that is deposited at low temperatures (300 degrees Celsius) by using SiCl4 and O2 as precursors in inductively coupled plasma-enhanced chemical vapor deposition (ICPCVD). By replacing hydrogenous silicon precursors (e.g. SiH4) with SiCl4, the deposited film is inherently free from residual hydrogen. The process temperature is compatible with the "on insulator" platforms and CMOS electronic integrated circuits.
We demonstrate a wide low-loss window that covers all telecommunication bands from 1260 nm to 1625 nm. We achieve a < 2.5 dB/m waveguide loss at 1550 nm, comparable with 1200 degree Celsius annealed films. Our SiCl4 process provides a key future cladding for all recently emerged "on-insulator" photonics platforms, that is low cost, scalable in manufacturing, and directly foundry compatible.

Submitted 26 April, 2024; v1 submitted 12 December, 2023; originally announced December 2023.

Comments: 7 pages, 4 figures

arXiv:2312.04726 [pdf, other]

MR-conditional Robotic Actuation of Concentric Tendon-Driven Cardiac Catheters

Authors: Yifan Wang, Zheng Qiu, Junichi Tokuda, Ehud J. Schmidt, Aravindan Kolandaivelu, Yue Chen

Abstract: Atrial fibrillation (AF) and ventricular tachycardia (VT) are two of the sustained arrhythmias that significantly affect the quality of life of patients. Treatment of AF and VT often requires radiofrequency ablation of heart tissues using an ablation catheter. Recent progress in ablation therapy leverages magnetic resonance imaging (MRI) for higher contrast visual feedback, and additionally utilizes a guiding sheath with an actively deflectable tip to improve the dexterity of the catheter inside the heart. This paper presents the design and validation of an MR-conditional robotic module for automated actuation of both the ablation catheter and the sheath. The robotic module features a compact design for improved accessibility inside the MR scanner bore and is driven by piezoelectric motors to ensure MR-conditionality. The combined catheter-sheath mechanism is essentially a concentric tendon-driven continuum robot and its kinematics is modeled by the constant curvature model for closed-loop position control. Path following experiments were conducted to validate the actuation module and control scheme, achieving < 2 mm average tip position error.

Submitted 7 December, 2023; originally announced December 2023.

Comments: 7 pages, 7 figures, submitted to IEEE ISMR 2024

arXiv:2312.01338 [pdf, other]

doi 10.1109/TMI.2023.3335651

Enhancing and Adapting in the Clinic: Source-free Unsupervised Domain Adaptation for Medical Image Enhancement

Authors: Heng Li, Ziqin Lin, Zhongxi Qiu, Zinan Li, Huazhu Fu, Yan Hu, Jiang Liu

Abstract: Medical imaging provides many valuable clues involving anatomical structure and pathological characteristics. However, image degradation is a common issue in clinical practice, which can adversely impact the observation and diagnosis by physicians and algorithms. Although extensive enhancement models have been developed, these models require thorough pre-training before deployment, while failing to take advantage of the potential value of inference data after deployment. In this paper, we propose an algorithm for source-free unsupervised domain adaptive medical image enhancement (SAME), which adapts and optimizes enhancement models using test data in the inference phase. A structure-preserving enhancement network is first constructed to learn a robust source model from synthesized training data. Then a teacher-student model is initialized with the source model and conducts source-free unsupervised domain adaptation (SFUDA) by knowledge distillation with the test data. Additionally, a pseudo-label picker is developed to boost the knowledge distillation of enhancement tasks. Experiments were implemented on ten datasets from three medical image modalities to validate the advantage of the proposed algorithm, and setting analysis and ablation studies were also carried out to interpret the effectiveness of SAME. The remarkable enhancement performance and benefits for downstream tasks demonstrate the potential and generalizability of SAME. The code is available at https://github.com/liamheng/Annotation-free-Medical-Image-Enhancement.

Submitted 3 December, 2023; originally announced December 2023.

Comments: 14 pages, 9 figures, in IEEE Transactions on Medical Imaging

arXiv:2311.16763 [pdf]

doi 10.1007/s11433-023-2329-x

Structural transition, electric transport, and electronic structures in the compressed trilayer nickelate La4Ni3O10

Authors: Jingyuan Li, Cui-Qun Chen, Chaoxin Huang, Yifeng Han, Mengwu Huo, Xing Huang, Peiyue Ma, Zhengyang Qiu, Junfeng Chen, Xunwu Hu, Lan Chen, Tao Xie, Bing Shen, Hualei Sun, Dao-Xin Yao, Meng Wang

Abstract: Atomic structure and electronic band structure are fundamental properties for understanding the mechanism of superconductivity. Motivated by the discovery of pressure-induced high-temperature superconductivity at 80 K in the bilayer Ruddlesden-Popper nickelate La3Ni2O7, the atomic structure and electronic band structure of the trilayer nickelate La4Ni3O10 under pressure up to 44.3 GPa are investigated. A structural transition from the monoclinic P21/a space group to the tetragonal I4/mmm around 12.6-13.4 GPa is identified, accompanied by a drop of resistance below 7 K. Density functional theory calculations suggest that the bonding state of Ni 3dz2 orbital rises and crosses the Fermi level at high pressures, which may give rise to possible superconductivity observed in resistance under pressure in La4Ni3O10. The trilayer nickelate La4Ni3O10 shows some similarities with the bilayer La3Ni2O7 and has unique properties, providing a new platform to investigate the underlying mechanism of superconductivity in nickelates.

Submitted 30 March, 2024; v1 submitted 28 November, 2023; originally announced November 2023.

Comments: 19 pages, 4 figures

Journal ref: SCIENCE CHINA Physics, Mechanics & Astronomy 67.11(2024):117403

arXiv:2311.12633 [pdf, ps, other]

Finite groups with some subgroups satisfying the partial $ Π$-property

Authors: Zhengtian Qiu, Guiyun Chen, Jianjun Liu

Abstract: Let $ H $ be a subgroup of a finite group $ G $. We say that $ H $ satisfies the partial $ Π$-property in $ G $ if there exists a chief series $ \varGamma_{G}: 1 =G_{0} < G_{1} < \cdot\cdot\cdot < G_{n}= G $ of $ G $ such that for every $ G $-chief factor $ G_{i}/G_{i-1} $ $(1\leq i\leq n) $ of $ \varGamma_{G} $, $ | G / G_{i-1} : N _{G/G_{i-1}} (HG_{i-1}/G_{i-1}\cap G_{i}/G_{i-1})| $ is a $ π(HG_{i-1}/G_{i-1}\cap G_{i}/G_{i-1}) $-number. In this paper, we investigate how some subgroups satisfying the partial $Π$-property influence the structure of finite groups.

Submitted 21 November, 2023; originally announced November 2023.

arXiv:2311.12220 [pdf]

NavMarkAR: A Landmark-based Augmented Reality (AR) Wayfinding System for Enhancing Spatial Learning of Older Adults

Authors: Zhiwen Qiu, Mojtaba Ashour, Xiaohe Zhou, Saleh Kalantari

Abstract: Wayfinding in complex indoor environments is often challenging for older adults due to declines in navigational and spatial-cognition abilities. This paper introduces NavMarkAR, an augmented reality navigation system designed for smart-glasses to provide landmark-based guidance, aiming to enhance older adults' spatial navigation skills. This work addresses a significant gap in design research, with limited prior studies evaluating cognitive impacts of AR navigation systems. An initial usability test involved 6 participants, leading to prototype refinements, followed by a comprehensive study with 32 participants in a university setting. Results indicate improved wayfinding efficiency and cognitive map accuracy when using NavMarkAR. Future research will explore long-term cognitive skill retention with such navigational aids.

Submitted 11 December, 2023; v1 submitted 20 November, 2023; originally announced November 2023.

Comments: 24 pages

arXiv:2311.11923 [pdf]

Use of Augmented Reality in Human Wayfinding: A Systematic Review

Authors: Zhiwen Qiu, Armin Mostafavi, Saleh Kalantari

Abstract: Augmented reality technology has emerged as a promising solution to assist with wayfinding difficulties, bridging the gap between obtaining navigational assistance and maintaining an awareness of one's real-world surroundings. This article presents a systematic review of research literature related to AR navigation technologies. An in-depth analysis of 65 salient studies was conducted, addressing four main research topics: 1) current state-of-the-art of AR navigational assistance technologies, 2) user experiences with these technologies, 3) the effect of AR on human wayfinding performance, and 4) impacts of AR on human navigational cognition. Notably, studies demonstrate that AR can decrease cognitive load and improve cognitive map development, in contrast to traditional guidance modalities. However, findings regarding wayfinding performance and user experience were mixed. Some studies suggest little impact of AR on improving outdoor navigational performance, and certain information modalities may be distracting and ineffective. This article discusses these nuances in detail, supporting the conclusion that AR holds great potential in enhancing wayfinding by providing enriched navigational cues, interactive experiences, and improved situational awareness.

Submitted 20 November, 2023; originally announced November 2023.

Comments: 52 pages

arXiv:2311.06243 [pdf, other]

Parameter-Efficient Orthogonal Finetuning via Butterfly Factorization

Authors: Weiyang Liu, Zeju Qiu, Yao Feng, Yuliang Xiu, Yuxuan Xue, Longhui Yu, Haiwen Feng, Zhen Liu, Juyeon Heo, Songyou Peng, Yandong Wen, Michael J. Black, Adrian Weller, Bernhard Schölkopf

Abstract: Large foundation models are becoming ubiquitous, but training them from scratch is prohibitively expensive. Thus, efficiently adapting these powerful models to downstream tasks is increasingly important. In this paper, we study a principled finetuning paradigm -- Orthogonal Finetuning (OFT) -- for downstream task adaptation. Despite demonstrating good generalizability, OFT still uses a fairly large number of trainable parameters due to the high dimensionality of orthogonal matrices. To address this, we start by examining OFT from an information transmission perspective, and then identify a few key desiderata that enable better parameter-efficiency. Inspired by how the Cooley-Tukey fast Fourier transform algorithm enables efficient information transmission, we propose an efficient orthogonal parameterization using butterfly structures. We apply this parameterization to OFT, creating a novel parameter-efficient finetuning method, called Orthogonal Butterfly (BOFT). By subsuming OFT as a special case, BOFT introduces a generalized orthogonal finetuning framework. Finally, we conduct an extensive empirical study of adapting large vision transformers, large language models, and text-to-image diffusion models to various downstream tasks in vision and language.

Submitted 28 April, 2024; v1 submitted 10 November, 2023; originally announced November 2023.

Comments: ICLR 2024 (v2: 34 pages, 19 figures)

arXiv:2311.00156 [pdf, other]

Rethinking the Cloudonomics of Efficient I/O for Data-Intensive Analytics Applications

Authors: Chunxu Tang, Yi Wang, Bin Fan, Beinan Wang, Shouwei Chen, Ziyue Qiu, Chen Liang, Jing Zhao, Yu Zhu, Mingmin Chen, Zhongting Hu

Abstract: This paper explores a prevailing trend in the industry: migrating data-intensive analytics applications from on-premises to cloud-native environments. We find that the unique cost models associated with cloud-based storage necessitate a more nuanced understanding of optimizing performance. Specifically, based on traces collected from Uber's Presto fleet in production, we argue that common I/O optimizations, such as table scan and filter, and broadcast join, may lead to unexpected costs when naively applied in the cloud. This is because traditional I/O optimizations mainly focus on improving throughput or latency in on-premises settings, without taking into account the monetary costs associated with storage API calls. In cloud environments, these costs can be significant, potentially involving billions of API calls per day just for Presto workloads at Uber scale. Presented as a case study, this paper serves as a starting point for further research to design efficient I/O strategies specifically tailored for data-intensive applications in cloud settings.

Submitted 31 October, 2023; originally announced November 2023.

Comments: 6 pages, 3 figures

arXiv:2310.20695 [pdf, other]

HAP: Structure-Aware Masked Image Modeling for Human-Centric Perception

Authors: Junkun Yuan, Xinyu Zhang, Hao Zhou, Jian Wang, Zhongwei Qiu, Zhiyin Shao, Shaofeng Zhang, Sifan Long, Kun Kuang, Kun Yao, Junyu Han, Errui Ding, Lanfen Lin, Fei Wu, Jingdong Wang

Abstract: Model pre-training is essential in human-centric perception. In this paper, we first introduce masked image modeling (MIM) as a pre-training approach for this task. Upon revisiting the MIM training strategy, we reveal that human structure priors offer significant potential. Motivated by this insight, we further incorporate an intuitive human structure prior - human parts - into pre-training. Specifically, we employ this prior to guide the mask sampling process. Image patches, corresponding to human part regions, have high priority to be masked out. This encourages the model to concentrate more on body structure information during pre-training, yielding substantial benefits across a range of human-centric perception tasks. To further capture human characteristics, we propose a structure-invariant alignment loss that enforces different masked views, guided by the human part prior, to be closely aligned for the same image. We term the entire method as HAP. HAP simply uses a plain ViT as the encoder yet establishes new state-of-the-art performance on 11 human-centric benchmarks, and on-par result on one dataset. For example, HAP achieves 78.1% mAP on MSMT17 for person re-identification, 86.54% mA on PA-100K for pedestrian attribute recognition, 78.2% AP on MS COCO for 2D pose estimation, and 56.0 PA-MPJPE on 3DPW for 3D pose and shape estimation.

Submitted 31 October, 2023; originally announced October 2023.

Comments: Accepted by NeurIPS 2023

arXiv:2310.10909 [pdf, other]

Heterogenous Memory Augmented Neural Networks

Authors: Zihan Qiu, Zhen Liu, Shuicheng Yan, Shanghang Zhang, Jie Fu

Abstract: It has been shown that semi-parametric methods, which combine standard neural networks with non-parametric components such as external memory modules and data retrieval, are particularly helpful in data scarcity and out-of-distribution (OOD) scenarios. However, existing semi-parametric methods mostly depend on independent raw data points - this strategy is difficult to scale up due to both high computational costs and the incapacity of current attention mechanisms with a large number of tokens. In this paper, we introduce a novel heterogeneous memory augmentation approach for neural networks which, by introducing learnable memory tokens with attention mechanism, can effectively boost performance without huge computational overhead. Our general-purpose method can be seamlessly combined with various backbones (MLP, CNN, GNN, and Transformer) in a plug-and-play manner. We extensively evaluate our approach on various image and graph-based tasks under both in-distribution (ID) and OOD conditions and show its competitive performance against task-specific state-of-the-art methods. Code is available at https://github.com/qiuzh20/HMA.

Submitted 16 October, 2023; originally announced October 2023.

arXiv:2310.10908 [pdf, other]

Unlocking Emergent Modularity in Large Language Models

Authors: Zihan Qiu, Zeyu Huang, Jie Fu

Abstract: Modular Neural Networks (MNNs) demonstrate various advantages over monolithic models. Existing MNNs are generally $\textit{explicit}$: their modular architectures are pre-defined, with individual modules expected to implement distinct functions. Recent works reveal that there exists $\textit{implicit}$ modularity in standard pre-trained transformers, namely $\textit{Emergent Modularity}$. They indicate that such modular structures spontaneously exhibit during the early pre-training phase. Despite the benefits of modularity, most Language Models (LMs) are still treated as monolithic models in the pre-train and fine-tune paradigm, with their emergent modularity locked and underutilized. In this work, focusing on unlocking the emergent modularity in LMs, we showcase that standard LMs could be fine-tuned as their Mixture-of-Expert (MoEs) counterparts without introducing any extra parameters. Such MoEs are derived from emergent modularity and are referred to as Emergent MoEs (EMoE). Our experiments demonstrate that fine-tuning EMoE effectively improves downstream in-domain and out-of-domain generalization compared with vanilla fine-tuning. Our analysis and ablation studies further illustrate that it is robust to various configurations and can scale up to Large Language Models (i.e., Llama2-7B and Llama-30B). Code is available at https://github.com/qiuzh20/EMoE.

Submitted 1 April, 2024; v1 submitted 16 October, 2023; originally announced October 2023.

Comments: NAACL2024 Main Conference

Journal ref: 2024 Annual Conference of the North American Chapter of the Association for Computational Linguistics

arXiv:2310.09814 [pdf, ps, other]

On $\mathscr L$-$Π$-property of subgroups of finite groups

Authors: Zhengtian Qiu, Guiyun Chen, Jianjun Liu

Abstract: Let $H$ be a subgroup of a finite group $G$. We say that $H$ satisfies the $\mathscr L $-$Π$-property in $G$ if $| G / K : N_{G / K} (HK/K)|$ is a $π(HK/K)$-number for every maximal $G$-invariant subgroup $K$ of $H^G$. In this paper, we study the influence of some $p$-subgroups of $G$ satisfying the $\mathscr L $-$Π$-property on the structure of $G$.

Submitted 1 February, 2024; v1 submitted 15 October, 2023; originally announced October 2023.

arXiv:2310.08111 [pdf, ps, other]

Homogenization of the distribution-dependent stochastic abstract fluid models

Authors: Junlong Chen, Zhaoyang Qiu, Yanbin Tang

Abstract: In this paper, we study the homogenization of the distribution-dependent stochastic abstract fluid models by combining the two-scale convergence and martingale representative approach. A general framework of the homogenization research is established for stochastic abstract fluid models, which is the type of genuine-nonlinear partial differential equations including the (distribution-dependent) stochastic Navier-Stokes equations, stochastic magneto-hydrodynamic equations, stochastic Boussinesq equations, stochastic micropolar equations, stochastic Allen-Cahn equations.

Submitted 22 October, 2023; v1 submitted 12 October, 2023; originally announced October 2023.

Comments: no comments

arXiv:2310.07961 [pdf]

doi 10.1038/s41598-023-44745-9

Three-dimensional solitons in Rydberg-Dressed cold atomic gases with spin-orbit coupling

Authors: Yuan Zhao, Heng-Jie Hu, Qian-Qian Zhou, Zhang-Cai Qiu, Li Xue, Si-Liu Xu, Qin Zhou, Boris A. Malomed

Abstract: We present numerical results for three-dimensional (3D) solitons with symmetries of the semi-vortex (SV) and mixed-mode (MM) types, which can be created in spinor Bose-Einstein condensates of Rydberg atoms under the action of the spin-orbit coupling (SOC). By means of systematic numerical computations, we demonstrate that the interplay of SOC and long-range spherically symmetric Rydberg interactions stabilizes the 3D solitons, improving their resistance to collapse. We find how the stability range depends on the strengths of the SOC and Rydberg interactions and the soft-core atomic radius.

Submitted 11 October, 2023; originally announced October 2023.

Comments: to be published in Scientific Reports

Journal ref: Scientific Reports 13(2023) 18079

arXiv:2310.05372 [pdf]

doi 10.1016/j.scitotenv.2023.166714

The role of hydrodynamics for the spatial distribution of high-temperature hydrothermal vent-endemic fauna in the deep ocean environment

Authors: Zhiguo He, Yingzhong Lou, Haoyang Zhang, Xiqiu Han, Thomas Pähtz, Pengcheng Jiao, Peng Hu, Yadong Zhou, Yejian Wang, Zhongyan Qiu

Abstract: Active hydrothermal vents provide the surrounding submarine environment with substantial amounts of matter and energy, thus serving as important habitats for diverse megabenthic communities in the deep ocean and constituting a unique, highly productive chemosynthetic ecosystem on Earth. Vent-endemic biological communities gather near the venting site and are usually not found beyond a distance of the order of 100 m from the vent. This is surprising because one would actually expect matter ejected from high-temperature vents, which generate highly turbulent buoyancy plumes, to be suspended and carried far away by the plume flows and deep-sea currents. Here, we study this problem from a fluid dynamics perspective by simulating the vent hydrodynamics using a numerical model that couples the plume flow with induced matter and energy transport. We find that both low- and high-temperature vents deposit most vent matter relatively close to the plume. In particular, the tendency of turbulent buoyancy plumes to carry matter far away is strongly counteracted by generated entrainment flows back into the plume stem. The deposition ranges of organic and inorganic hydrothermal particles obtained from the simulations for various natural high-temperature vents are consistent with the observed maximum spatial extent of biological communities, evidencing that plume hydrodynamics exercises strong control over the spatial distribution of vent-endemic fauna.
While other factors affecting the spatial distribution of vent-endemic fauna, such as geology and geochemistry, are site-specific, the main physical features of plume hydrodynamics unraveled in this study are largely site-unspecific and therefore universal across vent sites on Earth.

Submitted 8 October, 2023; originally announced October 2023.

Journal ref: Science of the Total Environment 904, 166714 (2023)

arXiv:2309.13817 [pdf, other]

MMA-Net: Multiple Morphology-Aware Network for Automated Cobb Angle Measurement

Authors: Zhengxuan Qiu, Jie Yang, Jiankun Wang

Abstract: Scoliosis diagnosis and assessment depend largely on the measurement of the Cobb angle in spine X-ray images. With the emergence of deep learning techniques that employ landmark detection, tilt prediction, and spine segmentation, automated Cobb angle measurement has become increasingly popular. However, these methods encounter difficulties such as high noise sensitivity, intricate computational procedures, and exclusive reliance on a single type of morphological information. In this paper, we introduce the Multiple Morphology-Aware Network (MMA-Net), a novel framework that improves Cobb angle measurement accuracy by integrating multiple spine morphology as attention information. In the MMA-Net, we first feed spine X-ray images into the segmentation network to produce multiple morphological information (spine region, centerline, and boundary) and then concatenate the original X-ray image with the resulting segmentation maps as input for the regression module to perform precise Cobb angle measurement. Furthermore, we devise joint loss functions for our segmentation and regression network training, respectively. We evaluate our method on the AASCE challenge dataset and achieve superior performance with the SMAPE of 7.28% and the MAE of 3.18°, indicating a strong competitiveness compared to other outstanding methods. Consequently, we can offer clinicians automated, efficient, and reliable Cobb angle measurement.

Submitted 24 September, 2023; originally announced September 2023.

arXiv:2309.13607 [pdf, other]

MM-NeRF: Multimodal-Guided 3D Multi-Style Transfer of Neural Radiance Field

Authors: Zijiang Yang, Zhongwei Qiu, Chang Xu, Dongmei Fu

Abstract: 3D style transfer aims to generate stylized views of 3D scenes with specified styles, which requires high-quality generation while keeping multi-view consistency. Existing methods still struggle with high-quality stylization with texture details and stylization with multimodal guidance. In this paper, we reveal that the common training method of stylization with NeRF, which generates stylized multi-view supervision by 2D style transfer models, causes the same object in supervision to show various states (color tone, details, etc.) in different views, leading NeRF to tend to smooth the texture details, further resulting in low-quality rendering for 3D multi-style transfer. To tackle these problems, we propose a novel Multimodal-guided 3D Multi-style transfer of NeRF, termed MM-NeRF. First, MM-NeRF projects multimodal guidance into a unified space to keep the multimodal styles consistent and extracts multimodal features to guide the 3D stylization. Second, a novel multi-head learning scheme is proposed to relieve the difficulty of learning multi-style transfer, and a multi-view style consistent loss is proposed to track the inconsistency of multi-view supervision data. Finally, a novel incremental learning mechanism is proposed to generalize MM-NeRF to any new style with small costs. Extensive experiments on several real-world datasets show that MM-NeRF achieves high-quality 3D multi-style stylization with multimodal guidance, and keeps multi-view consistency and style consistency between multimodal guidance. Codes will be released.

Submitted 28 November, 2023; v1 submitted 24 September, 2023; originally announced September 2023.

arXiv:2309.09534 [pdf, other]

Selective Volume Mixup for Video Action Recognition

Authors: Yi Tan, Zhaofan Qiu, Yanbin Hao, Ting Yao, Xiangnan He, Tao Mei

Abstract: The recent advances in Convolutional Neural Networks (CNNs) and Vision Transformers have convincingly demonstrated high learning capability for video action recognition on large datasets. Nevertheless, deep models often suffer from the overfitting effect on small-scale datasets with a limited number of training videos. A common solution is to exploit the existing image augmentation strategies for each frame individually including Mixup, Cutmix, and RandAugment, which are not particularly optimized for video data. In this paper, we propose a novel video augmentation strategy named Selective Volume Mixup (SV-Mix) to improve the generalization ability of deep models with limited training videos. SV-Mix devises a learnable selective module to choose the most informative volumes from two videos and mixes the volumes up to achieve a new training video. Technically, we propose two new modules, i.e., a spatial selective module to select the local patches for each spatial position, and a temporal selective module to mix the entire frames for each timestamp and maintain the spatial pattern. At each time, we randomly choose one of the two modules to expand the diversity of training samples. The selective modules are jointly optimized with the video action recognition framework to find the optimal augmentation strategy. We empirically demonstrate the merits of the SV-Mix augmentation on a wide range of video action recognition benchmarks and consistently boost the performance of both CNN-based and transformer-based models.

Submitted 18 September, 2023; originally announced September 2023.

arXiv:2309.02692 [pdf, other]

Hy-DeFake: Hypergraph Neural Networks for Detecting Fake News in Online Social Networks

Authors: Xing Su, Jian Yang, Jia Wu, Zitai Qiu

Abstract: Nowadays social media is the primary platform for people to obtain news and share information. Combating online fake news has become an urgent task to reduce the damage it causes to society. Existing methods typically improve their fake news detection performances by utilizing textual auxiliary information (such as relevant retweets and comments) or simple structural information (i.e., graph construction). However, these methods face two challenges. First, an increasing number of users tend to directly forward the source news without adding comments, resulting in a lack of textual auxiliary information. Second, simple graphs are unable to extract complex relations beyond pairwise association in a social context. Given that real-world social networks are intricate and involve high-order relations, we argue that exploring beyond pairwise relations between news and users is crucial for fake news detection. Therefore, we propose constructing an attributed hypergraph to represent non-textual and high-order relations for user participation in news spreading. We also introduce a hypergraph neural network-based method called Hy-DeFake to tackle the challenges. Our proposed method captures semantic information from news content, credibility information from involved users, and high-order correlations between news and users to learn distinctive embeddings for fake news detection. The superiority of Hy-DeFake is demonstrated through experiments conducted on four widely-used datasets, and it is compared against eight baselines using four evaluation metrics.

Submitted 22 December, 2023; v1 submitted 6 September, 2023; originally announced September 2023.

arXiv:2308.15172 [pdf, other]

Is visual explanation with Grad-CAM more reliable for deeper neural networks? a case study with automatic pneumothorax diagnosis

Authors: Zirui Qiu, Hassan Rivaz, Yiming Xiao

Abstract: While deep learning techniques have provided the state-of-the-art performance in various clinical tasks, explainability regarding their decision-making process can greatly enhance the credence of these methods for safer and quicker clinical adoption. With high flexibility, Gradient-weighted Class Activation Mapping (Grad-CAM) has been widely adopted to offer intuitive visual interpretation of various deep learning models' reasoning processes in computer-assisted diagnosis. However, despite the popularity of the technique, there is still a lack of systematic study on Grad-CAM's performance on different deep learning architectures. In this study, we investigate its robustness and effectiveness across different popular deep learning models, with a focus on the impact of the networks' depths and architecture types, by using a case study of automatic pneumothorax diagnosis in X-ray scans. Our results show that deeper neural networks do not necessarily contribute to a strong improvement of pneumothorax diagnosis accuracy, and the effectiveness of Grad-CAM also varies among different network architectures.

Submitted 29 August, 2023; originally announced August 2023.

arXiv:2307.12142 [pdf, other]

Free-electron interaction with nonlinear optical states in microresonators

Authors: Yujia Yang, Jan-Wilke Henke, Arslan S. Raja, F. Jasmin Kappert, Guanhao Huang, Germaine Arend, Zheru Qiu, Armin Feist, Rui Ning Wang, Aleksandr Tusnin, Alexey Tikan, Claus Ropers, Tobias J. Kippenberg

Abstract: The short de Broglie wavelength and strong interaction empower free electrons to probe scattering and excitations in materials and resolve the structure of biomolecules. Recent advances in using nanophotonic structures to mediate bilinear electron-photon interaction have brought novel optical manipulation schemes to electron beams, enabling high space-time-energy resolution electron microscopy, quantum-coherent optical modulation, attosecond metrology and pulse generation, transverse electron wavefront shaping, dielectric laser acceleration, and electron-photon pair generation. However, photonic nanostructures also exhibit nonlinearities, which have to date not been exploited for electron-photon interactions. Here, we report the interaction of electrons with spontaneously generated Kerr nonlinear optical states inside a continuous-wave driven photonic chip-based microresonator. Optical parametric processes give rise to spatiotemporal pattern formation, or dissipative structures, corresponding to coherent or incoherent optical frequency combs. By coupling such microcombs in situ to electron beams, we demonstrate that different dissipative structures induce distinct fingerprints in the electron spectra and Ramsey-type interference patterns. In particular, using spontaneously formed femtosecond temporal solitons, we achieve ultrafast temporal gating of the electron beam without the necessity of a pulsed laser source or a pulsed electron source. Our work elucidates the interaction of free electrons with a variety of nonlinear dissipative states, demonstrates the ability to access solitons inside an electron microscope, and extends the use of microcombs to unexplored territories, with ramifications in novel ultrafast electron microscopy, light-matter interactions driven by on-chip temporal solitons, and ultra-high spatiotemporal resolution sampling of nonlinear optical dynamics and devices.

Submitted 22 July, 2023; originally announced July 2023.

arXiv:2307.08209 [pdf, other]

Ada3D : Exploiting the Spatial Redundancy with Adaptive Inference for Efficient 3D Object Detection

Authors: Tianchen Zhao, Xuefei Ning, Ke Hong, Zhongyuan Qiu, Pu Lu, Yali Zhao, Linfeng Zhang, Lipu Zhou, Guohao Dai, Huazhong Yang, Yu Wang

Abstract: Voxel-based methods have achieved state-of-the-art performance for 3D object detection in autonomous driving. However, their significant computational and memory costs pose a challenge for their application to resource-constrained vehicles. One reason for this high resource consumption is the presence of a large number of redundant background points in Lidar point clouds, resulting in spatial redundancy in both 3D voxel and dense BEV map representations. To address this issue, we propose an adaptive inference framework called Ada3D, which focuses on exploiting the input-level spatial redundancy. Ada3D adaptively filters the redundant input, guided by a lightweight importance predictor and the unique properties of the Lidar point cloud. Additionally, we utilize the BEV features' intrinsic sparsity by introducing the Sparsity Preserving Batch Normalization. With Ada3D, we achieve 40% reduction for 3D voxels and decrease the density of 2D BEV feature maps from 100% to 20% without sacrificing accuracy. Ada3D reduces the model computational and memory cost by 5x, and achieves 1.52x/1.45x end-to-end GPU latency and 1.5x/4.5x GPU peak memory optimization for the 3D and 2D backbone respectively.

Submitted 8 August, 2023; v1 submitted 16 July, 2023; originally announced July 2023.

Comments: Accepted at ICCV2023

arXiv:2307.05722 [pdf, other]

Exploring Large Language Model for Graph Data Understanding in Online Job Recommendations

Authors: Likang Wu, Zhaopeng Qiu, Zhi Zheng, Hengshu Zhu, Enhong Chen

Abstract: Large Language Models (LLMs) have revolutionized natural language processing tasks, demonstrating their exceptional capabilities in various domains. However, their potential for behavior graph understanding in job recommendations remains largely unexplored. This paper focuses on unveiling the capability of large language models in understanding behavior graphs and leveraging this understanding to enhance recommendations in online recruitment, including the promotion of out-of-distribution (OOD) application. We present a novel framework that harnesses the rich contextual information and semantic representations provided by large language models to analyze behavior graphs and uncover underlying patterns and relationships. Specifically, we propose a meta-path prompt constructor that leverages LLM recommender to understand behavior graphs for the first time and design a corresponding path augmentation module to alleviate the prompt bias introduced by path-based sequence input. By leveraging this capability, our framework enables personalized and accurate job recommendations for individual users. We evaluate the effectiveness of our approach on a comprehensive dataset and demonstrate its ability to improve the relevance and quality of recommended results. This research not only sheds light on the untapped potential of large language models but also provides valuable insights for developing advanced recommendation systems in the recruitment market. The findings contribute to the growing field of natural language processing and offer practical implications for enhancing job search experiences. We release the code at https://github.com/WLiK/GLRec.

Submitted 23 December, 2023; v1 submitted 10 July, 2023; originally announced July 2023.

arXiv:2307.04388 [pdf, ps, other]

Core localized alpha-channeling via low frequency Alfven mode generation in reversed shear scenarios

Authors: Zhiyong Qiu, Shizhao Wei, Tao Wang, Liu Chen, Fulvio Zonca

Abstract: A novel channel for fuel ions heating in tokamak core plasma is proposed and analyzed using nonlinear gyrokinetic theory. The channel is achieved via spontaneous decay of reversed shear Alfvén eigenmode (RSAE) into low frequency Alfvén modes (LFAM), which then heat fuel ions via collisionless ion Landau damping. The conditions for RSAE spontaneous decay are investigated, and the saturation level and the consequent fuel ion heating rate are also derived. The channel is expected to be crucial for future reactors operating under reversed shear configurations, where fusion alpha particles are generated in the tokamak core where the magnetic shear is typically reversed, and there is a dense RSAE spectrum due to the small alpha particle characteristic dimensionless orbits.

Submitted 10 July, 2023; originally announced July 2023.

Comments: 4 page proceeding for EPS-DPP conference

arXiv:2307.04271 [pdf, ps, other]

Large deviations of invariant measure for the 3D stochastic hyperdissipative Navier-Stokes equations

Authors: Zhaoyang Qiu, Hui Liu, Chengfeng Sun

Abstract: In this paper, we consider the large deviations of invariant measure for the 3D stochastic hyperdissipative Navier-Stokes equations driven by additive noise. The unique ergodicity of invariant measure as a preliminary result is proved using a deterministic argument by the exponential moment and exponential stability estimates. Then, the uniform large deviations is established by the uniform contraction principle. Finally, using the unique ergodicity and the uniform large deviations results, we prove the large deviations of invariant measure by verifying the Freidlin-Wentzell large deviations upper and lower bounds.

Submitted 9 July, 2023; originally announced July 2023.

MSC Class: 35Q35; 76D05; 35R60; 60F10

arXiv:2307.02157 [pdf, other]

Generative Job Recommendations with Large Language Model

Authors: Zhi Zheng, Zhaopeng Qiu, Xiao Hu, Likang Wu, Hengshu Zhu, Hui Xiong

Abstract: The rapid development of online recruitment services has encouraged the utilization of recommender systems to streamline the job seeking process. Predominantly, current job recommendations deploy either collaborative filtering or person-job matching strategies. However, these models tend to operate as "black-box" systems and lack the capacity to offer explainable guidance to job seekers. Moreover, conventional matching-based recommendation methods are limited to retrieving and ranking existing jobs in the database, restricting their potential as comprehensive career AI advisors. To this end, here we present GIRL (GeneratIve job Recommendation based on Large language models), a novel approach inspired by recent advancements in the field of Large Language Models (LLMs). We initially employ a Supervised Fine-Tuning (SFT) strategy to instruct the LLM-based generator in crafting suitable Job Descriptions (JDs) based on the Curriculum Vitae (CV) of a job seeker. Moreover, we propose to train a model which can evaluate the matching degree between CVs and JDs as a reward model, and we use Proximal Policy Optimization (PPO)-based Reinforcement Learning (RL) method to further fine-tune the generator. This aligns the generator with recruiter feedback, tailoring the output to better meet employer preferences. In particular, GIRL serves as a job seeker-centric generative model, providing job suggestions without the need of a candidate set. This capability also enhances the performance of existing job recommendation models by supplementing job seeking features with generated content. With extensive experiments on a large-scale real-world dataset, we demonstrate the substantial effectiveness of our approach. We believe that GIRL introduces a paradigm-shifting approach to job recommendation systems, fostering a more personalized and comprehensive job-seeking experience.

Submitted 5 July, 2023; originally announced July 2023.

arXiv:2306.17074 [pdf, other]

Learning Structure-Guided Diffusion Model for 2D Human Pose Estimation

Authors: Zhongwei Qiu, Qiansheng Yang, Jian Wang, Xiyu Wang, Chang Xu, Dongmei Fu, Kun Yao, Junyu Han, Errui Ding, Jingdong Wang

Abstract: One of the mainstream schemes for 2D human pose estimation (HPE) is learning keypoints heatmaps by a neural network. Existing methods typically improve the quality of heatmaps by customized architectures, such as high-resolution representation and vision Transformers. In this paper, we propose DiffusionPose, a new scheme that formulates 2D HPE as a keypoints heatmaps generation problem from noised heatmaps. During training, the keypoints are diffused to random distribution by adding noises and the diffusion model learns to recover ground-truth heatmaps from noised heatmaps with respect to conditions constructed by image feature. During inference, the diffusion model generates heatmaps from initialized heatmaps in a progressive denoising way. Moreover, we further explore improving the performance of DiffusionPose with conditions from human structural information. Extensive experiments show the prowess of our DiffusionPose, with improvements of 1.6, 1.2, and 1.2 mAP on widely-used COCO, CrowdPose, and AI Challenge datasets, respectively.

Submitted 29 June, 2023; originally announced June 2023.

arXiv:2306.15579 [pdf, ps, other]

Gyrokinetic theory of toroidal Alfvén eigenmode saturation via nonlinear wave-wave coupling

Authors: Zhiyong Qiu, Liu Chen, Fulvio Zonca

Abstract: Nonlinear wave-wave coupling constitutes an important route for the turbulence spectrum evolution in both space and laboratory plasmas. For example, in a reactor relevant fusion plasma, a rich spectrum of symmetry breaking shear Alfvén wave (SAW) instabilities are expected to be excited by energetic fusion alpha particles, and self-consistently determine the anomalous alpha particle transport rate by the saturated electromagnetic perturbations. In this work, we will show that the nonlinear gyrokinetic theory is a necessary and powerful tool in qualitatively and quantitatively investigating the nonlinear wave-wave coupling processes. More specifically, one needs to employ the gyrokinetic approach in order to account for the breaking of the "pure Alfvénic state" in the short wavelength kinetic regime, due to the short wavelength structures associated with nonuniformity intrinsic to magnetically confined plasmas. Using well-known toroidal Alfvén eigenmode (TAE) as a paradigm case, three nonlinear wave-wave coupling channels expected to significantly influence the TAE nonlinear dynamics are investigated to demonstrate the strength and necessity of nonlinear gyrokinetic theory in predicting crucial processes in a future reactor burning plasma. These are: 1. the nonlinear excitation of meso-scale zonal field structures via modulational instability and TAE scattering into short-wavelength stable domain; 2. the TAE frequency cascading due to nonlinear ion induced scattering and the resulting saturated TAE spectrum; and 3. the cross-scale coupling of TAE with micro-scale ambient drift wave turbulence and its effect on TAE regulation and anomalous electron heating.

Submitted 27 June, 2023; originally announced June 2023.

Comments: submitted to Reviews of Modern Plasma Physics

arXiv:2306.15238 [pdf, other]

On Nonlinear Scattering of Drift Wave by Toroidal Alfven Eigenmode in Tokamak Plasmas

Authors: Liu Chen, Zhiyong Qiu, Fulvio Zonca

Abstract: Using electron drift wave (eDW) as a paradigm model, we have investigated analytically direct wave-wave interactions between a test DW and ambient toroidal Alfvén eigenmodes (TAE) in toroidal plasmas, and their effects on the stability of the eDW. The nonlinear effects enter via scatterings to short-wavelength electron Landau damped kinetic Alfvén waves (KAWs). Specifically, it is found that scatterings to upper-sideband KAW lead to stimulated absorption of eDW. Scatterings to the lower-sideband KAW, on the contrary, lead to its spontaneous emission. As a consequence, for typical parameters and fluctuation intensity, nonlinear scatterings by TAE have negligible net effects on the eDW stability; in contrast to the "reverse" process investigated in Ref. [Nuclear Fusion 62, 094001 (2022)], where it is shown that nonlinear scattering by ambient eDW may lead to significant damping of TAE.

Submitted 27 June, 2023; originally announced June 2023.

Comments: Submitted to Nuclear Fusion on March 19th, 2023

arXiv:2306.08642 [pdf, other]

Nonlinear equilibria and transport processes in burning plasmas

Authors: Matteo Valerio Falessi, Liu Chen, Zhiyong Qiu, Fulvio Zonca

Abstract: In this work, we put forward a general phase-space transport theory in axisymmetric tokamak plasmas based upon the concept of zonal state (ZS). Within this theoretical framework, the ZS corresponds to a renormalized plasma nonlinear equilibrium consisting of phase-space zonal structures (PSZS) and zonal electromagnetic fields (ZFs) which evolve self-consistently with symmetry breaking fluctuations and sources/collisions. More specifically, our approach involves deriving governing equations for the evolution of particle distribution functions (i.e., PSZS), which can be used to compute the corresponding macro-/meso-scale evolving magnetized plasma equilibrium adopting the Chew Goldberger Low (CGL) description, separating the spatiotemporal microscale structures. The nonlinear physics of ZFs and of geodesic acoustic modes/energetic particle driven geodesic acoustic modes is then analyzed to illustrate the implications of our theory.

Submitted 27 November, 2023; v1 submitted 14 June, 2023; originally announced June 2023.

arXiv:2306.07990 [pdf, other]

Photonic-electronic integrated circuit-based coherent LiDAR engine

Authors: Anton Lukashchuk, Halil Kerim Yildirim, Andrea Bancora, Grigory Lihachev, Yang Liu, Zheru Qiu, Xinru Ji, Andrey Voloshin, Sunil A. Bhave, Edoardo Charbon, Tobias J. Kippenberg

Abstract: Microelectronic integration is a key enabler for the ubiquitous deployment of devices in large volumes ranging from MEMS and imaging sensors to consumer electronics. Such integration has also been achieved in photonics, where compact optical transceivers for data centers employ co-integrated photonic and electronic components. Chip-scale integration is of particular interest to coherent laser ranging, i.e., frequency modulated continuous wave (FMCW) LiDAR, a perception technology that benefits from instantaneous velocity and distance detection, eye-safe operation, long-range and immunity to interference. Full wafer-scale integration of this technology has been compounded by the stringent requirements on the lasers, requiring high optical coherence, low chirp nonlinearity and requiring optical amplifiers. Here, we overcome this challenge and demonstrate a photonic-electronic integrated circuit-based coherent LiDAR engine that combines all functionalities using fully foundry-compatible wafer scale manufacturing. It is comprised of a micro-electronic based high voltage arbitrary waveform generator, a hybrid photonic circuit based tunable Vernier laser with piezoelectric actuators, and an erbium-doped waveguide optical amplifier - all realized in a wafer scale manufacturing compatible process that comprises III-V semiconductors, silicon nitride (SiN) photonic integrated circuits as well as 130nm SiGe BiCMOS technology. The source is turnkey and linearization-free, and can serve as a 'drop-in' solution in any FMCW LiDAR that can be seamlessly integrated with existing focal plane and optical phased array LiDAR approaches, constituting a missing step towards a fully chip-scale integrated LiDAR system.

Submitted 10 June, 2023; originally announced June 2023.

arXiv:2306.07280 [pdf, other]

Controlling Text-to-Image Diffusion by Orthogonal Finetuning

Authors: Zeju Qiu, Weiyang Liu, Haiwen Feng, Yuxuan Xue, Yao Feng, Zhen Liu, Dan Zhang, Adrian Weller, Bernhard Schölkopf

Abstract: Large text-to-image diffusion models have impressive capabilities in generating photorealistic images from text prompts. How to effectively guide or control these powerful models to perform different downstream tasks becomes an important open problem. To tackle this challenge, we introduce a principled finetuning method -- Orthogonal Finetuning (OFT), for adapting text-to-image diffusion models to downstream tasks. Unlike existing methods, OFT can provably preserve hyperspherical energy which characterizes the pairwise neuron relationship on the unit hypersphere. We find that this property is crucial for preserving the semantic generation ability of text-to-image diffusion models. To improve finetuning stability, we further propose Constrained Orthogonal Finetuning (COFT) which imposes an additional radius constraint to the hypersphere. Specifically, we consider two important finetuning text-to-image tasks: subject-driven generation where the goal is to generate subject-specific images given a few images of a subject and a text prompt, and controllable generation where the goal is to enable the model to take in additional control signals. We empirically show that our OFT framework outperforms existing methods in generation quality and convergence speed.

Submitted 13 March, 2024; v1 submitted 12 June, 2023; originally announced June 2023.

Comments: NeurIPS 2023 (v3: fixed formula typos in Section 3.5, 43 pages, 34 figures, project page: https://oft.wyliu.com/)

arXiv:2306.03184 [pdf, other]

Hertz-linewidth and frequency-agile photonic integrated extended-DBR lasers

Authors: Anat Siddharth, Alaina Attanasio, Grigory Lihachev, Junyin Zhang, Zheru Qiu, Scott Kenning, Rui Ning Wang, Sunil A. Bhave, Johann Riemensberger, Tobias J. Kippenberg

Abstract: Recent advances in the development of ultra-low loss silicon nitride (Si3N4)-based photonic integrated circuits have allowed integrated lasers to achieve coherence exceeding that of fiber lasers and enabled unprecedentedly fast (Megahertz bandwidth) tuning using monolithically integrated piezoelectrical actuators. While this marks the first time that fiber laser coherence is achieved using photonic integrated circuits, in conjunction with frequency agility that exceeds that of legacy bulk lasers, the approach is presently compounded by the high cost of manufacturing DFB lasers, as required for self-injection locking, as well as the precise control over the laser current and temperature needed to sustain low-noise locked operation. Reflective semiconductor optical amplifiers (RSOA) provide a cost-effective alternative solution but have not yet achieved similar performance in coherence or frequency agility, as required for frequency modulated continuous wave (FMCW) LiDAR, laser locking in frequency metrology or wavelength modulation spectroscopy for gas sensing. Here, we overcome this challenge and demonstrate an RSOA-based and frequency agile integrated laser tuned with high speed, good linearity, high optical output power, and turn-key operability while maintaining a small footprint. This is achieved using a tunable extended distributed Bragg reflector (E-DBR) in an ultra-low loss 200 nm thin Si3N4 platform with monolithically integrated piezoelectric actuators. We co-integrate the DBR with a compact ultra-low loss spiral resonator to further reduce the intrinsic optical linewidth of the laser to the Hertz level -- on par with the noise of a fiber laser -- via self-injection locking.

Submitted 10 July, 2023; v1 submitted 5 June, 2023; originally announced June 2023.

Comments: 9 pages, 4 figures

arXiv:2306.03065 [pdf, other]

doi 10.1145/3580305.3599861

LibAUC: A Deep Learning Library for X-Risk Optimization

Authors: Zhuoning Yuan, Dixian Zhu, Zi-Hao Qiu, Gang Li, Xuanhui Wang, Tianbao Yang

Abstract: This paper introduces the award-winning deep learning (DL) library called LibAUC for implementing state-of-the-art algorithms towards optimizing a family of risk functions named X-risks. X-risks refer to a family of compositional functions in which the loss function of each data point is defined in a way that contrasts the data point with a large number of others. They have broad applications in AI for solving classical and emerging problems, including but not limited to classification for imbalanced data (CID), learning to rank (LTR), and contrastive learning of representations (CLR). The motivation of developing LibAUC is to address the convergence issues of existing libraries for solving these problems. In particular, existing libraries may not converge or require very large mini-batch sizes in order to attain good performance for these problems, due to the usage of the standard mini-batch technique in the empirical risk minimization (ERM) framework. Our library is for deep X-risk optimization (DXO) that has achieved great success in solving a variety of tasks for CID, LTR and CLR. The contributions of this paper include: (1) It introduces a new mini-batch based pipeline for implementing DXO algorithms, which differs from existing DL pipelines in the design of controlled data samplers and dynamic mini-batch losses; (2) It provides extensive benchmarking experiments for ablation studies and comparison with existing libraries. The LibAUC library features scalable performance for millions of items to be contrasted, faster and better convergence than existing libraries for optimizing X-risks, seamless PyTorch deployment and versatile APIs for various loss optimization. Our library is available to the open source community at https://github.com/Optimization-AI/LibAUC, to facilitate further academic research and industrial applications.

Submitted 5 June, 2023; originally announced June 2023.

Comments: Accepted by KDD2023

arXiv:2306.02583 [pdf, other]

Stable Diffusion is Unstable

Authors: Chengbin Du, Yanxi Li, Zhongwei Qiu, Chang Xu

Abstract: Recently, text-to-image models have been thriving. Despite their powerful generative capacity, our research has uncovered a lack of robustness in this generation process. Specifically, the introduction of small perturbations to the text prompts can result in the blending of primary subjects with other categories or their complete disappearance in the generated images. In this paper, we propose Auto-attack on Text-to-image Models (ATM), a gradient-based approach, to effectively and efficiently generate such perturbations. By learning a Gumbel Softmax distribution, we can make the discrete process of word replacement or extension continuous, thus ensuring the differentiability of the perturbation generation. Once the distribution is learned, ATM can sample multiple attack samples simultaneously. These attack samples can prevent the generative model from generating the desired subjects without compromising image quality. ATM has achieved a 91.1% success rate in short-text attacks and an 81.2% success rate in long-text attacks. Further empirical analysis revealed four attack patterns based on: 1) the variability in generation speed, 2) the similarity of coarse-grained characteristics, 3) the polysemy of words, and 4) the positioning of words.

Submitted 6 June, 2023; v1 submitted 5 June, 2023; originally announced June 2023.

Comments: 22 pages, 20 figures

arXiv:2306.02185 [pdf, other]

Manipulating chiral-spin transport with ferroelectric polarization

Authors: Xiaoxi Huang, Xianzhe Chen, Yuhang Li, John Mangeri, Hongrui Zhang, Maya Ramesh, Hossein Taghinejad, Peter Meisenheimer, Lucas Caretta, Sandhya Susarla, Rakshit Jain, Christoph Klewe, Tianye Wang, Rui Chen, Cheng-Hsiang Hsu, Hao Pan, Jia Yin, Padraic Shafer, Ziqiang Qiu, Davi R. Rodrigues, Olle Heinonen, Dilip Vasudevan, Jorge Iniguez, Darrell G. Schlom, Sayeef Salahuddin , et al. (6 additional authors not shown)

Abstract: A collective excitation of the spin structure in a magnetic insulator can transmit spin-angular momentum with negligible dissipation. This quantum of a spin wave, introduced more than nine decades ago, has always been manipulated through magnetic dipoles (i.e., time-reversal symmetry). Here, we report the experimental observation of chiral-spin transport in multiferroic BiFeO3, where the spin transport is controlled by reversing the ferroelectric polarization (i.e., spatial inversion symmetry). The ferroelectrically controlled magnons produce an unprecedented ratio of up to 18% rectification at room temperature. The spin torque that the magnons in BiFeO3 carry can be used to efficiently switch the magnetization of adjacent magnets, with a spin-torque efficiency comparable to the spin Hall effect in heavy metals. Utilizing such controllable magnon generation and transmission in BiFeO3, an all-oxide, energy-scalable logic is demonstrated composed of spin-orbit injection, detection, and magnetoelectric control. This observation opens a new chapter of multiferroic magnons and paves an alternative pathway towards low-dissipation nanoelectronics.

Submitted 3 June, 2023; originally announced June 2023.

arXiv:2305.19860 [pdf, other]

A Survey on Large Language Models for Recommendation

Authors: Likang Wu, Zhi Zheng, Zhaopeng Qiu, Hao Wang, Hongchao Gu, Tingjia Shen, Chuan Qin, Chen Zhu, Hengshu Zhu, Qi Liu, Hui Xiong, Enhong Chen

Abstract: Large Language Models (LLMs) have emerged as powerful tools in the field of Natural Language Processing (NLP) and have recently gained significant attention in the domain of Recommendation Systems (RS). These models, trained on massive amounts of data using self-supervised learning, have demonstrated remarkable success in learning universal representations and have the potential to enhance various aspects of recommendation systems through effective transfer techniques such as fine-tuning and prompt tuning. The crucial aspect of harnessing the power of language models in enhancing recommendation quality is the utilization of their high-quality representations of textual features and their extensive coverage of external knowledge to establish correlations between items and users. To provide a comprehensive understanding of the existing LLM-based recommendation systems, this survey presents a taxonomy that categorizes these models into two major paradigms, respectively Discriminative LLM for Recommendation (DLLM4Rec) and Generative LLM for Recommendation (GLLM4Rec), with the latter being systematically sorted out for the first time. Furthermore, we systematically review and analyze existing LLM-based recommendation systems within each paradigm, providing insights into their methodologies, techniques, and performance. Additionally, we identify key challenges and several valuable findings to provide researchers and practitioners with inspiration. We have also created a GitHub repository to index relevant papers on LLMs for recommendation, https://github.com/WLiK/LLM4Rec.

Submitted 18 June, 2024; v1 submitted 31 May, 2023; originally announced May 2023.

Comments: 34 pages, 7 figures, 2 tables

arXiv:2305.18730 [pdf, other]

Blockwise Stochastic Variance-Reduced Methods with Parallel Speedup for Multi-Block Bilevel Optimization

Authors: Quanqi Hu, Zi-Hao Qiu, Zhishuai Guo, Lijun Zhang, Tianbao Yang

Abstract: In this paper, we consider non-convex multi-block bilevel optimization (MBBO) problems, which involve $m\gg 1$ lower level problems and have important applications in machine learning. Designing a stochastic gradient and controlling its variance is more intricate due to the hierarchical sampling of blocks and data and the unique challenge of estimating hyper-gradient. We aim to achieve three nice properties for our algorithm: (a) matching the state-of-the-art complexity of standard BO problems with a single block; (b) achieving parallel speedup by sampling $I$ blocks and sampling $B$ samples for each sampled block per-iteration; (c) avoiding the computation of the inverse of a high-dimensional Hessian matrix estimator. However, achieving all three is non-trivial, as existing works achieve only one or two of these properties. To address the involved challenges for achieving (a, b, c), we propose two stochastic algorithms by using advanced blockwise variance-reduction techniques for tracking the Hessian matrices (for low-dimensional problems) or the Hessian-vector products (for high-dimensional problems), and prove an iteration complexity of $O(\frac{mε^{-3}\mathbb{I}(I<m)}{I\sqrt{I}} + \frac{mε^{-3}}{I\sqrt{B}})$ for finding an $ε$-stationary point under appropriate conditions. We also conduct experiments to verify the effectiveness of the proposed algorithms compared with existing MBBO algorithms.

Submitted 2 June, 2023; v1 submitted 30 May, 2023; originally announced May 2023.

arXiv:2305.13796 [pdf, other]

SE-Bridge: Speech Enhancement with Consistent Brownian Bridge

Authors: Zhibin Qiu, Mengfan Fu, Fuchun Sun, Gulila Altenbek, Hao Huang

Abstract: We propose SE-Bridge, a novel method for speech enhancement (SE). Following the recent application of diffusion models to speech enhancement, we can achieve speech enhancement by solving a stochastic differential equation (SDE). Each SDE corresponds to a probabilistic flow ordinary differential equation (PF-ODE), and the trajectory of the PF-ODE solution consists of the speech states at different moments. Our approach is based on consistency models, which ensure that any speech states on the same PF-ODE trajectory correspond to the same initial state. By integrating the Brownian Bridge process, the model is able to generate high-intelligibility speech samples without adversarial training. This is the first attempt that applies consistency models to the SE task, achieving state-of-the-art results in several metrics while saving 15 x the time required for sampling compared to the diffusion-based baseline. Our experiments on multiple datasets demonstrate the effectiveness of SE-Bridge in SE. Furthermore, we show through extensive experiments on downstream tasks, including Automatic Speech Recognition (ASR) and Speaker Verification (SV), that SE-Bridge can effectively support multiple downstream tasks.

Submitted 23 May, 2023; originally announced May 2023.

arXiv:2305.13634 [pdf, other]

SMAP: A Novel Heterogeneous Information Framework for Scenario-based Optimal Model Assignment

Authors: Zekun Qiu, Zhipu Xie, Zehua Ji, Yuhao Mao, Ke Cheng

Abstract: The increasing maturity of big data applications has led to a proliferation of models targeting the same objectives within the same scenarios and datasets. However, selecting the most suitable model that considers the model's features while taking specific requirements and constraints into account still poses a significant challenge. Existing methods have focused on worker-task assignments based on crowdsourcing but neglect the scenario-dataset-model assignment problem. To address this challenge, a new problem named the Scenario-based Optimal Model Assignment (SOMA) problem is introduced and a novel framework entitled Scenario and Model Associative percepts (SMAP) is developed. SMAP is a heterogeneous information framework that can integrate various types of information to intelligently select a suitable dataset and allocate the optimal model for a specific scenario. To comprehensively evaluate models, a new score function that utilizes multi-head attention mechanisms is proposed. Moreover, a novel memory mechanism named the mnemonic center is developed to store the matched heterogeneous information and prevent duplicate matching. Six popular traffic scenarios are selected as study cases and extensive experiments are conducted on a dataset to verify the effectiveness and efficiency of SMAP and the score function.

Submitted 22 May, 2023; originally announced May 2023.

arXiv:2305.12649 [pdf, other]

Imbalance-Agnostic Source-Free Domain Adaptation via Avatar Prototype Alignment

Authors: Hongbin Lin, Mingkui Tan, Yifan Zhang, Zhen Qiu, Shuaicheng Niu, Dong Liu, Qing Du, Yanxia Liu

Abstract: Source-free Unsupervised Domain Adaptation (SF-UDA) aims to adapt a well-trained source model to an unlabeled target domain without access to the source data. One key challenge is the lack of source data during domain adaptation. To handle this, we propose to mine the hidden knowledge of the source model and exploit it to generate source avatar prototypes. To this end, we propose a Contrastive Prototype Generation and Adaptation (CPGA) method. CPGA consists of two stages: Prototype generation and Prototype adaptation. Extensive experiments on three UDA benchmark datasets demonstrate the superiority of CPGA. However, existing SF-UDA studies implicitly assume balanced class distributions for both the source and target domains, which hinders their real applications. To address this issue, we study a more practical SF-UDA task, termed imbalance-agnostic SF-UDA, where the class distributions of both the unseen source domain and unlabeled target domain are unknown and could be arbitrarily skewed. This task is much more challenging than vanilla SF-UDA due to the co-occurrence of covariate shifts and unidentified class distribution shifts between the source and target domains. To address this task, we extend CPGA and propose a new Target-aware Contrastive Prototype Generation and Adaptation (T-CPGA) method. Specifically, for better prototype adaptation in the imbalance-agnostic scenario, T-CPGA applies a new pseudo label generation strategy to identify unknown target class distribution and generate accurate pseudo labels, by utilizing the collective intelligence of the source model and an additional contrastive language-image pre-trained model. Meanwhile, we further devise a target label-distribution-aware classifier to adapt the model to the unknown target class distribution. We empirically show that T-CPGA significantly outperforms CPGA and other SF-UDA methods in imbalance-agnostic SF-UDA.

Submitted 21 May, 2023; originally announced May 2023.

Comments: arXiv admin note: text overlap with arXiv:2106.15326

arXiv:2305.11965 [pdf, other]

Not All Semantics are Created Equal: Contrastive Self-supervised Learning with Automatic Temperature Individualization

Authors: Zi-Hao Qiu, Quanqi Hu, Zhuoning Yuan, Denny Zhou, Lijun Zhang, Tianbao Yang

Abstract: In this paper, we aim to optimize a contrastive loss with individualized temperatures in a principled and systematic manner for self-supervised learning. The common practice of using a global temperature parameter $τ$ ignores the fact that "not all semantics are created equal", meaning that different anchor data may have different numbers of samples with similar semantics, especially when data exhibits long-tails. First, we propose a new robust contrastive loss inspired by distributionally robust optimization (DRO), providing us an intuition about the effect of $τ$ and a mechanism for automatic temperature individualization. Then, we propose an efficient stochastic algorithm for optimizing the robust contrastive loss with a provable convergence guarantee without using large mini-batch sizes. Theoretical and experimental results show that our algorithm automatically learns a suitable $τ$ for each sample. Specifically, samples with frequent semantics use large temperatures to keep local semantic structures, while samples with rare semantics use small temperatures to induce more separable features. Our method not only outperforms prior strong baselines (e.g., SimCLR, CLIP) on unimodal and bimodal datasets with larger improvements on imbalanced data but also is less sensitive to hyper-parameters. To our best knowledge, this is the first methodical approach to optimizing a contrastive loss with individualized temperatures.

Submitted 19 May, 2023; originally announced May 2023.

Comments: 33 pages, 11 figures, accepted by ICML2023

arXiv:2305.10163 [pdf, other]

Large Language Models Leverage External Knowledge to Extend Clinical Insight Beyond Language Boundaries

Authors: Jiageng Wu, Xian Wu, Zhaopeng Qiu, Minghui Li, Yingying Zhang, Yefeng Zheng, Changzheng Yuan, Jie Yang

Abstract: $\textbf{Objectives}$: Large Language Models (LLMs) such as ChatGPT and Med-PaLM have excelled in various medical question-answering tasks. However, these English-centric models encounter challenges in non-English clinical settings, primarily due to limited clinical knowledge in respective languages, a consequence of imbalanced training corpora. We systematically evaluate LLMs in the Chinese medical context and develop a novel in-context learning framework to enhance their performance. $\textbf{Materials and Methods}$: The latest China National Medical Licensing Examination (CNMLE-2022) served as the benchmark. We collected 53 medical books and 381,149 medical questions to construct the medical knowledge base and question bank. The proposed Knowledge and Few-shot Enhancement In-context Learning (KFE) framework leverages the in-context learning ability of LLMs to integrate diverse external clinical knowledge sources. We evaluated KFE with ChatGPT (GPT-3.5), GPT-4, Baichuan2 (BC2)-7B, and BC2-13B in CNMLE-2022 and investigated the effectiveness of different pathways for incorporating LLMs with medical knowledge from 7 perspectives. $\textbf{Results}$: Directly applying ChatGPT failed to qualify for the CNMLE-2022, with a score of 51. Combined with KFE, LLMs of varying sizes yielded consistent and significant improvements. ChatGPT's performance surged to 70.04 and GPT-4 achieved the highest score of 82.59. This surpasses the qualification threshold (60) and exceeds the average human score of 68.70. It also enabled a smaller BC2-13B to pass the examination, showcasing the great potential in low-resource settings. $\textbf{Conclusion}$: By synergizing medical knowledge through in-context learning, LLMs can extend clinical insight beyond language barriers, significantly reducing language-related disparities of LLM applications and ensuring global benefit in healthcare.

Submitted 29 January, 2024; v1 submitted 17 May, 2023; originally announced May 2023.

arXiv:2305.08240 [pdf, other]

Dielectric environment sensitivity of carbon centres in hexagonal boron nitride

Authors: Danis I. Badrtdinov, Carlos Rodriguez-Fernandez, Magdalena Grzeszczyk, Zhizhan Qiu, Kristina Vaklinova, Pengru Huang, Alexander Hampel, Kenji Watanabe, Takashi Taniguchi, Lu Jiong, Marek Potemski, Cyrus E. Dreyer, Maciej Koperski, Malte Rösner

Abstract: A key advantage of utilizing van der Waals materials as defect-hosting platforms for quantum applications is the controllable proximity of the defect to the surface or the substrate for improved light extraction, enhanced coupling with photonic elements, or more sensitive metrology. However, this aspect results in a significant challenge for defect identification and characterization, as the defect's optoelectronic properties depend on the specifics of the atomic environment. Here we explore the mechanisms by which the environment can influence the properties of carbon impurity centres in hexagonal boron nitride (hBN). We compare the optical and electronic properties of such defects between bulk-like and few-layer films, showing alteration of the zero-phonon line energies, modifications to their phonon sidebands, and enhancements of their inhomogeneous broadenings. To disentangle the various mechanisms responsible for these changes, including the atomic structure, electronic wavefunctions, and dielectric screening environment of the defect centre, we combine ab-initio calculations based on density-functional theory with a quantum embedding approach. By studying a variety of carbon-based defects embedded in monolayer and bulk hBN, we demonstrate that the dominant effect of the change in the environment is the screening of the density-density Coulomb interactions within and between the defect orbitals. Our comparative analysis of the experimental and theoretical findings paves the way for improved identification of defects in low-dimensional materials and the development of atomic scale sensors of dielectric environments.

Submitted 14 May, 2023; originally announced May 2023.

Comments: 38 pages, 7 figures

arXiv:2305.05323 [pdf, other]

Global simulations of kinetic-magnetohydrodynamic processes with energetic electrons in tokamak plasmas

Authors: Jian Bao, Wenlu Zhang, Ding Li, Zhihong Lin, Zhiyong Qiu, Wei Chen, Xiang Zhu, Junyi Cheng, Chao Dong, **tao Cao

Abstract: The energetic electrons (EEs) generated through auxiliary heating have been found to destabilize various Alfven eigenmodes (AEs) in recent experiments, which in turn lead to the EE transport and degrade the plasma energy confinement. In this work, we propose a global fluid-kinetic hybrid model for studying corresponding kinetic-magnetohydrodynamic (MHD) processes by coupling the drift-kinetic EEs to the Landau-fluid model of bulk plasmas in a non-perturbative manner. The numerical capability of Landau-fluid bulk plasmas is obtained based on a well-benchmarked eigenvalue code MAS [Multiscale Analysis of plasma Stabilities, J. Bao et al. Nucl. Fusion accepted 2023], and the EE responses to the electromagnetic fluctuations are analytically derived, which not only contribute to the MHD interchange drive and parallel current but also lead to the new kinetic particle compression with the precessional drift resonance in the leading order. The hybrid model is cast into a nonlinear eigenvalue matrix equation and solved iteratively using Newton's method. By calibrating the EE precession frequency against the particle equation of motion in general geometry and applying a more realistic trapped particle distribution in the poloidal plane, MAS simulations of EE-driven beta-induced Alfven eigenmodes (e-BAE) show excellent agreement with gyrokinetic particle-in-cell simulations, and the non-perturbative effects of EEs on e-BAE mode structure, growth rate and damping rate are demonstrated. With these efforts, the upgraded MAS greatly improves the computation efficiency for plasma problems related to deeply-trapped EEs, and is superior to initial-value simulations, which are restricted by the stringent electron Courant condition, for the practical application of fast linear analysis.

Submitted 9 May, 2023; originally announced May 2023.

Comments: 24 pages, 11 figures

arXiv:2305.04824 [pdf, other]

Learning Summary-Worthy Visual Representation for Abstractive Summarization in Video

Authors: Zenan Xu, Xiaojun Meng, Yasheng Wang, Qinliang Su, Zexuan Qiu, Xin Jiang, Qun Liu

Abstract: Multimodal abstractive summarization for videos (MAS) requires generating a concise textual summary to describe the highlights of a video according to multimodal resources, in our case, the video content and its transcript. Inspired by the success of the large-scale generative pre-trained language model (GPLM) in generating high-quality textual content (e.g., summary), recent MAS methods have proposed to adapt the GPLM to this task by equipping it with the visual information, which is often obtained through a general-purpose visual feature extractor. However, the generally extracted visual features may overlook some summary-worthy visual information, which impedes model performance. In this work, we propose a novel approach to learning the summary-worthy visual representation that facilitates abstractive summarization. Our method exploits the summary-worthy information from both the cross-modal transcript data and the knowledge distilled from the pseudo summary. Extensive experiments on three public multimodal datasets show that our method outperforms all competing baselines. Furthermore, with the advantages of summary-worthy visual information, our model can have a significant improvement on small datasets or even datasets with limited training data.

Submitted 8 May, 2023; originally announced May 2023.

Comments: Accepted by IJCAI-2023

arXiv:2305.04719

Learning to Generate Poetic Chinese Landscape Painting with Calligraphy

Authors: Shaozu Yuan, Aijun Dai, Zhiling Yan, Ruixue Liu, Meng Chen, Baoyang Chen, Zhijie Qiu, Xiaodong He

Abstract: In this paper, we present a novel system (denoted as Polaca) to generate poetic Chinese landscape painting with calligraphy. Unlike previous single image-to-image painting generation, Polaca takes the classic poetry as input and outputs the artistic landscape painting image with the corresponding calligraphy. It is equipped with three different modules to complete the whole piece of landscape painting artwork: the first one is a text-to-image module to generate landscape painting image, the second one is an image-to-image module to generate stylistic calligraphy image, and the third one is an image fusion module to fuse the two images into a whole piece of aesthetic artwork.

Submitted 8 May, 2023; originally announced May 2023.

Comments: Accepted by IJCAI 2022

arXiv:2305.04088

Atomically-precise Vacancy-assembled Quantum Antidots

Authors: Hanyan Fang, Harshitra Mahalingam, Xinzhe Li, Xu Han, Zhizhan Qiu, Yixuan Han, Keian Noori, Dikshant Dulal, Hongfei Chen, Pin Lyu, Tianhao Yang, Jing Li, Chenliang Su, Wei Chen, Yongqing Cai, Antonio Castro H. Neto, Kostya S. Novoselov, Aleksandr Rodin, Jiong Lu

Abstract: Patterning antidots ("voids") into well-defined antidot lattices creates an intriguing class of artificial structures for the periodic modulation of 2D electron systems, leading to anomalous transport properties and exotic quantum phenomena as well as enabling the precise bandgap engineering of 2D materials to address technological bottleneck issues. However, realizing such atomic-scale quantum antidots (QADs) is infeasible by current nanolithographic techniques. Here, we report an atomically-precise bottom-up fabrication of a series of atomic-scale QADs with elegantly engineered quantum states through a controllable assembly of a chalcogenide single vacancy (SV) in 2D PtTe2, a type-II Dirac semimetal. Te SVs as atomic-scale "antidots" undergo thermal migration and assembly into highly-ordered SV lattices spaced by a single Te atom, reaching the ultimate downscaling limit of antidot lattices. Increasing the number of SVs in QADs strengthens the cumulative repulsive potential and consequently enhances collective interference of multiple-pocket scattered quasiparticles inside QADs, creating multi-level quantum hole states with tunable gap from telecom to far-infrared regime. Moreover, precisely engineered quantum hole states of QADs are symmetry-protected and thus survive upon atom-by-atom oxygen substitutional doping. Therefore, SV-assembled QADs exhibit unprecedented robustness and property tunability, which not only holds the key to their future applications but also embodies a wide variety of material technologies.

Submitted 6 May, 2023; originally announced May 2023.

arXiv:2305.03652

A fully hybrid integrated Erbium-based laser

Authors: Yang Liu, Zheru Qiu, Xinru Ji, Andrea Bancora, Grigory Lihachev, Johann Riemensberger, Rui Ning Wang, Andrey Voloshin, Tobias J. Kippenberg

Abstract: Erbium-doped fiber lasers exhibit high coherence and low noise as required for applications in fiber optic sensing, gyroscopes, LiDAR, and optical frequency metrology. Endowing Erbium-based gain in photonic integrated circuits can provide a basis for miniaturizing low-noise fiber lasers to chip-scale form factor, and enable large-volume applications. Yet, while major progress has been made in the last decade on integrated lasers based on silicon photonics with III-V gain media, the integration of Erbium lasers on chip has been compounded by large laser linewidth. Recent advances in photonic integrated circuit-based high-power Erbium-doped amplifiers make a new class of rare-earth-ion-based lasers possible. Here, we demonstrate a fully integrated chip-scale Erbium laser that achieves high power, narrow linewidth, frequency agility, and the integration of a III-V pump laser. The laser circuit is based on an Erbium-implanted ultralow-loss silicon nitride Si$_3$N$_4$ photonic integrated circuit. This device achieves single-mode lasing with a free-running intrinsic linewidth of 50 Hz, a relative intensity noise of $<$-150 dBc/Hz at $>$10 MHz offset, and output power up to 17 mW, approaching the performance of fiber lasers and state-of-the-art semiconductor extended cavity lasers. An intra-cavity microring-based Vernier filter enables wavelength tunability of $>$ 40 nm within the C- and L-bands while attaining side mode suppression ratio (SMSR) of $>$ 70 dB, surpassing legacy fiber lasers in tuning and SMSR performance. This new class of low-noise, tuneable Erbium waveguide laser could find applications in LiDAR, microwave photonics, optical frequency synthesis, and free-space communications. Our approach is extendable to other wavelengths, and more broadly, constitutes a novel way to photonic integrated circuit-based rare-earth-ion-doped lasers.

Submitted 5 May, 2023; originally announced May 2023.

Showing 51–100 of 372 results for author: Qiu, Z