Search | arXiv e-print repository

OSP2B: One-Stage Point-to-Box Network for 3D Siamese Tracking

Authors: Jiahao Nie, Zhiwei He, Yuxiang Yang, Zhengyi Bao, Mingyu Gao, Jing Zhang

Abstract: The two-stage point-to-box network plays a critical role in the recently popular 3D Siamese tracking paradigm: it first generates proposals and then predicts corresponding proposal-wise scores. However, such a network suffers from tedious hyper-parameter tuning and task misalignment, limiting the tracking performance. To address these concerns, we propose a simple yet effective one-stage point-to-box network for point cloud-based 3D single object tracking. It synchronizes 3D proposal generation and center-ness score prediction by a parallel predictor without tedious hyper-parameters. To guide a task-aligned score ranking of proposals, a center-aware focal loss is proposed to supervise the training of the center-ness branch, which enhances the network's discriminative ability to distinguish proposals of different quality. Besides, we design a binary target classifier to identify target-relevant points. By integrating the derived classification scores with the center-ness scores, the resulting network can effectively suppress interference proposals and further mitigate task misalignment. Finally, we present a novel one-stage Siamese tracker OSP2B equipped with the designed network. Extensive experiments on challenging benchmarks including KITTI and Waymo SOT Dataset show that our OSP2B achieves leading performance with a considerable real-time speed. Code will be available at https://github.com/haooozi/OSP2B.

Submitted 8 May, 2023; v1 submitted 23 April, 2023; originally announced April 2023.

Comments: Accepted to IJCAI'23. Code will be available at https://github.com/haooozi/OSP2B

arXiv:2304.07741 [pdf, other]

Canvas: End-to-End Kernel Architecture Search in Neural Networks

Authors: Chenggang Zhao, Genghan Zhang, Mingyu Gao

Abstract: The demands for higher performance and accuracy in neural networks (NNs) never end. Existing tensor compilation and Neural Architecture Search (NAS) techniques orthogonally optimize the two goals but actually share many similarities in their concrete strategies. We exploit such opportunities by combining the two into one and make a case for Kernel Architecture Search (KAS). KAS reviews NAS from a system perspective and zooms into a more fine-grained level to generate neural kernels with both high performance and good accuracy. To demonstrate the potential of KAS, we build an end-to-end framework, Canvas, to find high-quality kernels as convolution replacements. Canvas samples from a rich set of fine-grained primitives to stochastically and iteratively construct new kernels and evaluate them according to user-specified constraints. Canvas supports freely adjustable tensor dimension sizes inside the kernel and uses two levels of solvers to satisfy structural legality and fully utilize model budgets. The evaluation shows that by replacing standard convolutions with generated new kernels in common NNs, Canvas achieves an average 1.5x speedup compared to the previous state-of-the-art with acceptable accuracy loss and search efficiency. Canvas verifies the practicability of KAS by rediscovering many manually designed kernels in the past and producing new structures that may inspire future machine learning innovations. For source code and implementation, we open-sourced Canvas at https://github.com/tsinghua-ideal/Canvas.

Submitted 18 April, 2023; v1 submitted 16 April, 2023; originally announced April 2023.

arXiv:2304.04959 [pdf, other]

AdaTT: Adaptive Task-to-Task Fusion Network for Multitask Learning in Recommendations

Authors: Danwei Li, Zhengyu Zhang, Siyang Yuan, Mingze Gao, Weilin Zhang, Chaofei Yang, Xi Liu, Jiyan Yang

Abstract: Multi-task learning (MTL) aims to enhance the performance and efficiency of machine learning models by simultaneously training them on multiple tasks. However, MTL research faces two challenges: 1) effectively modeling the relationships between tasks to enable knowledge sharing, and 2) jointly learning task-specific and shared knowledge. In this paper, we present a novel model called Adaptive Task-to-Task Fusion Network (AdaTT) to address both challenges. AdaTT is a deep fusion network built with task-specific and optional shared fusion units at multiple levels. By leveraging a residual mechanism and a gating mechanism for task-to-task fusion, these units adaptively learn both shared knowledge and task-specific knowledge. To evaluate AdaTT's performance, we conduct experiments on a public benchmark and an industrial recommendation dataset using various task groups. Results demonstrate AdaTT significantly outperforms existing state-of-the-art baselines. Furthermore, our end-to-end experiments reveal that the model exhibits better performance compared to alternatives.

Submitted 4 June, 2023; v1 submitted 11 April, 2023; originally announced April 2023.

arXiv:2304.04615 [pdf, ps, other]

A Survey on Recent Teacher-student Learning Studies

Authors: Minghong Gao

Abstract: Knowledge distillation is a method of transferring the knowledge from a complex deep neural network (DNN) to a smaller and faster DNN, while preserving its accuracy. Recent variants of knowledge distillation include teaching assistant distillation, curriculum distillation, mask distillation, and decoupling distillation, which aim to improve the performance of knowledge distillation by introducing additional components or by changing the learning process. Teaching assistant distillation involves an intermediate model called the teaching assistant, while curriculum distillation follows a curriculum similar to human education. Mask distillation focuses on transferring the attention mechanism learned by the teacher, and decoupling distillation decouples the distillation loss from the task loss. Overall, these variants of knowledge distillation have shown promising results in improving the performance of knowledge distillation.

Submitted 10 April, 2023; originally announced April 2023.
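
For orientation, the variants named in the abstract above all modify a baseline objective that is commonly written as a weighted sum of a task loss on the hard label and a distillation loss against the teacher's softened outputs. The sketch below shows that classic formulation in plain Python. It is not taken from the survey: the function names, the temperature of 4.0, and the weight `alpha` are illustrative assumptions.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; a higher temperature gives a softer distribution."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.5):
    """alpha * cross-entropy(hard label) + (1 - alpha) * T^2 * KL(teacher || student).

    Hypothetical helper for illustration; temperature and alpha are assumed values.
    """
    # Task loss: ordinary cross-entropy of the student against the hard label.
    hard_loss = -math.log(softmax(student_logits)[label])
    # Distillation loss: KL divergence between the softened distributions.
    q_teacher = softmax(teacher_logits, temperature)
    q_student = softmax(student_logits, temperature)
    kl = sum(t * math.log(t / s) for t, s in zip(q_teacher, q_student))
    # The T^2 factor keeps the soft-target gradients on a comparable scale.
    return alpha * hard_loss + (1.0 - alpha) * temperature ** 2 * kl

teacher = [6.0, 2.0, -1.0]
good_student = [5.0, 1.5, -0.5]   # roughly agrees with the teacher
bad_student = [-1.0, 2.0, 6.0]    # ranks the classes in reverse
loss_good = distillation_loss(good_student, teacher, label=0)
loss_bad = distillation_loss(bad_student, teacher, label=0)
```

A student whose logits track the teacher's receives a lower loss than one that reverses the class ranking, which is the behaviour the two terms are meant to reward.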

arXiv:2304.02554 [pdf, other]

Human-like Summarization Evaluation with ChatGPT

Authors: Mingqi Gao, Jie Ruan, Renliang Sun, Xunjian Yin, Shiping Yang, Xiaojun Wan

Abstract: Evaluating text summarization is a challenging problem, and existing evaluation metrics are far from satisfactory. In this study, we explored ChatGPT's ability to perform human-like summarization evaluation using four human evaluation methods on five datasets. We found that ChatGPT was able to complete annotations relatively smoothly using Likert scale scoring, pairwise comparison, Pyramid, and binary factuality evaluation. Additionally, it outperformed commonly used automatic evaluation metrics on some datasets. Furthermore, we discussed the impact of different prompts, compared its performance with that of human evaluation, and analyzed the generated explanations and invalid responses.

Submitted 5 April, 2023; originally announced April 2023.

Comments: 9 pages, 5 figures, in process

arXiv:2304.00242

GLT-T++: Global-Local Transformer for 3D Siamese Tracking with Ranking Loss

Authors: Jiahao Nie, Zhiwei He, Yuxiang Yang, Xudong Lv, Mingyu Gao, Jing Zhang

Abstract: Siamese trackers based on 3D region proposal network (RPN) have shown remarkable success with deep Hough voting. However, using a single seed point feature as the cue for voting fails to produce high-quality 3D proposals. Additionally, the equal treatment of seed points in the voting process, regardless of their significance, exacerbates this limitation. To address these challenges, we propose a novel transformer-based voting scheme to generate better proposals. Specifically, a global-local transformer (GLT) module is devised to integrate object- and patch-aware geometric priors into seed point features, resulting in robust and accurate cues for offset learning of seed points. To train the GLT module, we introduce an importance prediction branch that learns the potential importance weights of seed points as a training constraint. Incorporating this transformer-based voting scheme into 3D RPN, a novel Siamese method dubbed GLT-T is developed for 3D single object tracking on point clouds. Moreover, we identify that the highest-scored proposal in the Siamese paradigm may not be the most accurate proposal, which limits tracking performance. To address this concern, we approach the binary score prediction task as a ranking problem, and design a target-aware ranking loss and a localization-aware ranking loss to produce accurate ranking of proposals. With the ranking losses, we further present GLT-T++, an enhanced version of GLT-T. Extensive experiments on multiple benchmarks demonstrate that our GLT-T and GLT-T++ outperform state-of-the-art methods in terms of tracking accuracy while maintaining a real-time inference speed. The source code will be made available at https://github.com/haooozi/GLT-T.

Submitted 16 December, 2023; v1 submitted 1 April, 2023; originally announced April 2023.

Comments: Need further revision

arXiv:2303.17814 [pdf, other]

doi 10.1364/OL.492067

OH absorption in on-chip high-Q resonators

Authors: Lue Wu, Maodong Gao, Jin-Yu Liu, Hao-Jing Chen, Kellan Colburn, Henry A. Blauvelt, Kerry J. Vahala

Abstract: Thermal silica is a common dielectric used in all silicon-photonic circuits. Bound hydroxyl ions (Si-OH) can provide a significant component of optical loss in this material on account of the wet nature of the thermal oxidation process. A convenient way to quantify this loss relative to other mechanisms is through OH absorption at 1380 nm. Here, using ultra-high-Q thermal-silica wedge microresonators, the OH absorption loss peak is measured and distinguished from the scattering loss baseline over a wavelength range from 680 nm to 1550 nm. Record-high on-chip resonator Q factors are observed for near-visible and visible wavelengths, and the absorption-limited Q factor is as high as 8 billion in the telecom band. An OH ion content level of around 2.4 ppm (weight) is inferred from both Q measurements and Secondary Ion Mass Spectroscopy (SIMS) depth profiling.

Submitted 31 March, 2023; originally announced March 2023.

Comments: 4 pages, 3 figures

arXiv:2303.16891 [pdf, other]

Mask-free OVIS: Open-Vocabulary Instance Segmentation without Manual Mask Annotations

Authors: Vibashan VS, Ning Yu, Chen Xing, Can Qin, Mingfei Gao, Juan Carlos Niebles, Vishal M. Patel, Ran Xu

Abstract: Existing instance segmentation models learn task-specific information using manual mask annotations from base (training) categories. These mask annotations require tremendous human effort, limiting the scalability to annotate novel (new) categories. To alleviate this problem, Open-Vocabulary (OV) methods leverage large-scale image-caption pairs and vision-language models to learn novel categories. In summary, an OV method learns task-specific information using strong supervision from base annotations and novel category information using weak supervision from image-caption pairs. This difference between strong and weak supervision leads to overfitting on base categories, resulting in poor generalization towards novel categories. In this work, we overcome this issue by learning both base and novel categories from pseudo-mask annotations generated by the vision-language model in a weakly supervised manner using our proposed Mask-free OVIS pipeline. Our method automatically generates pseudo-mask annotations by leveraging the localization ability of a pre-trained vision-language model for objects present in image-caption pairs. The generated pseudo-mask annotations are then used to supervise an instance segmentation model, freeing the entire pipeline from any labour-expensive instance-level annotations and overfitting. Our extensive experiments show that our method trained with just pseudo-masks significantly improves the mAP scores on the MS-COCO dataset and OpenImages dataset compared to the recent state-of-the-art methods trained with manual masks. Codes and models are provided at https://vibashan.github.io/ovis-web/.

Submitted 29 March, 2023; originally announced March 2023.

Comments: Accepted to CVPR 2023. Project site: https://vibashan.github.io/ovis-web/

arXiv:2303.13611 [pdf]

doi 10.1016/j.oret.2023.08.019

Perfused and Nonperfused Microaneurysms Identified and Characterized by Structural and Angiographic OCT

Authors: Min Gao, Tristan T. Hormel, Yukun Guo, Kotaro Tsuboi, Christina J. Flaxel, David Huang, Thomas S. Hwang, Yali Jia

Abstract: Purpose: Microaneurysms (MAs) have distinct, oval-shaped, hyperreflective walls on structural OCT, and inconsistent flow signal in the lumen with OCT angiography (OCTA). Their relationship to regional macular edema in diabetic retinopathy (DR) has not been quantitatively explored. Participants: A total of 99 participants, including 23 with mild NPDR, 25 with moderate NPDR, 34 with severe NPDR, and 17 with proliferative DR. Methods: We obtained 3 x 3-mm scans with a commercial device (Solix, Visionix/Optovue) in 99 patients with DR. Trained graders manually identified MAs and their location relative to the anatomic layers from cross-sectional OCT. Microaneurysms were first classified as perfused if flow signal was present in the OCTA channel. Then, perfused MAs were further classified into fully and partially perfused MAs based on the flow characteristics in en face OCTA. The presence of retinal fluid based on OCT near MAs was compared between perfused and nonperfused types. We also compared OCT-based MA detection to fundus photography (FP)- and fluorescein angiography (FA)-based detection. Results: We identified 308 MAs (166 fully perfused, 88 partially perfused, 54 nonperfused) in 42 eyes using OCT and OCTA. Nearly half of the MAs identified in this study straddle the inner nuclear layer and outer plexiform layer. Compared with partially perfused and nonperfused MAs, fully perfused MAs were more likely to be associated with local retinal fluid. The associated fluid volumes were larger with fully perfused MAs compared with other types. OCT/OCTA detected all MAs found on FP. Although not all MAs seen with FA were identified with OCT, some MAs seen with OCT were not visible with FA or FP. Conclusions: OCT-identified MAs with colocalized flow on OCTA are more likely to be associated with DME than those without flow.

Submitted 9 October, 2023; v1 submitted 23 March, 2023; originally announced March 2023.

Journal ref: Ophthalmology Retina 2023

arXiv:2303.06571 [pdf, other]

Gradient-Regulated Meta-Prompt Learning for Generalizable Vision-Language Models

Authors: Juncheng Li, Minghe Gao, Longhui Wei, Siliang Tang, Wenqiao Zhang, Mengze Li, Wei Ji, Qi Tian, Tat-Seng Chua, Yueting Zhuang

Abstract: Prompt tuning, a recently emerging paradigm, enables the powerful vision-language pre-training models to adapt to downstream tasks in a parameter- and data-efficient way, by learning the "soft prompts" to condition frozen pre-training models. Though effective, it is particularly problematic in the few-shot scenario, where prompt tuning performance is sensitive to the initialization and requires a time-consuming process to find a good initialization, thus restricting the fast adaptation ability of the pre-training models. In addition, prompt tuning could undermine the generalizability of the pre-training models, because the learnable prompt tokens are easy to overfit to the limited training samples. To address these issues, we introduce a novel Gradient-RegulAted Meta-prompt learning (GRAM) framework that jointly meta-learns an efficient soft prompt initialization for better adaptation and a lightweight gradient regulating function for strong cross-domain generalizability in a meta-learning paradigm using only the unlabeled image-text pre-training data. Rather than designing a specific prompt tuning method, our GRAM can be easily incorporated into various prompt tuning methods in a model-agnostic way, and comprehensive experiments show that GRAM brings about consistent improvement for them in several settings (i.e., few-shot learning, cross-domain generalization, cross-dataset generalization, etc.) over 11 datasets. Further, experiments show that GRAM enables the orthogonal methods of textual and visual prompt tuning to work in a mutually-enhanced way, offering better generalizability beyond the uni-modal prompt tuning methods.

Submitted 17 August, 2023; v1 submitted 12 March, 2023; originally announced March 2023.

Comments: Accepted by ICCV 2023

arXiv:2303.01203 [pdf, other]

Insight-HXMT and GECAM-C observations of the brightest-of-all-time GRB 221009A

Authors: Zheng-Hua An, S. Antier, Xing-Zi Bi, Qing-Cui Bu, Ce Cai, Xue-Lei Cao, Anna-Elisa Camisasca, Zhi Chang, Gang Chen, Li Chen, Tian-Xiang Chen, Wen Chen, Yi-Bao Chen, Yong Chen, Yu-Peng Chen, Michael W. Coughlin, Wei-Wei Cui, Zi-Gao Dai, T. Hussenot-Desenonges, Yan-Qi Du, Yuan-Yuan Du, Yun-Fei Du, Cheng-Cheng Fan, Filippo Frontera, He Gao, et al. (153 additional authors not shown)

Abstract: GRB 221009A is the brightest gamma-ray burst ever detected since the discovery of this kind of energetic explosion. However, an accurate measurement of the prompt emission properties of this burst is very challenging due to its exceptional brightness. With joint observations of Insight-HXMT and GECAM-C, we made an unprecedentedly accurate measurement of the emission during the first $\sim$1800 s of GRB 221009A, including its precursor, main emission (ME, which dominates the burst in flux), flaring emission and early afterglow, in the hard X-ray to soft gamma-ray band from $\sim$ 10 keV to $\sim$ 6 MeV. Based on the GECAM-C unsaturated data of the ME, we measure a record-breaking isotropic equivalent energy ($E_{\rm iso}$) of $\bf \sim 1.5 \times 10^{55}$ erg, which is about eight times the total rest-mass energy of the Sun. The early afterglow data require a significant jet break between 650 s and 1100 s, most likely at $\sim950$ s from the afterglow starting time $T_{AG}$, which corresponds to a jet opening angle of $\sim {0.7^\circ} \ (η_γ n)^{1/8}$, where $n$ is the ambient medium density in units of $\rm cm^{-3}$ and $η_γ$ is the ratio between $γ$-ray energy and afterglow kinetic energy. The beaming-corrected total $γ$-ray energy $E_γ$ is $\sim 1.15 \times10^{51} \ (η_γ n)^{1/4}$ erg, which is typical for long GRBs. These results suggest that this GRB may have a special central engine, which could launch and collimate a very narrowly beamed jet with an ordinary energy budget, leading to exceptionally luminous gamma-ray radiation per unit solid angle. Alternatively, more GRBs might have such a narrow and bright beam, which are missed by an unfavorable viewing angle or have been detected without distance measurement.

Submitted 3 March, 2023; v1 submitted 2 March, 2023; originally announced March 2023.

Comments: Submitted to National Science Review. This paper is under press embargo, contact the corresponding author for details
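
The two energy figures quoted in the abstract above can be cross-checked with a short back-of-the-envelope calculation. The sketch below assumes the standard beaming factor $f_b = 1 - \cos\theta_j$ for a jet of half-opening angle $\theta_j$ and sets $η_γ n = 1$; neither assumption is stated in the abstract, and the solar mass and speed of light are textbook values.

```latex
\begin{align*}
M_\odot c^2 &\approx (1.989\times10^{33}\,\mathrm{g})\,(2.998\times10^{10}\,\mathrm{cm\,s^{-1}})^2
             \approx 1.79\times10^{54}\,\mathrm{erg},\\
\frac{E_{\rm iso}}{M_\odot c^2} &\approx \frac{1.5\times10^{55}}{1.79\times10^{54}} \approx 8.4,\\
\theta_j &\approx 0.7^\circ \approx 1.22\times10^{-2}\,\mathrm{rad},\qquad
f_b = 1-\cos\theta_j \approx \frac{\theta_j^2}{2} \approx 7.5\times10^{-5},\\
E_\gamma &= f_b\,E_{\rm iso} \approx 7.5\times10^{-5}\times1.5\times10^{55}\,\mathrm{erg}
          \approx 1.1\times10^{51}\,\mathrm{erg}.
\end{align*}
```

Both results agree with the quoted "about eight times" and $\sim 1.15 \times 10^{51}$ erg to within the rounding of the opening angle. Because $f_b \propto \theta_j^2$, the $(η_γ n)^{1/8}$ scaling of the angle becomes the $(η_γ n)^{1/4}$ scaling of $E_γ$.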

arXiv:2303.00698 [pdf, other]

Cross calibration of gamma-ray detectors (GRD) of GECAM-C

Authors: Yan-Qiu Zhang, Shao-Lin Xiong, Rui Qiao, Dong-Ya Guo, Wen-Xi Peng, Xin-Qiao Li, Wang-Chen Xue, Chao Zheng, Jia-Cong Liu, Wen-Jun Tan, Chen-Wei Wang, Peng Zhang, ** Wang, Ce Cai, Shuo Xiao, Yue Huang, Pei-Yi Feng, Xiao-Bo Li, Li-Ming Song, Qi-Bin Yi, Yi Zhao, Zhi-Wei Guo, Jian-Jian He, Chao-Yang Li, Ya-Qing Liu, et al. (20 additional authors not shown)

Abstract: The gamma-ray detectors (GRDs) of GECAM-C onboard the SATech-01 satellite are designed to monitor gamma-ray transients all over the sky from 6 keV to 6 MeV. The energy response matrix is key to spectral measurements of bursts; it is usually generated from GEANT4 simulation and partially verified by the ground calibration. In this work, the energy response matrix of GECAM-C GRD is cross-calibrated with Fermi/GBM and Swift/BAT using a sample of Gamma-Ray Bursts (GRBs) and Soft Gamma-Ray Repeaters (SGRs). The calibration results show there is a good agreement between GECAM-C and other reasonably well calibrated instruments (i.e. Fermi/GBM and Swift/BAT). We also find that different GRD detectors of GECAM-C show consistency with each other. All these results indicate that GECAM-C GRD can provide reliable spectral measurements.

Submitted 1 March, 2023; originally announced March 2023.

Comments: preliminary version, will be updated soon

arXiv:2303.00687 [pdf, other]

Ground calibration of Gamma-Ray Detectors of GECAM-C

Authors: Chao Zheng, Zheng-Hua An, Wen-Xi Peng, Da-Li Zhang, Shao-Lin Xiong, Rui Qiao, Yan-Qiu Zhang, Wang-Chen Xue, Jia-Cong Liu, Pei-Yi Feng, Ce Cai, Min Gao, Ke Gong, Dong-Ya Guo, Dong-Jie Hou, Gang Li, Xin-Qiao Li, Yan-Guo Li, Mao-Shun Li, Xiao-Hua Liang, Ya-Qing Liu, Xiao-Jing Liu, Li-Ming Song, Xi-Lei Sun, Wen-Jun Tan, et al. (13 additional authors not shown)

Abstract: As a new member of the GECAM mission, GECAM-C (also named High Energy Burst Searcher, HEBS) was launched onboard the SATech-01 satellite on July 27th, 2022, and is capable of monitoring gamma-ray transients from $\sim$ 6 keV to 6 MeV. As the main detectors, 12 gamma-ray detectors (GRDs) are equipped on GECAM-C. In order to verify the GECAM-C GRD detector performance and to validate the Monte Carlo simulations of detector response, comprehensive on-ground calibration experiments have been performed using X-ray beam and radioactive sources, including Energy-Channel relation, energy resolution, detection efficiency, SiPM voltage-gain relation and the non-uniformity of positional response. In this paper, the detailed calibration campaigns and data analysis results for GECAM-C GRDs are presented, demonstrating the excellent performance of GECAM-C GRD detectors.

Submitted 30 May, 2023; v1 submitted 1 March, 2023; originally announced March 2023.

Comments: third version

arXiv:2303.00537 [pdf, other]

The performance of SiPM-based gamma-ray detector (GRD) of GECAM-C

Authors: Dali Zhang, Chao Zheng, Jiacong Liu, Zhenghua An, Chenwei Wang, Xiangyang Wen, Xinqiao Li, Xilei Sun, Ke Gong, Yaqing Liu, Xiaojing Liu, Sheng Yang, Wenxi Peng, Rui Qiao, Dongya Guo, Peiyi Feng, Yanqiu Zhang, Wangchen Xue, Wenjun Tan, Ce Cai, Shuo Xiao, Qibin Yi, Yanbing Xu, Min Gao, Jinzhou Wang, et al. (20 additional authors not shown)

Abstract: As a new member of the GECAM mission, GECAM-C (also called High Energy Burst Searcher, HEBS) is a gamma-ray all-sky monitor onboard the SATech-01 satellite, which was launched on July 27th, 2022 to detect gamma-ray transients from 6 keV to 6 MeV, such as Gamma-Ray Bursts (GRBs), high energy counterparts of Gravitational Waves (GWs) and Fast Radio Bursts (FRBs), and Soft Gamma-ray Repeaters (SGRs). Together with GECAM-A and GECAM-B launched in December 2020, GECAM-C will greatly improve the monitoring coverage, localization, as well as temporal and spectral measurements of gamma-ray transients. GECAM-C employs 12 SiPM-based Gamma-Ray Detectors (GRDs) to detect gamma-ray transients. In this paper, we first give a brief description of the design of GECAM-C GRDs, and then focus on the on-ground tests and in-flight performance of GRDs. We also carried out a comparison study of the SiPM in-flight performance between GECAM-C and GECAM-B. The results show GECAM-C GRD works as expected and is ready to make scientific observations.

Submitted 7 March, 2023; v1 submitted 1 March, 2023; originally announced March 2023.

Comments: 18 pages, 16 figures

arXiv:2303.00058 [pdf, other]

Neural Nonnegative Matrix Factorization for Hierarchical Multilayer Topic Modeling

Authors: Tyler Will, Runyu Zhang, Eli Sadovnik, Mengdi Gao, Joshua Vendrow, Jamie Haddock, Denali Molitor, Deanna Needell

Abstract: We introduce a new method based on nonnegative matrix factorization, Neural NMF, for detecting latent hierarchical structure in data. Datasets with hierarchical structure arise in a wide variety of fields, such as document classification, image processing, and bioinformatics. Neural NMF recursively applies NMF in layers to discover overarching topics encompassing the lower-level features. We derive a backpropagation optimization scheme that allows us to frame hierarchical NMF as a neural network. We test Neural NMF on a synthetic hierarchical dataset, the 20 Newsgroups dataset, and the MyLymeData symptoms dataset. Numerical results demonstrate that Neural NMF outperforms other hierarchical NMF methods on these data sets and offers better learned hierarchical structure and interpretability of topics.

Submitted 28 February, 2023; originally announced March 2023.
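
The layered structure described in the abstract above, where NMF is applied again to the output of a previous NMF, can be illustrated with a plain sequential two-layer factorization: X ≈ W1·H1, then H1 ≈ W2·H2. The sketch below uses the classic multiplicative update rules in pure Python. It is only the naive hierarchical baseline, not the paper's backpropagation scheme, and the helper names, ranks, and toy data are illustrative assumptions.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def frob_error(x, w, h):
    """Frobenius norm of X - W @ H."""
    wh = matmul(w, h)
    return sum((xi - yi) ** 2 for rx, ry in zip(x, wh) for xi, yi in zip(rx, ry)) ** 0.5

def nmf(x, rank, iters=200, seed=0, eps=1e-9):
    """Factor a nonnegative m x n matrix X as W (m x rank) @ H (rank x n)
    using multiplicative updates, which keep every entry nonnegative."""
    rng = random.Random(seed)
    m, n = len(x), len(x[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(m)]
    h = [[rng.random() + 0.1 for _ in range(n)] for _ in range(rank)]
    for _ in range(iters):
        # H <- H * (W^T X) / (W^T W H)
        wt = transpose(w)
        num, den = matmul(wt, x), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(n)]
             for i in range(rank)]
        # W <- W * (X H^T) / (W H H^T)
        ht = transpose(h)
        num, den = matmul(x, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(m)]
    return w, h

# Toy nonnegative data: 6 "words" by 8 "documents" (made-up values).
data_rng = random.Random(1)
x = [[data_rng.random() for _ in range(8)] for _ in range(6)]

w1, h1 = nmf(x, rank=4)    # layer 1: four fine-grained topics
w2, h2 = nmf(h1, rank=2)   # layer 2: two supertopics grouping the four
```

Here `w2` relates each of the four lower-level topics to the two supertopics, so X is approximated by W1·W2·H2 at the coarser level.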

arXiv:2302.14307 [pdf, other]

GradMA: A Gradient-Memory-based Accelerated Federated Learning with Alleviated Catastrophic Forgetting

Authors: Kangyang Luo, Xiang Li, Yunshi Lan, Ming Gao

Abstract: Federated Learning (FL) has emerged as a de facto machine learning area and received rapidly increasing research interest from the community. However, catastrophic forgetting caused by data heterogeneity and partial participation poses distinctive challenges for FL, which are detrimental to the performance. To tackle the problems, we propose a new FL approach (namely GradMA), which takes inspiration from continual learning to simultaneously correct the server-side and worker-side update directions as well as take full advantage of the server's rich computing and memory resources. Furthermore, we elaborate a memory reduction strategy to enable GradMA to accommodate FL with a large scale of workers. We then analyze convergence of GradMA theoretically under the smooth non-convex setting and show that its convergence rate achieves a linear speed up w.r.t. the increasing number of sampled active workers. Finally, our extensive experiments on various image classification tasks show that GradMA achieves significant performance gains in accuracy and communication efficiency compared to SOTA baselines.

Submitted 15 March, 2023; v1 submitted 27 February, 2023; originally announced February 2023.

arXiv:2302.14286 [pdf, other]

HugNLP: A Unified and Comprehensive Library for Natural Language Processing

Authors: Jianing Wang, Nuo Chen, Qiushi Sun, Wenkang Huang, Chengyu Wang, Ming Gao

Abstract: In this paper, we introduce HugNLP, a unified and comprehensive library for natural language processing (NLP) with the prevalent backend of HuggingFace Transformers, which is designed for NLP researchers to easily utilize off-the-shelf algorithms and develop novel methods with user-defined models and tasks in real-world scenarios. HugNLP consists of a hierarchical structure including models, processors and applications that unifies the learning process of pre-trained language models (PLMs) on different NLP tasks. Additionally, we present some featured NLP applications to show the effectiveness of HugNLP, such as knowledge-enhanced PLMs, universal information extraction, low-resource mining, and code understanding and generation, etc. The source code will be released on GitHub (https://github.com/wjn1996/HugNLP).

Submitted 27 February, 2023; originally announced February 2023.

Comments: 8 Pages

arXiv:2302.08659 [pdf, other]

Uncertainty-aware Self-training for Low-resource Neural Sequence Labeling

Authors: Jianing Wang, Chengyu Wang, Jun Huang, Ming Gao, Aoying Zhou

Abstract: Neural sequence labeling (NSL) aims at assigning labels for input language tokens, which covers a broad range of applications, such as named entity recognition (NER) and slot filling, etc. However, the satisfying results achieved by traditional supervised-based approaches heavily depend on the large amounts of human annotation data, which may not be feasible in real-world scenarios due to data privacy and computation efficiency issues. This paper presents SeqUST, a novel uncertain-aware self-training framework for NSL to address the labeled data scarcity issue and to effectively utilize unlabeled data. Specifically, we incorporate Monte Carlo (MC) dropout in Bayesian neural network (BNN) to perform uncertainty estimation at the token level and then select reliable language tokens from unlabeled data based on the model confidence and certainty. A well-designed masked sequence labeling task with a noise-robust loss supports robust training, which aims to suppress the problem of noisy pseudo labels. In addition, we develop a Gaussian-based consistency regularization technique to further improve the model robustness on Gaussian-distributed perturbed representations. This effectively alleviates the over-fitting dilemma originating from pseudo-labeled augmented data. Extensive experiments over six benchmarks demonstrate that our SeqUST framework effectively improves the performance of self-training, and consistently outperforms strong baselines by a large margin in low-resource scenarios.

Submitted 16 February, 2023; originally announced February 2023.

Comments: 11 pages, 3 figures

Journal ref: AAAI 2023

arXiv:2302.08203 [pdf, other]

A Superdirective Beamforming Approach with Impedance Coupling and Field Coupling for Compact Antenna Arrays

Authors: Liangcheng Han, Haifan Yin, Mengying Gao, Jingcheng Xie

Abstract: In most multiple-input multiple-output (MIMO) communication systems, the antenna spacing is generally no less than half a wavelength. It helps to reduce the mutual coupling and therefore facilitate the system design. The maximum array gain equals the number of antennas in this setting. However, when the antenna spacing is made very small, the array gain of a compact array can be proportional to the square of the number of antennas - a value much larger than the traditional array. To achieve this so-called "superdirectivity" however, the calculation of the excitation coefficients (beamforming vector) is known to be a challenging problem. In this paper, we address this problem with a novel double coupling-based superdirective beamforming method. In particular, we categorize the antenna coupling effects to impedance coupling and field coupling. By characterizing these two coupling in model, we derive the beamforming vector for superdirective arrays. In order to obtain the field coupling matrix, we propose a spherical wave expansion approach, which is effective in both simulations and realistic scenarios. Moreover, a prototype of the independently controlled superdirective antenna array is developed. Full-wave electromagnetic simulations and real-world experiments validate the effectiveness of our proposed approaches, and superdirectivity is achieved in reality by a compact array with 4 and 5 dipole antennas.

Submitted 16 February, 2023; originally announced February 2023.

Comments: arXiv admin note: text overlap with arXiv:2204.11547

arXiv:2302.07739 [pdf, other]

Meta-Learning Triplet Network with Adaptive Margins for Few-Shot Named Entity Recognition

Authors: Chengcheng Han, Renyu Zhu, Jun Kuang, FengJiao Chen, Xiang Li, Ming Gao, Xuezhi Cao, Wei Wu

Abstract: Meta-learning methods have been widely used in few-shot named entity recognition (NER), especially prototype-based methods. However, the Other(O) class is difficult to be represented by a prototype vector because there are generally a large number of samples in the class that have miscellaneous semantics. To solve the problem, we propose MeTNet, which generates prototype vectors for entity types only but not O-class. We design an improved triplet network to map samples and prototype vectors into a low-dimensional space that is easier to be classified and propose an adaptive margin for each entity type. The margin plays as a radius and controls a region with adaptive size in the low-dimensional space. Based on the regions, we propose a new inference procedure to predict the label of a query instance. We conduct extensive experiments in both in-domain and cross-domain settings to show the superiority of MeTNet over other state-of-the-art methods. In particular, we release a Chinese few-shot NER dataset FEW-COMM extracted from a well-known e-commerce platform. To the best of our knowledge, this is the first Chinese few-shot NER dataset. All the datasets and codes are provided at https://github.com/hccngu/MeTNet.

Submitted 14 February, 2023; originally announced February 2023.

arXiv:2302.05052 [pdf, other]

Debiasing Recommendation by Learning Identifiable Latent Confounders

Authors: Qing Zhang, Xiaoying Zhang, Yang Liu, Hongning Wang, Min Gao, Jiheng Zhang, Ruocheng Guo

Abstract: Recommendation systems aim to predict users' feedback on items not exposed to them. Confounding bias arises due to the presence of unmeasured variables (e.g., the socio-economic status of a user) that can affect both a user's exposure and feedback. Existing methods either (1) make untenable assumptions about these unmeasured variables or (2) directly infer latent confounders from users' exposure. However, they cannot guarantee the identification of counterfactual feedback, which can lead to biased predictions. In this work, we propose a novel method, i.e., identifiable deconfounder (iDCF), which leverages a set of proxy variables (e.g., observed user features) to resolve the aforementioned non-identification issue. The proposed iDCF is a general deconfounded recommendation framework that applies proximal causal inference to infer the unmeasured confounders and identify the counterfactual feedback with theoretical guarantees. Extensive experiments on various real-world and synthetic datasets verify the proposed method's effectiveness and robustness.

Submitted 15 June, 2023; v1 submitted 10 February, 2023; originally announced February 2023.

arXiv:2302.03507 [pdf, other]

Meta-Learning Siamese Network for Few-Shot Text Classification

Authors: Chengcheng Han, Yuhe Wang, Yingnan Fu, Xiang Li, Minghui Qiu, Ming Gao, Aoying Zhou

Abstract: Few-shot learning has been used to tackle the problem of label scarcity in text classification, of which meta-learning based methods have shown to be effective, such as the prototypical networks (PROTO). Despite the success of PROTO, there still exist three main problems: (1) ignore the randomness of the sampled support sets when computing prototype vectors; (2) disregard the importance of labeled samples; (3) construct meta-tasks in a purely random manner. In this paper, we propose a Meta-Learning Siamese Network, namely, Meta-SN, to address these issues. Specifically, instead of computing prototype vectors from the sampled support sets, Meta-SN utilizes external knowledge (e.g. class names and descriptive texts) for class labels, which is encoded as the low-dimensional embeddings of prototype vectors. In addition, Meta-SN presents a novel sampling strategy for constructing meta-tasks, which gives higher sampling probabilities to hard-to-classify samples. Extensive experiments are conducted on six benchmark datasets to show the clear superiority of Meta-SN over other state-of-the-art models. For reproducibility, all the datasets and codes are provided at https://github.com/hccngu/Meta-SN.

Submitted 16 March, 2023; v1 submitted 5 February, 2023; originally announced February 2023.

arXiv:2302.00162 [pdf, other]

Continual Segment: Towards a Single, Unified and Accessible Continual Segmentation Model of 143 Whole-body Organs in CT Scans

Authors: Zhanghexuan Ji, Dazhou Guo, Puyang Wang, Ke Yan, Le Lu, Minfeng Xu, Jingren Zhou, Qifeng Wang, Jia Ge, Mingchen Gao, Xianghua Ye, Dakai Jin

Abstract: Deep learning empowers the mainstream medical image segmentation methods. Nevertheless current deep segmentation approaches are not capable of efficiently and effectively adapting and updating the trained models when new incremental segmentation classes (along with new training datasets or not) are required to be added. In real clinical environment, it can be preferred that segmentation models could be dynamically extended to segment new organs/tumors without the (re-)access to previous training datasets due to obstacles of patient privacy and data storage. This process can be viewed as a continual semantic segmentation (CSS) problem, being understudied for multi-organ segmentation. In this work, we propose a new architectural CSS learning framework to learn a single deep segmentation model for segmenting a total of 143 whole-body organs. Using the encoder/decoder network structure, we demonstrate that a continually-trained then frozen encoder coupled with incrementally-added decoders can extract and preserve sufficiently representative image features for new classes to be subsequently and validly segmented. To maintain a single network model complexity, we trim each decoder progressively using neural architecture search and teacher-student based knowledge distillation. To incorporate with both healthy and pathological organs appearing in different datasets, a novel anomaly-aware and confidence learning module is proposed to merge the overlapped organ predictions, originated from different decoders. Trained and validated on 3D CT scans of 2500+ patients from four datasets, our single network can segment total 143 whole-body organs with very high accuracy, closely reaching the upper bound performance level by training four separate segmentation models (i.e., one model per dataset/task).

Submitted 3 September, 2023; v1 submitted 31 January, 2023; originally announced February 2023.

arXiv:2301.12649 [pdf, other]

Convergence of uncertainty estimates in Ensemble and Bayesian sparse model discovery

Authors: L. Mars Gao, Urban Fasel, Steven L. Brunton, J. Nathan Kutz

Abstract: Sparse model identification enables nonlinear dynamical system discovery from data. However, the control of false discoveries for sparse model identification is challenging, especially in the low-data and high-noise limit. In this paper, we perform a theoretical study on ensemble sparse model discovery, which shows empirical success in terms of accuracy and robustness to noise. In particular, we analyse the bootstrapping-based sequential thresholding least-squares estimator. We show that this bootstrapping-based ensembling technique can perform a provably correct variable selection procedure with an exponential convergence rate of the error rate. In addition, we show that the ensemble sparse model discovery method can perform computationally efficient uncertainty estimation, compared to expensive Bayesian uncertainty quantification methods via MCMC. We demonstrate the convergence properties and connection to uncertainty quantification in various numerical studies on synthetic sparse linear regression and sparse model discovery. The experiments on sparse linear regression support that the bootstrapping-based sequential thresholding least-squares method has better performance for sparse variable selection compared to LASSO, thresholding least-squares, and bootstrapping-based LASSO. In the sparse model discovery experiment, we show that the bootstrapping-based sequential thresholding least-squares method can provide valid uncertainty quantification, converging to a delta measure centered around the true value with increased sample sizes. Finally, we highlight the improved robustness to hyperparameter selection under shifting noise and sparsity levels of the bootstrapping-based sequential thresholding least-squares method compared to other sparse regression methods.

Submitted 26 April, 2023; v1 submitted 29 January, 2023; originally announced January 2023.

Comments: 32 pages, 7 figures

arXiv:2301.12458 [pdf, other]

SeeGera: Self-supervised Semi-implicit Graph Variational Auto-encoders with Masking

Authors: Xiang Li, Tiandi Ye, Caihua Shan, Dongsheng Li, Ming Gao

Abstract: Generative graph self-supervised learning (SSL) aims to learn node representations by reconstructing the input graph data. However, most existing methods focus on unsupervised learning tasks only and very few work has shown its superiority over the state-of-the-art graph contrastive learning (GCL) models, especially on the classification task. While a very recent model has been proposed to bridge the gap, its performance on unsupervised learning tasks is still unknown. In this paper, to comprehensively enhance the performance of generative graph SSL against other GCL models on both unsupervised and supervised learning tasks, we propose the SeeGera model, which is based on the family of self-supervised variational graph auto-encoder (VGAE). Specifically, SeeGera adopts the semi-implicit variational inference framework, a hierarchical variational framework, and mainly focuses on feature reconstruction and structure/feature masking. On the one hand, SeeGera co-embeds both nodes and features in the encoder and reconstructs both links and features in the decoder. Since feature embeddings contain rich semantic information on features, they can be combined with node embeddings to provide fine-grained knowledge for feature reconstruction. On the other hand, SeeGera adds an additional layer for structure/feature masking to the hierarchical variational framework, which boosts the model generalizability. We conduct extensive experiments comparing SeeGera with 9 other state-of-the-art competitors. Our results show that SeeGera can compare favorably against other state-of-the-art GCL methods in a variety of unsupervised and supervised learning tasks.

Submitted 7 February, 2023; v1 submitted 29 January, 2023; originally announced January 2023.

Comments: Accepted by WebConf 2023

arXiv:2301.12324 [pdf, ps, other]

doi 10.35848/1882-0786/acdc02

Cubic C$_{20}$: An intrinsic superconducting carbon allotrope

Authors: Ying Yu, Xun-Wang Yan, Fengjie Ma, Miao Gao, Zhong-Yi Lu

Abstract: Finding intrinsic carbon superconductor is an interesting topic. Based on density functional first-principles calculations, we first study the phonon-mediated superconductivity in a cubic metallic carbon allotrope, namely sc-C$_{20}$, which has been synthesized in experiment. The electron-phonon coupling is accurately computed with Wannier interpolation method. By solving the Eliashberg equations, we predict that sc-C$_{20}$ is an intrinsic carbon superconductor, without introducing any guest atoms or doping, whose transition temperature is determined to be about 24 K. Our findings enrich the family of carbon-based superconductors.

Submitted 24 June, 2023; v1 submitted 28 January, 2023; originally announced January 2023.

Comments: 4 pages, 5 figures

Journal ref: Appl. Phys. Express 16, 063003 (2023)

arXiv:2301.12320 [pdf, ps, other]

doi 10.1103/PhysRevB.107.L180501

Stabilizing a hydrogen-rich superconductor at 1 GPa by the charge-transfer modulated virtual high-pressure effect

Authors: Miao Gao, Peng-Jie Guo, Huan-Cheng Yang, Xun-Wang Yan, Fengjie Ma, Zhong-Yi Lu, Tao Xiang, Hai-Qing Lin

Abstract: Applying pressure around megabar is indispensable in the synthesis of high-temperature superconducting hydrides, such as SH$_3$ and LaH$_{10}$. Stabilizing the high-pressure phase of hydride around ambient condition is a severe challenge. Based on the density-functional theory calculations, we give the first example that the structure of hydride CaBH$_5$ predicted above 280 GPa, can maintain its dynamical stability with pressure down to 1 GPa, by modulating the charge transfer from metal atoms to hydrogen atoms via the replacement of Ca with alkali metal atoms e.g. Cs, in which the [BH$_5$]$^{2-}$ anion shrinks along $c$ axis and expands in the $ab$ plane, experiencing an anisotropic virtual high pressure. This mechanism, namely charge transfer modulated virtual high pressure effect, plays a vital role in enhancing the structural stability and leading to the reemergence of ambient-pressure-forbidden [BH$_5$]$^{2-}$ anion around 1 GPa in CsBH$_5$. Moreover, we find that CsBH$_5$ is a strongly coupled superconductor, with transition temperature as high as 98 K, well above the liquid-nitrogen temperature. Our findings provide a novel mechanism to reduce the critical pressure required by hydrogen-rich compound without changing its crystal structure, and also shed light on searching ambient-pressure high-temperature superconductivity in metal borohydrides.

Submitted 5 May, 2023; v1 submitted 28 January, 2023; originally announced January 2023.

Comments: accepted for publication as a Letter in Phys. Rev. B

Journal ref: Phys. Rev. B 107, L180501 (2023)

arXiv:2301.10976 [pdf, other]

Soliton pulse pairs at multiple colors in normal dispersion microresonators

Authors: Zhiquan Yuan, Maodong Gao, Yan Yu, Heming Wang, Warren Jin, Qing-Xin Ji, Avi Feshali, Mario Paniccia, John Bowers, Kerry Vahala

Abstract: Soliton microcombs are helping to advance the miniaturization of a range of comb systems. These combs mode lock through the formation of short temporal pulses in anomalous dispersion resonators. Here, a new microcomb is demonstrated that mode locks through the formation of pulse pairs in normal-dispersion coupled-ring resonators. Unlike conventional microcombs, pulses in this system cannot exist alone, and instead must phase lock in pairs to form a bright soliton comb. Also, the pulses can form at recurring spectral windows and the pulses in each pair feature different optical spectra. This pairwise mode-locking modality extends to higher dimensions and we demonstrate 3-ring systems in which 3 pulses mode lock through alternating pairwise pulse coupling. The results are demonstrated using the new CMOS-foundry platform that has not previously produced bright solitons on account of its inherent normal dispersion. The ability to generate multi-color pulse pairs over multiple rings is an important new feature for microcombs. It can extend the concept of all-optical soliton buffers and memories to multiple storage rings that multiplex pulses with respect to soliton color and that are spatially addressable. The results also suggest a new platform for the study of quantum combs and topological photonics.

Submitted 26 January, 2023; originally announced January 2023.

arXiv:2301.10969 [pdf, ps, other]

Engineered zero-dispersion microcombs using CMOS-ready photonics

Authors: Qing-Xin Ji, Warren Jin, Lue Wu, Yan Yu, Zhiquan Yuan, Wei Zhang, Maodong Gao, Bohan Li, Heming Wang, Chao Xiang, Joel Guo, Avi Feshali, Mario Paniccia, Vladimir S. Ilchenko, Andrey B. Matsko, John Bowers, Kerry Vahala

Abstract: Normal group velocity dispersion (GVD) microcombs offer high comb line power and high pumping efficiency compared to bright pulse microcombs. The recent demonstration of normal GVD microcombs using CMOS-foundry-produced microresonators is an important step towards scalable production. However, the chromatic dispersion of CMOS devices is large and impairs generation of broadband microcombs. Here, we report the development of a microresonator in which GVD is reduced due to a couple-ring resonator configuration. Operating in the turnkey self-injection-locking mode, the resonator is hybridly integrated with a semiconductor laser pump to produce high-power-efficiency combs spanning a bandwidth of 9.9 nm (1.22 THz) centered at 1560 nm, corresponding to 62 comb lines. Fast, linear optical sampling of the comb waveform is used to observe the rich set of near-zero GVD comb behaviors, including soliton molecules, switching waves (platicons) and their hybrids. Tuning of the 20 GHz repetition rate by electrical actuation enables servo locking to a microwave reference, which simultaneously stabilizes the comb repetition rate, offset frequency and temporal waveform. This hybridly integrated system could be used in coherent communications or for ultra-stable microwave signal generation by two-point optical frequency division.

Submitted 26 January, 2023; originally announced January 2023.

Comments: 8 pages, 4 figures

arXiv:2212.08418 [pdf, other]

rWiFiSLAM: Effective WiFi Ranging based SLAM System in Ambient Environments

Authors: Bo Wei, Mingcen Gao, Chengwen Luo, Sen Wang, Jin Zhang

Abstract: In this paper, we propose rWiFiSLAM, an indoor localisation system based on WiFi ranging measurements. Indoor localisation techniques play an important role in mobile robots when they cannot access good quality GPS signals in indoor environments. Indoor localisation also has many other applications, such as rescue, smart buildings, etc. Inertial Measurement Units (IMU) have been used for Pedestrian Dead Reckoning (PDR) to provide localisation services in the indoor environment as it does not rely on any other signals. Although PDR is a promising technique, it still suffers from unavoidable noise and bias from IMUs in mobile devices. Loop closure is necessary for these scenarios. In this paper, we design an efficient loop closure mechanism based on WiFi ranging measurements along with IMU measurements in a robust pose graph SLAM framework for indoor localisation. One novelty of the proposed method is that we remove the requirement of the full knowledge of the WiFi access point locations, which makes our proposed method feasible for new and/or dynamic environments. We evaluate our designed system in real environments and show the proposed method can achieve sub-meter localisation accuracy and improve the localisation performance by more than 90% compared with the IMU based PDR.

Submitted 16 December, 2022; originally announced December 2022.

arXiv:2212.05171 [pdf, other]

ULIP: Learning a Unified Representation of Language, Images, and Point Clouds for 3D Understanding

Authors: Le Xue, Mingfei Gao, Chen Xing, Roberto Martín-Martín, Jiajun Wu, Caiming Xiong, Ran Xu, Juan Carlos Niebles, Silvio Savarese

Abstract: The recognition capabilities of current state-of-the-art 3D models are limited by datasets with a small number of annotated data and a pre-defined set of categories. In its 2D counterpart, recent advances have shown that similar problems can be significantly alleviated by employing knowledge from other modalities, such as language. Inspired by this, leveraging multimodal information for 3D modality could be promising to improve 3D understanding under the restricted data regime, but this line of research is not well studied. Therefore, we introduce ULIP to learn a unified representation of images, texts, and 3D point clouds by pre-training with object triplets from the three modalities. To overcome the shortage of training triplets, ULIP leverages a pre-trained vision-language model that has already learned a common visual and textual space by training with massive image-text pairs. Then, ULIP learns a 3D representation space aligned with the common image-text space, using a small number of automatically synthesized triplets. ULIP is agnostic to 3D backbone networks and can easily be integrated into any 3D architecture. Experiments show that ULIP effectively improves the performance of multiple recent 3D backbones by simply pre-training them on ShapeNet55 using our framework, achieving state-of-the-art performance in both standard 3D classification and zero-shot 3D classification on ModelNet40 and ScanObjectNN. ULIP also improves the performance of PointMLP by around 3% in 3D classification on ScanObjectNN, and outperforms PointCLIP by 28.8% on top-1 accuracy for zero-shot 3D classification on ModelNet40. Our code and pre-trained models are released at https://github.com/salesforce/ULIP.

Submitted 12 June, 2023; v1 submitted 9 December, 2022; originally announced December 2022.

Comments: Accepted by CVPR 2023

arXiv:2212.04240 [pdf, ps, other]

A generalization of a lemma of Boccardo and Orsina and application

Authors: Hongya Gao, Meng Gao, Siyu Gao

Abstract: We present a generalization of a technical lemma due to Boccardo and Orsina, and then give an application to regularity of minima for integral functionals noncoercive in the energy space.

Submitted 8 December, 2022; originally announced December 2022.

Comments: 12 pages

MSC Class: 49N60; 35J70

arXiv:2212.00532 [pdf, other]

EBHI-Seg: A Novel Enteroscope Biopsy Histopathological Haematoxylin and Eosin Image Dataset for Image Segmentation Tasks

Authors: Liyu Shi, Xiaoyan Li, Weiming Hu, Haoyuan Chen, Jing Chen, Zizhen Fan, Minghe Gao, Yujie Jing, Guotao Lu, Deguo Ma, Zhiyu Ma, Qingtao Meng, Dechao Tang, Hongzan Sun, Marcin Grzegorzek, Shouliang Qi, Yueyang Teng, Chen Li

Abstract: Background and Purpose: Colorectal cancer is a common fatal malignancy, the fourth most common cancer in men, and the third most common cancer in women worldwide. Timely detection of cancer in its early stages is essential for treating the disease. Currently, there is a lack of datasets for histopathological image segmentation of rectal cancer, which often hampers the assessment accuracy when computer technology is used to aid in diagnosis. Methods: This present study provided a new publicly available Enteroscope Biopsy Histopathological Hematoxylin and Eosin Image Dataset for Image Segmentation Tasks (EBHI-Seg). To demonstrate the validity and extensiveness of EBHI-Seg, the experimental results for EBHI-Seg are evaluated using classical machine learning methods and deep learning methods. Results: The experimental results showed that deep learning methods had a better image segmentation performance when utilizing EBHI-Seg. The maximum accuracy of the Dice evaluation metric for the classical machine learning method is 0.948, while the Dice evaluation metric for the deep learning method is 0.965. Conclusion: This publicly available dataset contained 5,170 images of six types of tumor differentiation stages and the corresponding ground truth images. The dataset can provide researchers with new segmentation algorithms for medical diagnosis of colorectal cancer, which can be used in the clinical setting to help doctors and patients.

Submitted 6 December, 2022; v1 submitted 1 December, 2022; originally announced December 2022.

arXiv:2212.00337 [pdf, other]

Fault Models in Superconducting quantum circuits

Authors: Qifan Huang, Boxi Li, Minbo Gao, Mingsheng Ying

Abstract: Fault models are indispensable for many EDA tasks, so as for design and implementation of quantum hardware. In this article, we propose a fault model for superconducting quantum systems. Our fault model reflects the real fault behavior in control signals and structure of quantum systems. Based on it, we conduct fault simulation on controlled-Z gate and quantum circuits by QuTiP. We provide fidelity benchmarks for incoherent faults and test patterns of minimal test repetitions for coherent faults. Results show that with 34 test repetitions a 10% control noise can be detected, which help to save test time and memory.

Submitted 1 December, 2022; originally announced December 2022.

Comments: 7 pages, 10 figures

arXiv:2211.15570 [pdf, other]

doi 10.3847/1538-4365/acafeb

GECAM Localization of High Energy Transients and the Systematic Error

Authors: Yi Zhao, Wang-Chen Xue, Shao-Lin Xiong, Yuan-Hao Wang, Jia-Cong Liu, Qi Liuo, Yan-Qiu Zhang, Jian-Chao Sun, Xiao-Yun Zhao, Ce Cai, Shuo Xiao, Yue Huang, Xiao-Bo Li, Zhen Zhang, Jin-Yuan Liao, Sheng Yang, Rui Qiao, Dong-Ya Guo, Chao Zheng, Qi-Bin Yi, Sheng-Lun Xie, Zhi-Wei Guo, Chao-Yang Li, Chen-Wei Wang, Wen-Jun Tan, et al. (41 additional authors not shown)

Abstract: Gravitational wave high-energy Electromagnetic Counterpart All-sky Monitor (GECAM) is a pair of microsatellites (i.e. GECAM-A and GECAM-B) dedicated to monitoring gamma-ray transients including gravitational waves high-energy electromagnetic counterparts, Gamma-ray Bursts, Soft Gamma-ray Repeaters, Solar Flares and Terrestrial Gamma-ray Flashes. Since launch in December 2020, GECAM-B has detected hundreds of astronomical and terrestrial events. For these bursts, localization is the key for burst identification and classification as well as follow-up observations in multi-wavelength. Here, we propose a Bayesian localization method with Poisson data with Gaussian background profile likelihood to localize GECAM bursts based on the burst counts distribution in detectors with different orientations. We demonstrate that this method can work well for all kinds of bursts, especially for extremely short ones. In addition, we propose a new method to estimate the systematic error of localization based on a confidence level test, which can overcome some problems of the existing method in literature. We validate this method by Monte Carlo simulations, and then apply it to a burst sample with accurate location and find that the mean value of the systematic error of GECAM-B localization is $\sim 2.5^{\circ}$. By considering this systematic error, we can obtain a reliable localization probability map for GECAM bursts. Our methods can be applied to other gamma-ray monitors.

Submitted 23 December, 2022; v1 submitted 28 November, 2022; originally announced November 2022.

Comments: The paper has been accepted by Astrophysical Journal Supplement Series

arXiv:2211.14772 [pdf, ps, other]

doi 10.1088/1674-1056/acae7b

High-temperature ferromagnetism and strong $π$-conjugation feature in two-dimensional manganese tetranitride

Authors: Ming Yan, Z. Y. Xie, Miao Gao

Abstract: Two-dimensional (2D) magnetic materials have attracted tremendous research interest because of the promising application in the next-generation microelectronic devices. Here, by the first-principles calculations, we propose a two-dimensional ferromagnetic material with high Curie temperature, manganese tetranitride MnN$_4$ monolayer, which is a square-planar lattice made up of only one layer of atoms. The structure is demonstrated to be stable by the phonon spectra and the molecular dynamic simulations, and the stability is ascribed to the $π$-d conjugation between $π$ orbital of N=N bond and Mn $d$ orbital. More interestingly, the MnN$_4$ monolayer displays robust 2D ferromagnetism, which originates from the strong exchange couplings between Mn atoms due to the $π$-d conjugation. The high critical temperature of 247 K is determined by solving the Heisenberg model with the Monte Carlo method.

Submitted 27 November, 2022; originally announced November 2022.

Journal ref: Chin. Phys. B 32, 037104 (2023)

arXiv:2211.10927 [pdf, other]

GLT-T: Global-Local Transformer Voting for 3D Single Object Tracking in Point Clouds

Authors: Jiahao Nie, Zhiwei He, Yuxiang Yang, Mingyu Gao, Jing Zhang

Abstract: Current 3D single object tracking methods are typically based on VoteNet, a 3D region proposal network. Despite the success, using a single seed point feature as the cue for offset learning in VoteNet prevents high-quality 3D proposals from being generated. Moreover, seed points with different importance are treated equally in the voting process, aggravating this defect. To address these issues, we propose a novel global-local transformer voting scheme to provide more informative cues and guide the model pay more attention on potential seed points, promoting the generation of high-quality 3D proposals. Technically, a global-local transformer (GLT) module is employed to integrate object- and patch-aware prior into seed point features to effectively form strong feature representation for geometric positions of the seed points, thus providing more robust and accurate cues for offset learning. Subsequently, a simple yet effective training strategy is designed to train the GLT module. We develop an importance prediction branch to learn the potential importance of the seed points and treat the output weights vector as a training constraint term. By incorporating the above components together, we exhibit a superior tracking method GLT-T. Extensive experiments on challenging KITTI and NuScenes benchmarks demonstrate that GLT-T achieves state-of-the-art performance in the 3D single object tracking task. Besides, further ablation studies show the advantages of the proposed global-local transformer voting scheme over the original VoteNet. Code and models will be available at https://github.com/haooozi/GLT-T.

Submitted 20 November, 2022; originally announced November 2022.

Comments: Accepted to AAAI 2023. The source code and models will be available at https://github.com/haooozi/GLT-T

arXiv:2211.10575 [pdf, other]

Bayesian autoencoders for data-driven discovery of coordinates, governing equations and fundamental constants

Authors: L. Mars Gao, J. Nathan Kutz

Abstract: Recent progress in autoencoder-based sparse identification of nonlinear dynamics (SINDy) under $\ell_1$ constraints allows joint discoveries of governing equations and latent coordinate systems from spatio-temporal data, including simulated video frames. However, it is challenging for $\ell_1$-based sparse inference to perform correct identification for real data due to the noisy measurements and often limited sample sizes. To address the data-driven discovery of physics in the low-data and high-noise regimes, we propose Bayesian SINDy autoencoders, which incorporate a hierarchical Bayesian sparsifying prior: Spike-and-slab Gaussian Lasso. Bayesian SINDy autoencoder enables the joint discovery of governing equations and coordinate systems with a theoretically guaranteed uncertainty estimate. To resolve the challenging computational tractability of the Bayesian hierarchical setting, we adapt an adaptive empirical Bayesian method with Stochastic gradient Langevin dynamics (SGLD) which gives a computationally tractable way of Bayesian posterior sampling within our framework. Bayesian SINDy autoencoder achieves better physics discovery with lower data and fewer training epochs, along with valid uncertainty quantification suggested by the experimental studies. The Bayesian SINDy autoencoder can be applied to real video data, with accurate physics discovery which correctly identifies the governing equation and provides a close estimate for standard physics constants like gravity $g$, for example, in videos of a pendulum.

Submitted 18 November, 2022; originally announced November 2022.

Comments: 28 pages, 11 figures

arXiv:2211.03466 [pdf, other]

Using Deep Mixture-of-Experts to Detect Word Meaning Shift for TempoWiC

Authors: Ze Chen, Kangxu Wang, Zijian Cai, Jiewen Zheng, Jiarong He, Max Gao, Jason Zhang

Abstract: This paper mainly describes the dma submission to the TempoWiC task, which achieves a macro-F1 score of 77.05% and attains the first place in this task. We first explore the impact of different pre-trained language models. Then we adopt data cleaning, data augmentation, and adversarial training strategies to enhance the model generalization and robustness. For further improvement, we integrate POS information and word semantic representation using a Mixture-of-Experts (MoE) approach. The experimental results show that MoE can overcome the feature overuse issue and combine the context, POS, and word semantic features well. Additionally, we use a model ensemble method for the final prediction, which has been proven effective by many research works.

Submitted 7 November, 2022; originally announced November 2022.

arXiv:2211.01680 [pdf, other]

doi 10.1103/PhysRevLett.131.136901

Electrically controlling vortices in a neutral exciton polariton condensate at room temperature

Authors: Xiaokun Zhai, Xuekai Ma, Ying Gao, Chunzi Xing, Meini Gao, Haitao Dai, Xiao Wang, Anlian Pan, Stefan Schumacher, Tingge Gao

Abstract: Manipulating bosonic condensates with electric fields is very challenging as the electric fields do not directly interact with the neutral particles of the condensate. Here we demonstrate a simple electric method to tune the vorticity of exciton polariton condensates in a strong coupling liquid crystal (LC) microcavity with CsPbBr$_3$ microplates as active material at room temperature. In such a microcavity, the LC molecular director can be electrically modulated giving control over the polariton condensation in different modes. For isotropic non-resonant optical pumping we demonstrate the spontaneous formation of vortices with topological charges of +1, +2, -2, and -1. The topological vortex charge is controlled by a voltage in the range of 1 to 10 V applied to the microcavity sample. This control is achieved by the interplay of a built-in potential gradient, the anisotropy of the optically active perovskite microplates, and the electrically controllable LC molecular director in our system with intentionally broken rotational symmetry. Besides the fundamental interest in the achieved electric polariton vortex control at room temperature, our work paves the way to micron-sized emitters with electric control over the emitted light's phase profile and quantized orbital angular momentum for information processing and integration into photonic circuits.

Submitted 28 September, 2023; v1 submitted 3 November, 2022; originally announced November 2022.

Journal ref: Phys. Rev. Lett. 131, 136901 (2023)

arXiv:2211.01094 [pdf, other]

doi 10.1007/JHEP05(2023)003

Simultaneous CTEQ-TEA extraction of PDFs and SMEFT parameters from jet and $t{\bar t}$ data

Authors: Jun Gao, MeiSen Gao, T. J. Hobbs, DianYu Liu, XiaoMin Shen

Abstract: Recasting phenomenological Lagrangians in terms of SM effective field theory (SMEFT) provides a valuable means of connecting potential BSM physics at momenta well above the electroweak scale to experimental signatures at lower energies. In this work we jointly fit the Wilson coefficients of SMEFT operators as well as the PDFs in an extension of the CT18 global analysis framework, obtaining self-consistent constraints to possible BSM physics effects. Global fits are boosted with machine-learning techniques in the form of neural networks to ensure efficient scans of the full PDF+SMEFT parameter space. We focus on several operators relevant for top-quark pair and jet production at hadron colliders and obtain constraints on the Wilson coefficients with Lagrange Multiplier scans. We find mild correlations between the extracted Wilson coefficients, PDFs, and other QCD parameters, and see indications that these correlations may become more prominent in future analyses based on data of higher precision. This work serves as a new platform for joint analyses of SM and BSM physics based on the CTEQ-TEA framework.

Submitted 3 May, 2023; v1 submitted 2 November, 2022; originally announced November 2022.

Comments: 42 pages, 19 figures

arXiv:2210.15255 [pdf, other]

RePAST: A ReRAM-based PIM Accelerator for Second-order Training of DNN

Authors: Yilong Zhao, Li Jiang, Mingyu Gao, Naifeng Jing, Chengyang Gu, Qidong Tang, Fangxin Liu, Tao Yang, Xiaoyao Liang

Abstract: The second-order training methods can converge much faster than first-order optimizers in DNN training. This is because the second-order training utilizes the inversion of the second-order information (SOI) matrix to find a more accurate descent direction and step size. However, the huge SOI matrices bring significant computational and memory overheads in the traditional architectures like GPU and CPU. On the other side, the ReRAM-based process-in-memory (PIM) technology is suitable for the second-order training because of the following three reasons: First, PIM's computation happens in memory, which reduces data movement overheads; Second, ReRAM crossbars can compute SOI's inversion in $O\left(1\right)$ time; Third, if architected properly, ReRAM crossbars can perform matrix inversion and vector-matrix multiplications which are important to the second-order training algorithms. Nevertheless, current ReRAM-based PIM techniques still face a key challenge for accelerating the second-order training. The existing ReRAM-based matrix inversion circuitry can only support 8-bit accuracy matrix inversion and the computational precision is not sufficient for the second-order training that needs at least 16-bit accurate matrix inversion. In this work, we propose a method to achieve high-precision matrix inversion based on a proven 8-bit matrix inversion (INV) circuitry and vector-matrix multiplication (VMM) circuitry. We design RePAST, a ReRAM-based PIM accelerator architecture for the second-order training. Moreover, we propose a software mapping scheme for RePAST to further optimize the performance by fusing VMM and INV crossbar. Experiment shows that RePAST can achieve an average of 115.8$\times$/11.4$\times$ speedup and 41.9$\times$/12.8$\times$ energy saving compared to a GPU counterpart and PipeLayer on large-scale DNNs.

Submitted 27 October, 2022; originally announced October 2022.

Comments: 13 pages, 13 figures

arXiv:2210.13990 [pdf, other]

OSS Mentor: A framework for improving developers contributions via deep reinforcement learning

Authors: Jiakuan Fan, Haoyue Wang, Wei Wang, Ming Gao, Shengyu Zhao

Abstract: In open source project governance, there has been a lot of concern about how to measure developers' contributions. However, extremely sparse work has focused on enabling developers to improve their contributions, while it is significant and valuable. In this paper, we introduce a deep reinforcement learning framework named Open Source Software (OSS) Mentor, which can be trained from empirical knowledge and then adaptively help developers improve their contributions. Extensive experiments demonstrate that OSS Mentor significantly outperforms excellent experimental results. Moreover, it is the first time that the presented framework explores deep reinforcement learning techniques to manage open source software, which enables us to design a more robust framework to improve developers' contributions.

Submitted 24 October, 2022; originally announced October 2022.

arXiv:2210.13708 [pdf, other]

MARLlib: A Scalable and Efficient Multi-agent Reinforcement Learning Library

Authors: Siyi Hu, Yifan Zhong, Minquan Gao, Weixun Wang, Hao Dong, Xiaodan Liang, Zhihui Li, Xiaojun Chang, Yaodong Yang

Abstract: A significant challenge facing researchers in the area of multi-agent reinforcement learning (MARL) pertains to the identification of a library that can offer fast and compatible development for multi-agent tasks and algorithm combinations, while obviating the need to consider compatibility issues. In this paper, we present MARLlib, a library designed to address the aforementioned challenge by leveraging three key mechanisms: 1) a standardized multi-agent environment wrapper, 2) an agent-level algorithm implementation, and 3) a flexible policy mapping strategy. By utilizing these mechanisms, MARLlib can effectively disentangle the intertwined nature of the multi-agent task and the learning process of the algorithm, with the ability to automatically alter the training strategy based on the current task's attributes. The MARLlib library's source code is publicly accessible on GitHub: https://github.com/Replicable-MARL/MARLlib.

Submitted 6 November, 2023; v1 submitted 10 October, 2022; originally announced October 2022.

arXiv:2210.10321 [pdf, other]

Efficient Bi-Level Optimization for Recommendation Denoising

Authors: Zongwei Wang, Min Gao, Wentao Li, Junliang Yu, Linxin Guo, Hongzhi Yin

Abstract: The acquisition of explicit user feedback (e.g., ratings) in real-world recommender systems is often hindered by the need for active user involvement. To mitigate this issue, implicit feedback (e.g., clicks) generated during user browsing is exploited as a viable substitute. However, implicit feedback possesses a high degree of noise, which significantly undermines recommendation quality. While many methods have been proposed to address this issue by assigning varying weights to implicit feedback, two shortcomings persist: (1) the weight calculation in these methods is iteration-independent, without considering the influence of weights in previous iterations, and (2) the weight calculation often relies on prior knowledge, which may not always be readily available or universally applicable. To overcome these two limitations, we model recommendation denoising as a bi-level optimization problem. The inner optimization aims to derive an effective model for the recommendation, as well as guiding the weight determination, thereby eliminating the need for prior knowledge. The outer optimization leverages gradients of the inner optimization and adjusts the weights in a manner considering the impact of previous weights. To efficiently solve this bi-level optimization problem, we employ a weight generator to avoid the storage of weights and a one-step gradient-matching-based loss to significantly reduce computational time. The experimental results on three benchmark datasets demonstrate that our proposed approach outperforms both state-of-the-art general and denoising recommendation models. The code is available at https://github.com/CoderWZW/BOD.

Submitted 1 June, 2023; v1 submitted 19 October, 2022; originally announced October 2022.

Comments: 11 pages, 5 figures, 6 tables

arXiv:2210.09049 [pdf, other]

SpanProto: A Two-stage Span-based Prototypical Network for Few-shot Named Entity Recognition

Authors: Jianing Wang, Chengcheng Han, Chengyu Wang, Chuanqi Tan, Minghui Qiu, Songfang Huang, Jun Huang, Ming Gao

Abstract: Few-shot Named Entity Recognition (NER) aims to identify named entities with very little annotated data. Previous methods solve this problem based on token-wise classification, which ignores the information of entity boundaries, and inevitably the performance is affected by the massive non-entity tokens. To this end, we propose a seminal span-based prototypical network (SpanProto) that tackles few-shot NER via a two-stage approach, including span extraction and mention classification. In the span extraction stage, we transform the sequential tags into a global boundary matrix, enabling the model to focus on the explicit boundary information. For mention classification, we leverage prototypical learning to capture the semantic representations for each labeled span and make the model better adapt to novel-class entities. To further improve the model performance, we split out the false positives generated by the span extractor but not labeled in the current episode set, and then present a margin-based loss to separate them from each prototype region. Experiments over multiple benchmarks demonstrate that our model outperforms strong baselines by a large margin.

Submitted 21 November, 2022; v1 submitted 17 October, 2022; originally announced October 2022.

arXiv:2210.08859 [pdf, other]

Social Biases in Automatic Evaluation Metrics for NLG

Authors: Mingqi Gao, Xiaojun Wan

Abstract: Many studies have revealed that word embeddings, language models, and models for specific downstream tasks in NLP are prone to social biases, especially gender bias. Recently these techniques have been gradually applied to automatic evaluation metrics for text generation. In the paper, we propose an evaluation method based on Word Embeddings Association Test (WEAT) and Sentence Embeddings Association Test (SEAT) to quantify social biases in evaluation metrics and discover that social biases are also widely present in some model-based automatic evaluation metrics. Moreover, we construct gender-swapped meta-evaluation datasets to explore the potential impact of gender bias in image caption and text summarization tasks. Results show that given gender-neutral references in the evaluation, model-based evaluation metrics may show a preference for the male hypothesis, and the performance of them, i.e. the correlation between evaluation metrics and human judgments, usually has more significant variation after gender swapping.

Submitted 17 October, 2022; originally announced October 2022.

arXiv:2210.08536 [pdf, other]

Knowledge Prompting in Pre-trained Language Model for Natural Language Understanding

Authors: Jianing Wang, Wenkang Huang, Qiuhui Shi, Hongbin Wang, Minghui Qiu, Xiang Li, Ming Gao

Abstract: Knowledge-enhanced Pre-trained Language Model (PLM) has recently received significant attention, which aims to incorporate factual knowledge into PLMs. However, most existing methods modify the internal structures of fixed types of PLMs by stacking complicated modules, and introduce redundant and irrelevant factual knowledge from knowledge bases (KBs). In this paper, to address these problems, we introduce a seminal knowledge prompting paradigm and further propose a knowledge-prompting-based PLM framework KP-PLM. This framework can be flexibly combined with existing mainstream PLMs. Specifically, we first construct a knowledge sub-graph from KBs for each context. Then we design multiple continuous prompts rules and transform the knowledge sub-graph into natural language prompts. To further leverage the factual knowledge from these prompts, we propose two novel knowledge-aware self-supervised tasks including prompt relevance inspection and masked prompt modeling. Extensive experiments on multiple natural language understanding (NLU) tasks show the superiority of KP-PLM over other state-of-the-art methods in both full-resource and low-resource settings.

Submitted 16 October, 2022; originally announced October 2022.

Comments: 14 pages, 5 figures. This paper has been accepted for the main conference of EMNLP2022 (long paper)

arXiv:2210.04633

CAT-probing: A Metric-based Approach to Interpret How Pre-trained Models for Programming Language Attend Code Structure

Authors: Nuo Chen, Qiushi Sun, Renyu Zhu, Xiang Li, Xuesong Lu, Ming Gao

Abstract: Code pre-trained models (CodePTMs) have recently demonstrated significant success in code intelligence. To interpret these models, some probing methods have been applied. However, these methods fail to consider the inherent characteristics of codes. In this paper, to address the problem, we propose a novel probing method CAT-probing to quantitatively interpret how CodePTMs attend code structure. We first denoise the input code sequences based on the token types pre-defined by the compilers to filter those tokens whose attention scores are too small. After that, we define a new metric CAT-score to measure the commonality between the token-level attention scores generated in CodePTMs and the pair-wise distances between corresponding AST nodes. The higher the CAT-score, the stronger the ability of CodePTMs to capture code structure. We conduct extensive experiments to integrate CAT-probing with representative CodePTMs for different programming languages. Experimental results show the effectiveness of CAT-probing in CodePTM interpretation. Our codes and data are publicly available at https://github.com/nchen909/CodeAttention.

Submitted 10 December, 2022; v1 submitted 7 October, 2022; originally announced October 2022.

Comments: Accepted by EMNLP 2022

arXiv:2210.01264

doi: 10.1364/OE.470143

Flattening laser frequency comb spectra with a high dynamic range, broadband spectral shaper on-a-chip

Authors: Nemanja Jovanovic, Pradip Gatkine, Boqiang Shen, Maodong Gao, Nick Cvetojevic, Katarzyna Lawniczuk, Ronald Broeke, Charles Beichman, Stephanie Leifer, Jeffery Jewell, Gautam Vasisht, Dimitri Mawet

Abstract: Spectral shaping is critical to many fields of science. In astronomy for example, the detection of exoplanets via the Doppler effect hinges on the ability to calibrate a high resolution spectrograph. Laser frequency combs can be used for this, but the wildly varying intensity across the spectrum can make it impossible to optimally utilize the entire comb, leading to a reduced overall precision of calibration. To circumvent this, astronomical applications of laser frequency combs rely on a bulk optic setup which can flatten the output spectrum before sending it to the spectrograph. Such flatteners require complex and expensive optical elements like spatial light modulators and have non-negligible bench top footprints. Here we present an alternative in the form of an all-photonic spectral shaper that can be used to flatten the spectrum of a laser frequency comb. The device consists of a circuit etched into a silicon nitride wafer that supports an arrayed-waveguide grating to disperse the light over hundreds of nanometers in wavelength, followed by Mach-Zehnder interferometers to control the amplitude of each channel, thermo-optic phase modulators to phase the channels and a second arrayed-waveguide grating to recombine the spectrum. The demonstrator device operates from 1400 to 1800 nm (covering the astronomical H band), with twenty 20 nm wide channels. The device allows for nearly 40 dBs of dynamic modulation of the spectrum via the Mach-Zehnders, which is greater than that offered by most spatial light modulators. With a superluminescent diode, we reduced the static spectral variation to ~3 dB, limited by the properties of the components used in the circuit and on a laser frequency comb we managed to reduce the modulation to 5 dBs, sufficient for astronomical applications.

Submitted 3 October, 2022; originally announced October 2022.

Comments: 15 pages, 10 figures. arXiv admin note: substantial text overlap with arXiv:2209.09455

Journal ref: Opt. Express 30, 36745-36760 (2022)

Showing 151–200 of 514 results for author: Gao, M