Search | arXiv e-print repository

Few-shot Joint Multimodal Aspect-Sentiment Analysis Based on Generative Multimodal Prompt

Authors: Xiaocui Yang, Shi Feng, Daling Wang, Sun Qi, Wenfang Wu, Yifei Zhang, Pengfei Hong, Soujanya Poria

Abstract: We have witnessed the rapid proliferation of multimodal data on numerous social media platforms. Conventional studies typically require massive labeled data to train models for Multimodal Aspect-Based Sentiment Analysis (MABSA). However, collecting and annotating fine-grained multimodal data for MABSA is tough. To alleviate the above issue, we perform three MABSA-related tasks with quite a small number of labeled multimodal samples. We first build diverse and comprehensive multimodal few-shot datasets according to the data distribution. To capture the specific prompt for each aspect term in a few-shot scenario, we propose a novel Generative Multimodal Prompt (GMP) model for MABSA, which includes the Multimodal Encoder module and the N-Stream Decoders module. We further introduce a subtask to predict the number of aspect terms in each instance to construct the multimodal prompt. Extensive experiments on two datasets demonstrate that our approach outperforms strong baselines on two MABSA-related tasks in the few-shot setting.

Submitted 18 May, 2023; v1 submitted 17 May, 2023; originally announced May 2023.

Comments: 13 pages, 7 figures, 6 tables, ACL 2023 Long Paper (Findings)

arXiv:2305.05503 [pdf, other]

BadCS: A Backdoor Attack Framework for Code search

Authors: Shiyi Qi, Yuanhang Yang, Shuzheng Gao, Cuiyun Gao, Zenglin Xu

Abstract: With the development of deep learning (DL), DL-based code search models have achieved state-of-the-art performance and have been widely used by developers during software development. However, the security issue, e.g., recommending vulnerable code, has not received sufficient attention, which will bring potential harm to software development. Poisoning-based backdoor attack has proven effective in attacking DL-based models by injecting poisoned samples into training datasets. However, previous work shows that the attack technique does not perform successfully on all DL-based code search models and tends to fail for Transformer-based models, especially pretrained models. Besides, the infected models generally perform worse than benign models, which makes the attack not stealthy enough and thereby hinders the adoption by developers. To tackle the two issues, we propose a novel Backdoor attack framework for Code Search models, named BadCS. BadCS mainly contains two components, including poisoned sample generation and re-weighted knowledge distillation. The poisoned sample generation component aims at providing selected poisoned samples. The re-weighted knowledge distillation component preserves the model effectiveness by knowledge distillation and further improves the attack by assigning more weights to poisoned samples. Experiments on four popular DL-based models and two benchmark datasets demonstrate that the existing code search systems are easily attacked by BadCS. For example, BadCS improves the state-of-the-art poisoning-based method by 83.03%-99.98% and 75.98%-99.90% on Python and Java datasets, respectively. Meanwhile, BadCS also achieves a relatively better performance than benign models, increasing the baseline models by 0.49% and 0.46% on average, respectively.

Submitted 9 May, 2023; originally announced May 2023.

arXiv:2305.03899 [pdf, other]

NL-CS Net: Deep Learning with Non-Local Prior for Image Compressive Sensing

Authors: Shuai Bian, Shouliang Qi, Chen Li, Yudong Yao, Yueyang Teng

Abstract: Deep learning has been applied to compressive sensing (CS) of images successfully in recent years. However, existing network-based methods are often trained as a black box, in which the lack of prior knowledge is often the bottleneck for further performance improvement. To overcome this drawback, this paper proposes a novel CS method using a non-local prior, called NL-CS Net, which combines the interpretability of traditional optimization methods with the speed of network-based methods. We unroll each phase of the iteration of the augmented Lagrangian method, which solves a non-local and sparse regularized optimization problem, into a network. NL-CS Net is composed of the up-sampling module and the recovery module. In the up-sampling module, we use a learnable up-sampling matrix instead of a predefined one. In the recovery module, a patch-wise non-local network is employed to capture long-range feature correspondences. Important parameters involved (e.g., sampling matrix, nonlinear transforms, shrinkage thresholds, step size, etc.) are learned end-to-end, rather than hand-crafted. Furthermore, to facilitate practical implementation, orthogonal and binary constraints on the sampling matrix are simultaneously adopted. Extensive experiments on natural images and magnetic resonance imaging (MRI) demonstrate that the proposed method outperforms the state-of-the-art methods while maintaining great interpretability and speed.

Submitted 5 May, 2023; originally announced May 2023.

Comments: 21 pages, 6 figures

ACM Class: I.4.7

arXiv:2304.00275 [pdf, ps, other]

Automated Formation Control Synthesis from Temporal Logic Specifications

Authors: Shuhao Qi, Zengjie Zhang, Sofie Haesaert, Zhiyong Sun

Abstract: In many practical scenarios, multi-robot systems are envisioned to support humans in executing complicated tasks within structured environments, such as search-and-rescue tasks. We propose a framework for a multi-robot swarm to fulfill complex tasks represented by temporal logic specifications. Given temporal logic specifications on the swarm formation and navigation, we develop a controller with runtime safety and convergence guarantees that drives the swarm to formally satisfy the specification. In addition, the synthesized controller will autonomously switch formations as necessary and react to uncontrollable events from the environment. The efficacy of the proposed framework is validated with a simulation study on the navigation of multiple quadrotor robots.

Submitted 15 September, 2023; v1 submitted 1 April, 2023; originally announced April 2023.

arXiv:2303.09058 [pdf, other]

SVDE: Scalable Value-Decomposition Exploration for Cooperative Multi-Agent Reinforcement Learning

Authors: Shuhan Qi, Shuhao Zhang, Qiang Wang, Jiajia Zhang, Jing Xiao, Xuan Wang

Abstract: Value-decomposition methods, which reduce the difficulty of a multi-agent system by decomposing the joint state-action space into local observation-action spaces, have become popular in cooperative multi-agent reinforcement learning (MARL). However, value-decomposition methods still have the problems of tremendous sample consumption for training and lack of active exploration. In this paper, we propose a scalable value-decomposition exploration (SVDE) method, which includes a scalable training mechanism, intrinsic reward design, and explorative experience replay. The scalable training mechanism asynchronously decouples strategy learning with environmental interaction, so as to accelerate sample generation in a MapReduce manner. For the problem of lack of exploration, an intrinsic reward design and explorative experience replay are proposed, so as to enhance exploration to produce diverse samples and filter non-novel samples, respectively. Empirically, our method achieves the best performance on almost all maps compared to other popular algorithms in a set of StarCraft II micromanagement games. A data-efficiency experiment also shows the acceleration of SVDE for sample collection and policy convergence, and we demonstrate the effectiveness of factors in SVDE through a set of ablation experiments.

Submitted 15 March, 2023; originally announced March 2023.

Comments: 13 pages, 9 figures

arXiv:2303.06169 [pdf]

MOELA: A Multi-Objective Evolutionary/Learning Design Space Exploration Framework for 3D Heterogeneous Manycore Platforms

Authors: Sirui Qi, Yingheng Li, Sudeep Pasricha, Ryan Gary Kim

Abstract: To enable emerging applications such as deep machine learning and graph processing, 3D network-on-chip (NoC) enabled heterogeneous manycore platforms that can integrate many processing elements (PEs) are needed. However, designing such complex systems with multiple objectives can be challenging due to the huge associated design space and long evaluation times. To optimize such systems, we propose a new multi-objective design space exploration framework called MOELA that combines the benefits of evolutionary-based search with a learning-based local search to quickly determine PE and communication link placement to optimize multiple objectives (e.g., latency, throughput, and energy) in 3D NoC enabled heterogeneous manycore systems. Compared to state-of-the-art approaches, MOELA increases the speed of finding solutions by up to 128x, leads to a better Pareto Hypervolume (PHV) by up to 12.14x and improves energy-delay-product (EDP) by up to 7.7% in a 5-objective scenario.

Submitted 10 March, 2023; originally announced March 2023.

arXiv:2303.06083 [pdf, other]

doi 10.3847/1538-3881/acc389

Searching for Compact Object Candidates from LAMOST Time-Domain Survey of Four K2 Plates

Authors: Senyu Qi, Wei-Min Gu, Tuan Yi, Zhi-Xiang Zhang, Song Wang, Jifeng Liu

Abstract: The time-domain (TD) surveys of the Large Sky Area Multi-Object Fiber Spectroscopic Telescope (LAMOST) yield high-cadence radial velocities, paving a new avenue to study binary systems including compact objects. In this work, we explore LAMOST TD spectroscopic data of four K2 plates and present a sample of six single-lined spectroscopic binaries that may contain compact objects. We conduct analyses using phase-resolved radial velocity measurements of the visible star, to characterize each source and to infer the properties of the invisible companion. By fitting the radial velocity curves for the six targets, we obtain accurate orbital periods, ranging from $\sim$ (0.6-6) days, and radial velocity semi-amplitudes, ranging from $\sim$ (50-130) km s$^{-1}$. We calculate the mass function of the unseen companions to be between 0.08 and 0.17 $M_{\odot}$. Based on the mass function and the estimated stellar parameters of the visible star, we determine the minimum mass of the hidden star. Three targets, J034813, J063350, and J064850, show ellipsoidal variability in the light curves from K2, ZTF, and TESS surveys. Therefore, we can put constraints on the mass of the invisible star using the ellipsoidal variability. We identify no X-ray counterparts for these targets except for J085120, of which the X-ray emission can be ascribed to stellar activity. We note that the nature of these six candidates is worth further characterization utilizing multi-wavelength follow-up observations.

Submitted 10 March, 2023; originally announced March 2023.

Comments: 13 pages, 6 figures, accepted for publication in The Astronomical Journal
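
The abstract above goes from a fitted orbital period and radial-velocity semi-amplitude to a mass function, and then to a minimum companion mass. A minimal sketch of that arithmetic follows, assuming the standard single-lined binary relations f(M) = P K^3 (1 - e^2)^(3/2) / (2 pi G) and f(M) = M2^3 sin^3(i) / (M1 + M2)^2. The function names, the physical constants, and the input values are illustrative assumptions and are not taken from the paper.

```python
import math

# Physical constants (assumed reference values, SI units).
G = 6.67430e-11        # gravitational constant, m^3 kg^-1 s^-2
M_SUN = 1.98847e30     # solar mass, kg
DAY = 86400.0          # seconds per day


def mass_function(period_days, k_km_s, ecc=0.0):
    """Binary mass function f(M) in solar masses.

    period_days: orbital period in days
    k_km_s:      radial-velocity semi-amplitude of the visible star in km/s
    ecc:         orbital eccentricity (0 for a circular orbit)
    """
    period_s = period_days * DAY
    k_m_s = k_km_s * 1.0e3
    f_kg = period_s * k_m_s ** 3 * (1.0 - ecc ** 2) ** 1.5 / (2.0 * math.pi * G)
    return f_kg / M_SUN


def min_companion_mass(f_msun, m1_msun):
    """Solve M2^3 / (M1 + M2)^2 = f for M2, i.e. the edge-on case sin(i) = 1.

    The left-hand side increases monotonically with M2, so bisection suffices.
    Requires m1_msun > 0. The bracket upper bound 4 * (f + M1) always
    overshoots the root.
    """
    lo, hi = 0.0, 4.0 * (f_msun + m1_msun)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mid ** 3 / (m1_msun + mid) ** 2 < f_msun:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Hypothetical inputs chosen inside the ranges quoted in the abstract.
    f = mass_function(period_days=1.0, k_km_s=100.0)
    print(f"f(M) = {f:.4f} M_sun")
    print(f"minimum M2 for M1 = 1 M_sun: {min_companion_mass(f, 1.0):.3f} M_sun")
```

Because sin(i) is at most 1, the value returned by `min_companion_mass` is a lower bound on the mass of the unseen object. Any inclination below 90 degrees implies a heavier companion.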

arXiv:2303.04404 [pdf, other]

MiddleNet: A Unified, High-Performance NFV and Middlebox Framework with eBPF and DPDK

Authors: Shixiong Qi, Ziteng Zeng, Leslie Monis, K. K. Ramakrishnan

Abstract: Traditional network resident functions (e.g., firewalls, network address translation) and middleboxes (caches, load balancers) have moved from purpose-built appliances to software-based components. However, L2/L3 network functions (NFs) are being implemented on Network Function Virtualization (NFV) platforms that extensively exploit kernel-bypass technology. They often use DPDK for zero-copy delivery and high performance. On the other hand, L4/L7 middleboxes, which have a greater emphasis on functionality, take advantage of a full-fledged kernel-based system. L2/L3 NFs and L4/L7 middleboxes continue to be handled by distinct platforms on different nodes. This paper proposes MiddleNet that develops a unified network resident function framework that supports L2/L3 NFs and L4/L7 middleboxes. MiddleNet supports function chains that are essential in both NFV and middlebox environments. MiddleNet uses the Data Plane Development Kit (DPDK) library for zero-copy packet delivery without interrupt-based processing, to enable the "bump-in-the-wire" L2/L3 processing performance required of NFV. To support L4/L7 middlebox functionality, MiddleNet utilizes a consolidated, kernel-based protocol stack for processing, avoiding a dedicated protocol stack for each function. MiddleNet fully exploits the event-driven capabilities of the extended Berkeley Packet Filter (eBPF) and seamlessly integrates it with shared memory for high-performance communication in L4/L7 middlebox function chains. The overheads for MiddleNet in L4/L7 are strictly load-proportional, without needing the dedicated CPU cores of DPDK-based approaches. MiddleNet supports flow-dependent packet processing by leveraging Single Root I/O Virtualization (SR-IOV) to dynamically select the packet processing needed (Layers 2 - 7). Our experimental results show that MiddleNet achieves high performance in such a unified environment.

Submitted 30 March, 2023; v1 submitted 8 March, 2023; originally announced March 2023.

arXiv:2303.00369 [pdf, other]

Indescribable Multi-modal Spatial Evaluator

Authors: Lingke Kong, X. Sharon Qi, Qi** Shen, Jiacheng Wang, Jingyi Zhang, Yanle Hu, Qichao Zhou

Abstract: Multi-modal image registration spatially aligns two images with different distributions. One of its major challenges is that images acquired from different imaging machines have different imaging distributions, making it difficult to focus only on the spatial aspect of the images and ignore differences in distributions. In this study, we developed a self-supervised approach, Indescribable Multi-modal Spatial Evaluator (IMSE), to address multi-modal image registration. IMSE creates an accurate multi-modal spatial evaluator to measure spatial differences between two images, and then optimizes registration by minimizing the error predicted by the evaluator. To optimize IMSE performance, we also proposed a new style enhancement method called Shuffle Remap, which randomizes the image distribution into multiple segments, and then randomly disorders and remaps these segments, so that the distribution of the original image is changed. Shuffle Remap can help IMSE to predict the difference in spatial location from unseen target distributions. Our results show that IMSE outperformed the existing methods for registration using T1-T2 and CT-MRI datasets. IMSE can also be easily integrated into the traditional registration process, and can provide a convenient way to evaluate and visualize registration results. IMSE also has the potential to be used as a new paradigm for image-to-image translation. Our code is available at https://github.com/Kid-Liet/IMSE.

Submitted 1 March, 2023; v1 submitted 1 March, 2023; originally announced March 2023.

Comments: Accepted by CVPR2023

arXiv:2302.11851 [pdf]

Theoretical Evaluation of the Capacity-Achieving Distribution for IM-DD Fiber-Optic Channels

Authors: Dongdong Zou, Wei Wang, Sui Qi, Fan Li, Zhaohui Li

Abstract: The capacity and capacity-achieving distribution for intensity-modulation and direct-detection (IM-DD) fiber-optic channels are theoretically investigated. Different from coherent fiber-optic channels, we indicate that the capacity-achieving distribution of IM-DD systems should be discussed separately in two cases: 1) IM-DD systems without optical amplifier, which are constrained in peak power; 2) IM-DD systems with optical amplifier, which are subject to an average power constraint (APC). For the two models, the maximum mutual information achieving distribution, instead of the maximum input entropy achieving distribution, is numerically computed by the iterative Blahut-Arimoto (BA) algorithm. For the IM-DD system under peak power constraint (PPC), a dynamic-assignment BA algorithm is applied to find the capacity-achieving distribution with minimum cardinality. It is observed that the maximum difference between the minimum input cardinality and capacity is around 0.8 bits. For a fixed support input cardinality, although the observed shaping gain is small and only appears in low peak-signal-to-noise ratio (PSNR) regions in the PPC IM-DD system, the probabilistic shaping technique can also be used to introduce rate adaptation to the system by adjusting the shaping and FEC overheads since the capacity-achieving distribution is symmetric. In the IM-DD system under APC, a modified BA algorithm is investigated to solve for the capacity and capacity-achieving distribution, and a significant shaping gain is observed. For PAM8 and PAM16 modulation formats, 0.294 bits/symbol and 0.531 bits/symbol shaping gain can be obtained at the SNR of 20 dB. Furthermore, since the capacity-achieving distribution is asymmetric in this case, a practical discussion of the PS technique is also presented.

Submitted 23 February, 2023; originally announced February 2023.
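
The abstract above relies on the iterative Blahut-Arimoto (BA) algorithm to find the input distribution that maximizes mutual information. As a rough illustration, the sketch below implements the classic unconstrained BA iteration for a discrete memoryless channel. It is not the dynamic-assignment or modified BA variant described in the paper, and it ignores the peak and average power constraints. The function name, the transition-matrix layout, and the example channels are assumptions made for illustration.

```python
import math


def blahut_arimoto(channel, tol=1e-9, max_iter=10000):
    """Classic Blahut-Arimoto iteration for a discrete memoryless channel.

    channel[x][y] is the transition probability P(y | x); each row sums to 1.
    Returns (capacity_in_bits, capacity_achieving_input_distribution).
    """
    n_in, n_out = len(channel), len(channel[0])
    p = [1.0 / n_in] * n_in          # start from the uniform input distribution
    lower = 0.0
    for _ in range(max_iter):
        # Output distribution induced by the current input distribution.
        q = [sum(p[x] * channel[x][y] for x in range(n_in)) for y in range(n_out)]
        # d[x] = KL divergence (in nats) between P(. | x) and q.
        d = [
            sum(w * math.log(w / q[y]) for y, w in enumerate(channel[x]) if w > 0.0)
            for x in range(n_in)
        ]
        z = sum(p[x] * math.exp(d[x]) for x in range(n_in))
        # Capacity is sandwiched between log(z) and max(d).
        lower, upper = math.log(z), max(d)
        # Multiplicative update of the input distribution.
        p = [p[x] * math.exp(d[x]) / z for x in range(n_in)]
        if upper - lower < tol:
            break
    return lower / math.log(2.0), p


if __name__ == "__main__":
    # Binary symmetric channel with crossover probability 0.1 (illustrative).
    bsc = [[0.9, 0.1], [0.1, 0.9]]
    capacity, dist = blahut_arimoto(bsc)
    print(f"BSC capacity: {capacity:.4f} bits, input distribution: {dist}")

    # Z-channel: input 0 is noiseless, input 1 flips with probability 0.5.
    z_channel = [[1.0, 0.0], [0.5, 0.5]]
    capacity, dist = blahut_arimoto(z_channel)
    print(f"Z-channel capacity: {capacity:.4f} bits, input distribution: {dist}")
```

The Z-channel case shows why the mutual-information maximizer differs from the maximum-entropy input: the optimal distribution there is non-uniform, which is the same distinction the abstract draws for IM-DD systems.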

arXiv:2301.07409 [pdf, other]

doi 10.1109/TPAMI.2024.3386985

Representing Noisy Image Without Denoising

Authors: Shuren Qi, Yushu Zhang, Chao Wang, Tao Xiang, Xiaochun Cao, Yong Xiang

Abstract: A long-standing topic in artificial intelligence is the effective recognition of patterns from noisy images. In this regard, the recent data-driven paradigm considers 1) improving the representation robustness by adding noisy samples in the training phase (i.e., data augmentation) or 2) pre-processing the noisy image by learning to solve the inverse problem (i.e., image denoising). However, such methods generally exhibit inefficient processing and unstable results, limiting their practical applications. In this paper, we explore a non-learning paradigm that aims to derive robust representation directly from noisy images, without denoising as pre-processing. Here, the noise-robust representation is designed as Fractional-order Moments in Radon space (FMR), which also has the beneficial properties of orthogonality and rotation invariance. Unlike earlier integer-order methods, our work is a more generic design taking such classical methods as special cases, and the introduced fractional-order parameter offers time-frequency analysis capability that is not available in classical methods. Formally, both implicit and explicit paths for constructing the FMR are discussed in detail. Extensive simulation experiments and an image security application are provided to demonstrate the uniqueness and usefulness of our FMR, especially for noise robustness, rotation invariance, and time-frequency discriminability.

Submitted 19 June, 2024; v1 submitted 18 January, 2023; originally announced January 2023.

Comments: Accepted by IEEE Transactions on Pattern Analysis and Machine Intelligence, 2024

arXiv:2301.02423 [pdf, other]

Learning Personalized Brain Functional Connectivity of MDD Patients from Multiple Sites via Federated Bayesian Networks

Authors: Shuai Liu, Xiao Guo, Shun Qi, Huaning Wang, Xiangyu Chang

Abstract: Identifying functional connectivity biomarkers of major depressive disorder (MDD) patients is essential to advance understanding of the disorder mechanisms and early intervention. However, due to the small sample size and the high dimension of available neuroimaging data, the performance of existing methods is often limited. Multi-site data could enhance the statistical power and sample size, while they are often subject to inter-site heterogeneity and data-sharing policies. In this paper, we propose a federated joint estimator, NOTEARS-PFL, for simultaneous learning of multiple Bayesian networks (BNs) with continuous optimization, to identify disease-induced alterations in MDD patients. We incorporate information shared between sites and site-specific information into the proposed federated learning framework to learn personalized BN structures by introducing the group fused lasso penalty. We develop the alternating direction method of multipliers, where in the local update step, the neuroimaging data is processed at each local site. Then the learned network structures are transmitted to the center for the global update. In particular, we derive a closed-form expression for the local update step and use the iterative proximal projection method to deal with the group fused lasso penalty in the global update step. We evaluate the performance of the proposed method on both synthetic and real-world multi-site rs-fMRI datasets. The results suggest that the proposed NOTEARS-PFL is more effective and accurate than the comparable methods.

Submitted 6 January, 2023; originally announced January 2023.

arXiv:2212.14295 [pdf, ps, other]

doi 10.1103/PhysRevA.107.042412

Generating entangled states from coherent states in circuit-QED

Authors: Shi-fan Qi, Jun Jing

Abstract: Entangled states are self-evidently important to a wide range of applications in quantum communication and quantum information processing. We propose an efficient and convenient two-step protocol for generating Bell states and NOON states of two microwave resonators from merely coherent states. In particular, we derive an effective Hamiltonian for resonators coupled to a superconducting $Λ$-type qutrit in the dispersive regime. By the excitation-number-dependent Stark shifts of the qutrit transition frequencies, we are able to individually control the amplitudes of specified Fock states of the resonators associated with relevant qutrit transition, using carefully tailored microwave drive signals. Thereby an arbitrary bipartite entangled state in Fock space can be generated by a typical evolution-and-measurement procedure. We analyze the undesired state transitions and the robustness of our protocol against the systematic errors from the microwave driving intensity and frequency, the quantum decoherence of all components, and the crosstalk of two resonators. In addition, we demonstrate that our protocol can be extended to a similar scenario with a $Ξ$-type qutrit.

Submitted 10 April, 2023; v1 submitted 29 December, 2022; originally announced December 2022.

Comments: 13 pages, 11 figures, 1 table

Journal ref: Physical Review A 107, 042412 (2023)

arXiv:2212.07651 [pdf, other]

Two-stage Contextual Transformer-based Convolutional Neural Network for Airway Extraction from CT Images

Authors: Yanan Wu, Shuiqing Zhao, Shouliang Qi, Jie Feng, Haowen Pang, Runsheng Chang, Long Bai, Mengqi Li, Shuyue Xia, Wei Qian, Hongliang Ren

Abstract: Accurate airway extraction from computed tomography (CT) images is a critical step for planning navigation bronchoscopy and quantitative assessment of airway-related chronic obstructive pulmonary disease (COPD). Existing methods struggle to segment the airway sufficiently, especially the high-generation airway, under the constraint of limited labels, and cannot meet the needs of clinical use in COPD. We propose a novel two-stage 3D contextual transformer-based U-Net for airway segmentation using CT images. The method consists of two stages, performing initial and refined airway segmentation. The two-stage model shares the same subnetwork with different airway masks as input. A contextual transformer block is used in both the encoder and decoder paths of the subnetwork to achieve high-quality airway segmentation effectively. In the first stage, the total airway mask and CT images are provided to the subnetwork, and in the second stage, the intrapulmonary airway mask and corresponding CT scans are provided to the subnetwork. Then the predictions of the two stages are merged as the final prediction. Extensive experiments were performed on in-house and multiple public datasets. Quantitative and qualitative analyses demonstrate that our proposed method extracted many more branches and greater lengths of the tree while accomplishing state-of-the-art airway segmentation performance. The code is available at https://github.com/zhaozsq/airway_segmentation.

Submitted 15 December, 2022; originally announced December 2022.

arXiv:2212.05758 [pdf, other]

BEV-MAE: Bird's Eye View Masked Autoencoders for Point Cloud Pre-training in Autonomous Driving Scenarios

Authors: Zhiwei Lin, Yongtao Wang, Shengxiang Qi, Nan Dong, Ming-Hsuan Yang

Abstract: Existing LiDAR-based 3D object detection methods for autonomous driving scenarios mainly adopt the training-from-scratch paradigm. Unfortunately, this paradigm heavily relies on large-scale labeled data, whose collection can be expensive and time-consuming. Self-supervised pre-training is an effective and desirable way to alleviate this dependence on extensive annotated data. In this work, we present BEV-MAE, an efficient masked autoencoder pre-training framework for LiDAR-based 3D object detection in autonomous driving. Specifically, we propose a bird's eye view (BEV) guided masking strategy to guide the 3D encoder learning feature representation in a BEV perspective and avoid complex decoder design during pre-training. Furthermore, we introduce a learnable point token to maintain a consistent receptive field size of the 3D encoder with fine-tuning for masked point cloud inputs. Based on the property of outdoor point clouds in autonomous driving scenarios, i.e., the point clouds of distant objects are more sparse, we propose point density prediction to enable the 3D encoder to learn location information, which is essential for object detection. Experimental results show that BEV-MAE surpasses prior state-of-the-art self-supervised methods and achieves favorable pre-training efficiency. Furthermore, based on TransFusion-L, BEV-MAE achieves new state-of-the-art LiDAR-based 3D object detection results, with 73.6 NDS and 69.6 mAP on the nuScenes benchmark. The source code will be released at https://github.com/VDIGPKU/BEV-MAE

Submitted 20 January, 2024; v1 submitted 12 December, 2022; originally announced December 2022.

Comments: Accepted at AAAI 2024

arXiv:2212.00532 [pdf, other]

EBHI-Seg: A Novel Enteroscope Biopsy Histopathological Haematoxylin and Eosin Image Dataset for Image Segmentation Tasks

Authors: Liyu Shi, Xiaoyan Li, Weiming Hu, Haoyuan Chen, Jing Chen, Zizhen Fan, Minghe Gao, Yujie Jing, Guotao Lu, Deguo Ma, Zhiyu Ma, Qingtao Meng, Dechao Tang, Hongzan Sun, Marcin Grzegorzek, Shouliang Qi, Yueyang Teng, Chen Li

Abstract: Background and Purpose: Colorectal cancer is a common fatal malignancy, the fourth most common cancer in men, and the third most common cancer in women worldwide. Timely detection of cancer in its early stages is essential for treating the disease. Currently, there is a lack of datasets for histopathological image segmentation of rectal cancer, which often hampers the assessment accuracy when computer technology is used to aid in diagnosis. Methods: This present study provided a new publicly available Enteroscope Biopsy Histopathological Hematoxylin and Eosin Image Dataset for Image Segmentation Tasks (EBHI-Seg). To demonstrate the validity and extensiveness of EBHI-Seg, the experimental results for EBHI-Seg are evaluated using classical machine learning methods and deep learning methods. Results: The experimental results showed that deep learning methods had a better image segmentation performance when utilizing EBHI-Seg. The maximum accuracy of the Dice evaluation metric for the classical machine learning method is 0.948, while the Dice evaluation metric for the deep learning method is 0.965. Conclusion: This publicly available dataset contained 5,170 images of six types of tumor differentiation stages and the corresponding ground truth images. The dataset can provide researchers with new segmentation algorithms for medical diagnosis of colorectal cancer, which can be used in the clinical setting to help doctors and patients.

Submitted 6 December, 2022; v1 submitted 1 December, 2022; originally announced December 2022.

arXiv:2211.16703 [pdf, other]

An Efficient Split Fine-tuning Framework for Edge and Cloud Collaborative Learning

Authors: Shaohuai Shi, Qing Yang, Yang Xiang, Shuhan Qi, Xuan Wang

Abstract: To enable the pre-trained models to be fine-tuned with local data on edge devices without sharing data with the cloud, we design an efficient split fine-tuning (SFT) framework for edge and cloud collaborative learning. We propose three novel techniques in this framework. First, we propose a matrix decomposition-based method to compress the intermediate output of a neural network to reduce the communication volume between the edge device and the cloud server. Second, we eliminate particular links in the model without affecting the convergence performance in fine-tuning. Third, we implement our system atop PyTorch to allow users to easily extend their existing training scripts to enjoy the efficient edge and cloud collaborative learning. Experiments results on 9 NLP datasets show that our framework can reduce the communication traffic by 96 times with little impact on the model accuracy.

Submitted 29 November, 2022; originally announced November 2022.

Comments: 7 pages

arXiv:2211.15972 [pdf, ps, other]

Interlayer ferromagnetism and insulator-metal transition in element-doped CrI3 thin films

Authors: Shiyang Sun, Xuyan Chen, Xuqi Li, Huihui Zhang, Haidan Sang, Shifei Qi, Zhenhua Qiao

Abstract: The exploration of magnetism in two-dimensional layered materials has attracted extensive research interest. For the monoclinic phase CrI3 with interlayer antiferromagnetism, finding a static and robust way of realizing the intrinsic interlayer ferromagnetic coupling is desirable. In this Letter, we study the electronic structure and magnetic properties of the nonmagnetic element (e.g., O, S, Se, N, P, As and C) doped bi- and triple-layer CrI3 systems via first-principles calculations. Our results demonstrate that O, P, S, As, and Se doped CrI3 bilayer can realize interlayer ferromagnetism. Further analysis shows that the interlayer ferromagnetic coupling in the doped few-layer CrI3 is closely related to the formation of localized spin-polarized state. This finding indicates that insulated interlayer ferromagnetism can be realized at high doping concentration (larger than 8.33%). When the doping concentration is less than 8.33%, but larger than 2.08%, an insulator-metal phase transition can occur since the localized spin-polarized states percolate to form contiguous grids in few-layer CrI3.

Submitted 29 November, 2022; originally announced November 2022.

arXiv:2211.11016 [pdf]

doi 10.1088/1361-6463/acd560

The activated scaling behavior of quantum Griffiths singularity in two-dimensional superconductors

Authors: Zihan Cui, Longxin Pan, Jingchao Fang, Shichao Qi, Ying Xing, Haiwen Liu, Yi Liu, Jian Wang

Abstract: Quantum Griffiths singularity is characterized by the divergence of the dynamical critical exponent with the activated scaling law and has been widely observed in various two-dimensional superconductors. Recently, the direct activated scaling analysis with the irrelevant correction has been proposed and successfully used to analyze the experimental data of crystalline PdTe2 and polycrystalline β-W films, which provides new evidence of quantum Griffiths singularity. Here we show that the direct activated scaling analysis is applicable to the experimental data in different superconducting films, including tri-layer Ga films and LaAlO3/SrTiO3 interface superconductor. When taking the irrelevant correction into account, we calculate the corrected sheet resistance at ultralow temperatures. The scaling behavior of the corrected resistance in a comparably large temperature regime and the theoretical fitting of the phase boundary give unambiguous evidence of quantum Griffiths singularity. Compared to the previous method based on the finite size scaling, the direct activated scaling analysis represents a more direct and precise way to analyze the experimental data of quantum Griffiths singularity in diverse two-dimensional superconductors.

Submitted 20 November, 2022; originally announced November 2022.

arXiv:2211.08914 [pdf, other]

Dual Class-Aware Contrastive Federated Semi-Supervised Learning

Authors: Qi Guo, Yong Qi, Saiyu Qi, Di Wu

Abstract: Federated semi-supervised learning (FSSL), facilitates labeled clients and unlabeled clients jointly training a global model without sharing private data. Existing FSSL methods predominantly employ pseudo-labeling and consistency regularization to exploit the knowledge of unlabeled data, achieving notable success in raw data utilization. However, these training processes are hindered by large deviations between uploaded local models of labeled and unlabeled clients, as well as confirmation bias introduced by noisy pseudo-labels, both of which negatively affect the global model's performance. In this paper, we present a novel FSSL method called Dual Class-aware Contrastive Federated Semi-Supervised Learning (DCCFSSL). This method accounts for both the local class-aware distribution of each client's data and the global class-aware distribution of all clients' data within the feature space. By implementing a dual class-aware contrastive module, DCCFSSL establishes a unified training objective for different clients to tackle large deviations and incorporates contrastive information in the feature space to mitigate confirmation bias. Moreover, DCCFSSL introduces an authentication-reweighted aggregation technique to improve the server's aggregation robustness. Our comprehensive experiments show that DCCFSSL outperforms current state-of-the-art methods on three benchmark datasets and surpasses the FedAvg with relabeled unlabeled clients on CIFAR-10, CIFAR-100, and STL-10 datasets.
To our knowledge, we are the first to present an FSSL method that utilizes only 10% labeled clients, while still achieving superior performance compared to standard federated supervised learning, which uses all clients with labeled data.

Submitted 7 May, 2023; v1 submitted 16 November, 2022; originally announced November 2022.

arXiv:2210.17023 [pdf]

doi 10.1038/s41467-023-42988-8

Rotational symmetry breaking in superconducting nickelate Nd0.8Sr0.2NiO2 films

Authors: Haoran Ji, Yanan Li, Yi Liu, Xiang Ding, Zheyuan Xie, Shichao Qi, Liang Qiao, Yi-feng Yang, Guang-Ming Zhang, Jian Wang

Abstract: The infinite-layer nickelates, isostructural to the high-Tc superconductor cuprates, have risen as a promising platform to host unconventional superconductivity and stimulated growing interests in the condensed matter community. Despite numerous researches, the superconducting pairing symmetry of the nickelate superconductors, the fundamental characteristic of a superconducting state, is still under debate. Moreover, the strong electronic correlation in the nickelates may give rise to a rich phase diagram, where the underlying interplay between the superconductivity and other emerging quantum states with broken symmetry is awaiting exploration. Here, we study the angular dependence of the transport properties on the infinite-layer nickelate Nd0.8Sr0.2NiO2 superconducting films with Corbino-disk configuration. The azimuthal angular dependence of the magnetoresistance (R(φ)) manifests the rotational symmetry breaking from isotropy to four-fold (C4) anisotropy with increasing magnetic field, revealing a symmetry breaking phase transition. Approaching the low temperature and large magnetic field regime, an additional two-fold (C2) symmetric component in the R(φ) curves and an anomalous upturn of the temperature-dependent critical field are observed simultaneously, suggesting the emergence of an exotic electronic phase. Our work uncovers the evolution of the quantum states with different rotational symmetries and provides deep insight into the global phase diagram of the nickelate superconductors.

Submitted 30 October, 2022; originally announced October 2022.

Journal ref: Nat Commun 14, 7155 (2023)

arXiv:2210.06235 [pdf, other]

Listening to Users' Voice: Automatic Summarization of Helpful App Reviews

Authors: Cuiyun Gao, Yaoxian Li, Shuhan Qi, Yang Liu, Xuan Wang, Zibin Zheng, Qing Liao

Abstract: App reviews are crowdsourcing knowledge of user experience with the apps, providing valuable information for app release planning, such as major bugs to fix and important features to add. There exist prior explorations on app review mining for release planning, however, most of the studies strongly rely on pre-defined classes or manually-annotated reviews. Also, the new review characteristic, i.e., the number of users who rated the review as helpful, which can help capture important reviews, has not been considered previously. In the paper, we propose a novel framework, named SOLAR, aiming at accurately summarizing helpful user reviews to developers. The framework mainly contains three modules: The review helpfulness prediction module, topic-sentiment modeling module, and multi-factor ranking module. The review helpfulness prediction module assesses the helpfulness of reviews, i.e., whether the review is useful for developers. The topic-sentiment modeling module groups the topics of the helpful reviews and also predicts the associated sentiment, and the multi-factor ranking module aims at prioritizing semantically representative reviews for each topic as the review summary. Experiments on five popular apps indicate that SOLAR is effective for review summarization and promising for facilitating app release planning.

Submitted 12 October, 2022; originally announced October 2022.

arXiv:2210.05261 [pdf, other]

Once is Enough: A Light-Weight Cross-Attention for Fast Sentence Pair Modeling

Authors: Yuanhang Yang, Shiyi Qi, Chuanyi Liu, Qifan Wang, Cuiyun Gao, Zenglin Xu

Abstract: Transformer-based models have achieved great success on sentence pair modeling tasks, such as answer selection and natural language inference (NLI). These models generally perform cross-attention over input pairs, leading to prohibitive computational costs. Recent studies propose dual-encoder and late interaction architectures for faster computation. However, the balance between the expressive of cross-attention and computation speedup still needs better coordinated. To this end, this paper introduces a novel paradigm MixEncoder for efficient sentence pair modeling. MixEncoder involves a light-weight cross-attention mechanism. It conducts query encoding only once while modeling the query-candidate interaction in parallel. Extensive experiments conducted on four tasks demonstrate that our MixEncoder can speed up sentence pairing by over 113x while achieving comparable performance as the more expensive cross-attention models.

Submitted 22 October, 2023; v1 submitted 11 October, 2022; originally announced October 2022.

Comments: Accepted to EMNLP 2023

arXiv:2210.04685 [pdf, other]

doi 10.1007/s11433-023-2247-x

The Nearest Neutron Star Candidate in a Binary Revealed by Optical Time-domain Surveys

Authors: Ling-Lin Zheng, Mouyuan Sun, Wei-Min Gu, Tuan Yi, Zhi-Xiang Zhang, Pei Wang, Junfeng Wang, Jianfeng Wu, Shan-Shan Weng, Song Wang, Sen-Yu Qi, Jia Zhang, Chun-Qian Li, Jian-Rong Shi, Yong Shao, Xiang-Dong Li, Jin-Bo Fu, Fan Yang, Zhongrui Bai, Yu Bai, Haotong Zhang, Jifeng Liu

Abstract: The near-Earth (within $\sim 100$ pc) supernova explosions in the past several million years can cause the global deposition of radioactive elements (e.g., $^{60}$Fe) on Earth. The remnants of such supernovae are too old to be easily identified. It is therefore of great interest to search for million-year-old near-Earth neutron stars or black holes, the products of supernovae. However, neutron stars and black holes are challenging to find even in our Solar neighbourhood if they are not radio pulsars or X-ray/$γ$-ray emitters. Here we report the discovery of one of the nearest ($127.7 \pm 0.3$ pc) neutron star candidates in a detached single-lined spectroscopic binary LAMOST J235456.73+335625.9 (hereafter J2354). Utilizing the time-resolved ground-based spectroscopy and space photometry, we find that J2354 hosts an unseen compact object with $M_{\mathrm{inv}}$ being $1.4 \sim 1.6\ M_{\odot}$. The follow-up Swift ultraviolet (UV) and X-ray observations suggest that the UV and X-ray emission is produced by the visible star rather than the compact object. Hence, J2354 probably harbours a neutron star rather than a hot ultramassive white dwarf. Two-hour exceptionally sensitive radio follow-up observations with Five-hundred-meter Aperture Spherical radio Telescope fail to reveal any pulsating radio signals at the $6σ$ flux upper limit of $12.5\ μ\mathrm{Jy}$. Therefore, the neutron star candidate in J2354 can only be revealed via our time-resolved observations.
Interestingly, the distance between J2354 and our Earth can be as close as $\sim 50$ pc around $2.5$ Myrs ago, as revealed by the Gaia kinematics. Our discovery demonstrates a promising way to unveil the hidden near-Earth neutron stars in binaries by exploring the optical time domain, thereby facilitating understanding of the metal-enrichment history in our Solar neighbourhood.

Submitted 27 November, 2023; v1 submitted 7 October, 2022; originally announced October 2022.

Comments: 26 pages, 13 figures

Journal ref: Sci. China-Phys. Mech. Astron. 66, 129512 (2023)

arXiv:2210.04552 [pdf]

Intrinsic motivation, Need for cognition, Grit, Growth Mindset and Academic Achievement in High School Students: Latent Profiles and Its Predictive Effects

Authors: Jun Wu, Shuoli Qi, Yueshan Zhong

Abstract: Recent efforts to identify non-cognitive predictors of academic achievement have especially focused on self-constructs, whose measurement is concerned with a specific domain (e.g., mathematics). However, other important factors, such as character and motivation, have received less attention. Additionally, the predictive accuracy of non-cognitive factors lacks evidence from subjects including English and Science. In this study, we take a person-centered approach and focus on students' intrinsic motivation, need for cognition, grit, and growth mindset. We mainly focus on how these factors predict students' mathematics, English, and science grades between 9th grade and 12th grade. 2,308 samples from high school students in Boston (Female = 1,237; aged from 13 to 17). The research results indicated that: (1) four latent profiles of students emerged: High in grit students (n = 997, 43.2%, higher scores of grit); Moderate students (n = 905, 38.3%, moderate in all scores); High in intrinsic motivation students (n = 252, 11.8%, higher scores of intrinsic motivation); Low in grit students (n = 154, 6.7%, lower scores of grit); (2) students' gender, race, maternal education level, and social-economic ranking predicted the profiles; and (3) four profiles of students had a significant predictive effect on Mathematics, Science and English scores in both 9th grade and 12th grade. We discussed the importance of character education for adolescents and motivation for learning in high school.

Submitted 10 October, 2022; originally announced October 2022.

Comments: 24 pages, 2 tables, 2 figures

MSC Class: 62P15 (Primary) ACM Class: G.2

arXiv:2209.13924 [pdf, other]

doi 10.3847/1538-4357/ac853f

A White Dwarf-Main Sequence Binary Unveiled by Time-Domain Observations from LAMOST and TESS

Authors: Ling-Lin Zheng, Wei-Min Gu, Mouyuan Sun, Zhixiang Zhang, Tuan Yi, Jianfeng Wu, Junfeng Wang, Jin-Bo Fu, Sen-Yu Qi, Fan Yang, Song Wang, Liang Wang, Zhongrui Bai, Haotong Zhang, Chun-Qian Li, Jian-Rong Shi, Weikai Zong, Yu Bai, Jifeng Liu

Abstract: We report a single-lined white dwarf-main sequence binary system, LAMOST J172900.17+652952.8, which is discovered by LAMOST's medium resolution time-domain surveys. The radial velocity semi-amplitude and orbital period of the optical visible star are measured by using the Palomar 200-inch telescope follow-up observations and the light curves from TESS. Thus the mass function of the invisible candidate white dwarf is derived, $f(M_{\rm{2}}) = 0.120\,\pm\,0.003\,M_{\odot}$. The mass of the visible star is measured based on the spectral energy distribution fitting, $M_{\mathrm{1}}$ = $0.81^{+0.07}_{-0.06}\,M_{\odot}$. Hence, the mass of its invisible companion is $M_{\rm{2}}\,\gtrsim\,0.63\,M_{\odot}$. The companion ought to be a compact object rather than a main-sequence star owing to the mass ratio $q = M_{\rm{2}} / M_{\rm 1} \gtrsim 0.78$ and the single-lined spectra. The compact object is likely to be a white dwarf except for small inclination angle, $i\,\lesssim\,40^{\circ}$. By using the GALEX NUV flux, the effective temperature of the white dwarf candidate is constrained as $T_{\rm eff}^{\rm WD}\,\lesssim\,12000-13500$ K. It is difficult to detect white dwarfs which are outshone by their bright companions via single-epoch optical spectroscopic surveys. Therefore, the optical time-domain surveys can play an important role in unveiling invisible white dwarfs and other compact objects in binaries.

Submitted 28 September, 2022; originally announced September 2022.

Comments: 15 pages, 6 figures, ApJ, 936, 33
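
Note on the abstract above: the step from the mass function to the lower limit on the companion mass can be checked numerically. The sketch below assumes the standard binary mass function, f(M2) = (M2 sin i)^3 / (M1 + M2)^2, which the abstract does not write out, and solves it for M2 at an edge-on inclination (sin i = 1), where M2 takes its minimum. The function name `min_companion_mass` and the bisection bounds are illustrative choices, not taken from the paper, and the quoted uncertainties are ignored.

```python
def min_companion_mass(f_m, m1, sin_i=1.0, tol=1e-10):
    """Solve f_m = (M2 * sin_i)**3 / (m1 + M2)**2 for M2 by bisection.

    All masses are in solar masses. The standard binary mass function
    is assumed here; it is not spelled out in the abstract.
    """
    def residual(m2):
        return (m2 * sin_i) ** 3 / (m1 + m2) ** 2 - f_m

    # The left-hand side grows monotonically with M2, so one root lies
    # between 0 and a generous (arbitrary) upper bound of 100 solar masses.
    lo, hi = 0.0, 100.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


# Central values quoted in the abstract: f(M2) = 0.120, M1 = 0.81.
m2_min = min_companion_mass(0.120, 0.81)
q_min = m2_min / 0.81
print(f"M2 >= {m2_min:.2f} Msun, q >= {q_min:.2f}")
```

With the central values this gives a minimum M2 of about 0.63 solar masses and a mass ratio of about 0.78, matching the limits quoted in the abstract. Passing a smaller `sin_i` raises the inferred companion mass, which is why the white dwarf interpretation weakens at low inclination.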

arXiv:2209.12141 [pdf, other]

doi 10.1038/s41550-022-01766-0

A dynamically discovered and characterized non-accreting neutron star -- M dwarf binary candidate

Authors: Tuan Yi, Wei-Min Gu, Zhi-Xiang Zhang, Ling-Lin Zheng, Mouyuan Sun, Junfeng Wang, Zhongrui Bai, Pei Wang, Jianfeng Wu, Yu Bai, Song Wang, Haotong Zhang, Yize Dong, Yong Shao, Xiang-Dong Li, Jia Zhang, Yang Huang, Fan Yang, Qingzheng Yu, Hui-Jun Mu, Jin-Bo Fu, Senyu Qi, Jing Guo, Xuan Fang, Chuanjie Zheng, et al. (4 additional authors not shown)

Abstract: Optical time-domain surveys can unveil and characterize exciting but less-explored non-accreting and/or non-beaming neutron stars (NS) in binaries. Here we report the discovery of such a NS candidate using the LAMOST spectroscopic survey. The candidate, designated LAMOST J112306.9+400736 (hereafter J1123), is in a single-lined spectroscopic binary containing an optically visible M star. The star's large radial velocity variation and ellipsoidal variations indicate a relatively massive unseen companion. Utilizing follow-up spectroscopy from the Palomar 200-inch telescope and high-precision photometry from TESS, we measure a companion mass of $1.24_{-0.03}^{+0.03}~M_{\odot}$. Main-sequence stars with this mass are ruled out, leaving a NS or a massive white dwarf (WD). Although a massive WD cannot be ruled out, the lack of UV excess radiation from the companion supports the NS hypothesis. Deep radio observations with FAST yielded no detections of either pulsed or persistent emission. J1123 is not detected in numerous X-ray and gamma-ray surveys. These non-detections suggest that the NS candidate is not presently accreting and pulsing. Our work exemplifies the capability of discovering compact objects in non-accreting close binaries by synergizing the optical time-domain spectroscopy and high-cadence photometry.

Submitted 25 September, 2022; originally announced September 2022.

Comments: 53 pages, 15 figures, publication in Nature Astronomy

arXiv:2208.10731 [pdf, other]

FedMCSA: Personalized Federated Learning via Model Components Self-Attention

Authors: Qi Guo, Yong Qi, Saiyu Qi, Di Wu, Qian Li

Abstract: Federated learning (FL) facilitates multiple clients to jointly train a machine learning model without sharing their private data. However, Non-IID data of clients presents a tough challenge for FL. Existing personalized FL approaches rely heavily on the default treatment of one complete model as a basic unit and ignore the significance of different layers on Non-IID data of clients. In this work, we propose a new framework, federated model components self-attention (FedMCSA), to handle Non-IID data in FL, which employs model components self-attention mechanism to granularly promote cooperation between different clients. This mechanism facilitates collaboration between similar model components while reducing interference between model components with large differences. We conduct extensive experiments to demonstrate that FedMCSA outperforms the previous methods on four benchmark datasets. Furthermore, we empirically show the effectiveness of the model components self-attention mechanism, which is complementary to existing personalized FL and can significantly improve the performance of FL.

Submitted 23 August, 2022; originally announced August 2022.

Comments: The first submission of this work is to AAAI2022 in 20210829. Now the new submission for review

arXiv:2208.02976 [pdf, ps, other]

doi 10.1103/PhysRevA.107.013702

Floquet generation of magnonic NOON state

Authors: Shi-fan Qi, Jun Jing

Abstract: We propose a concise and deterministic protocol to generate NOON states in a hybrid system consisting of a superconducting qubit, a circuit resonator mode, and two magnonic modes, based on Floquet engineering. In particular, we construct a time-reversal-symmetry broken Hamiltonian for chiral state propagation of the three continuous-variable modes depending on qubit state, by the time modulation over qubit-resonator interaction and magnon frequency. Then an arbitrary magnonic NOON state can be generated by a typical preparing-and-measurement procedure. We analyze the robustness of our protocol against the systematic errors in the qubit-magnon coupling strength, the Floquet-driving intensity, the frequency mismatch of the magnons, and the counter-rotating interactions. We can obtain a high-fidelity NOON state in the presence of the quantum dissipation on all components.

Submitted 4 January, 2023; v1 submitted 4 August, 2022; originally announced August 2022.

Journal ref: Phys. Rev. A 107, 013702 (2023)

arXiv:2207.09135 [pdf, other]

doi 10.1109/TIFS.2023.3234861

Shrinking the Semantic Gap: Spatial Pooling of Local Moment Invariants for Copy-Move Forgery Detection

Authors: Chao Wang, Zhiqiu Huang, Shuren Qi, Yaoshen Yu, Guohua Shen, Yushu Zhang

Abstract: Copy-move forgery is a manipulation of copying and pasting specific patches from and to an image, with potentially illegal or unethical uses. Recent advances in the forensic methods for copy-move forgery have shown increasing success in detection accuracy and robustness. However, for images with high self-similarity or strong signal corruption, the existing algorithms often exhibit inefficient processes and unreliable results. This is mainly due to the inherent semantic gap between low-level visual representation and high-level semantic concept. In this paper, we present a very first study of trying to mitigate the semantic gap problem in copy-move forgery detection, with spatial pooling of local moment invariants for midlevel image representation. Our detection method expands the traditional works on two aspects: 1) we introduce the bag-of-visual-words model into this field for the first time, may meaning a new perspective of forensic study; 2) we propose a word-to-phrase feature description and matching pipeline, covering the spatial structure and visual saliency information of digital images. Extensive experimental results show the superior performance of our framework over state-of-the-art algorithms in overcoming the related problems caused by the semantic gap.

Submitted 17 January, 2023; v1 submitted 19 July, 2022; originally announced July 2022.

Comments: Accepted by IEEE Transactions on Information Forensics and Security, 2023, https://ieeexplore.ieee.org/document/10007894

Journal ref: IEEE Transactions on Information Forensics and Security, vol. 18, pp. 1064-1079, 2023

arXiv:2207.05434 [pdf, other]

doi 10.3847/1538-4357/ac9b4c

Searching for compact objects in binaries with Gaia DR3

Authors: Jin-Bo Fu, Wei-Min Gu, Zhi-Xiang Zhang, Tuan Yi, Sen-Yu Qi, Ling-Lin Zheng, Jifeng Liu

Abstract: We search for compact objects in binaries based on Gaia DR3. A sample of ten targets is derived under the conditions: radial velocity variable, low temperature ($T_{\rm eff} < 6000$ K), high mass function ($f(M_2) > 1 M_\odot$), and ellipsoidal-like light curves. Two targets have LAMOST spectroscopic observations, one of which is a double-lined spectroscopic binary. The observational data of seven targets are not self-consistent, since their photometric periods are even shorter than the theoretical minimum orbital periods calculated by the stellar parameters from Gaia DR3. After excluding these seven inconsistent targets and another target contaminated by a near-bright star, the remaining two targets may contain compact objects worth follow-up observations. This work may serve as an example to demonstrate the feasibility of searching for compact objects in the massive Gaia data.

Submitted 18 October, 2022; v1 submitted 12 July, 2022; originally announced July 2022.

Comments: 15 pages, 8 figures, accepted for publication in ApJ

arXiv:2207.04285 [pdf, other]

A Closer Look into Transformer-Based Code Intelligence Through Code Transformation: Challenges and Opportunities

Authors: Yaoxian Li, Shiyi Qi, Cuiyun Gao, Yun Peng, David Lo, Zenglin Xu, Michael R. Lyu

Abstract: Transformer-based models have demonstrated state-of-the-art performance in many intelligent coding tasks such as code comment generation and code completion. Previous studies show that deep learning models are sensitive to the input variations, but few studies have systematically studied the robustness of Transformer under perturbed input code. In this work, we empirically study the effect of semantic-preserving code transformation on the performance of Transformer. Specifically, 24 and 27 code transformation strategies are implemented for two popular programming languages, Java and Python, respectively. For facilitating analysis, the strategies are grouped into five categories: block transformation, insertion/deletion transformation, grammatical statement transformation, grammatical token transformation, and identifier transformation. Experiments on three popular code intelligence tasks, including code completion, code summarization and code search, demonstrate insertion/deletion transformation and identifier transformation show the greatest impact on the performance of Transformer. Our results also suggest that Transformer based on abstract syntax trees (ASTs) shows more robust performance than the model based on only code sequence under most code transformations. Besides, the design of positional encoding can impact the robustness of Transformer under code transformation. Based on our findings, we distill some insights about the challenges and opportunities for Transformer-based code intelligence.

Submitted 9 July, 2022; originally announced July 2022.

arXiv:2207.01426 [pdf, other]

Dynamic Contrastive Distillation for Image-Text Retrieval

Authors: Jun Rao, Liang Ding, Shuhan Qi, Meng Fang, Yang Liu, Li Shen, Dacheng Tao

Abstract: Although the vision-and-language pretraining (VLP) equipped cross-modal image-text retrieval (ITR) has achieved remarkable progress in the past two years, it suffers from a major drawback: the ever-increasing size of VLP models restricts its deployment to real-world search scenarios (where the high latency is unacceptable). To alleviate this problem, we present a novel plug-in dynamic contrastive distillation (DCD) framework to compress the large VLP models for the ITR task. Technically, we face the following two challenges: 1) the typical uni-modal metric learning approach is difficult to directly apply to the cross-modal tasks, due to the limited GPU memory to optimize too many negative samples during handling cross-modal fusion features; 2) it is inefficient to statically optimize the student network from different hard samples, which have different effects on distillation learning and student network optimization. We try to overcome these challenges from two points. First, to achieve multi-modal contrastive learning, and balance the training costs and effects, we propose to use a teacher network to estimate the difficult samples for students, making the students absorb the powerful knowledge from pre-trained teachers, and master the knowledge from hard samples. Second, to dynamically learn from hard sample pairs, we propose dynamic distillation to dynamically learn samples of different difficulties, from the perspective of better balancing the difficulty of knowledge and students' self-learning ability.
We successfully apply our proposed DCD strategy to two state-of-the-art vision-language pretrained models, i.e. ViLT and METER. Extensive experiments on MS-COCO and Flickr30K benchmarks show the effectiveness and efficiency of our DCD framework. Encouragingly, we can speed up the inference at least 129$\times$ compared to the existing ITR models.

Submitted 4 July, 2022; originally announced July 2022.

arXiv:2206.09540 [pdf, ps, other]

doi 10.1103/PhysRevA.106.033711

Chiral current in Floquet cavity-magnonics

Authors: Shi-fan Qi, Jun Jing

Abstract: Floquet engineering can induce complex collective behaviour and interesting synthetic gauge-field in quantum systems through temporal modulation of system parameters by periodic drives. Using a Floquet drive on frequencies of the magnon modes, we realize a chiral state-transfer in a cavity-magnonic system. The time-reversal symmetry is broken in such a promising platform for coherent information processing. In particular, the photon mode is adiabatically eliminated in the large-detuning regime and the magnon modes under conditional longitudinal drives can be indirectly coupled to each other with a phase-modulated interaction. The effective Hamiltonian is then used to generate chiral currents in a circular loop, whose dynamics is evaluated to measure the symmetry of the system Hamiltonian. Beyond the dynamics limited in the manifold with a fixed number of excitations, our protocol applies to the continuous-variable systems with arbitrary states. Also it is found to be robust against the systematic errors in the photon-magnon coupling strength and Kerr nonlinearity.

Submitted 10 October, 2022; v1 submitted 19 June, 2022; originally announced June 2022.

Journal ref: Physical Review A 106, 033711 (2022)

arXiv:2205.15308 [pdf, other]

Parameter-Efficient and Student-Friendly Knowledge Distillation

Authors: Jun Rao, Xv Meng, Liang Ding, Shuhan Qi, Dacheng Tao

Abstract: Knowledge distillation (KD) has been extensively employed to transfer the knowledge from a large teacher model to the smaller students, where the parameters of the teacher are fixed (or partially fixed) during training. Recent studies show that this mode may cause difficulties in knowledge transfer due to the mismatched model capacities. To alleviate the mismatch problem, teacher-student joint training methods, e.g., online distillation, have been proposed, but they always require expensive computational cost. In this paper, we present a parameter-efficient and student-friendly knowledge distillation method, namely PESF-KD, to achieve efficient and sufficient knowledge transfer by updating relatively few partial parameters. Technically, we first mathematically formulate the mismatch as the sharpness gap between their predictive distributions, where we show such a gap can be narrowed with the appropriate smoothness of the soft label. Then, we introduce an adapter module for the teacher and only update the adapter to obtain soft labels with appropriate smoothness. Experiments on a variety of benchmarks show that PESF-KD can significantly reduce the training cost while obtaining competitive results compared to advanced online distillation methods. Code will be released upon acceptance.

Submitted 28 May, 2022; originally announced May 2022.

arXiv:2205.13522 [pdf, other]

Dynamically Relative Position Encoding-Based Transformer for Automatic Code Edit

Authors: Shiyi Qi, Yaoxian Li, Cuiyun Gao, Xiaohong Su, Shuzheng Gao, Zibin Zheng, Chuanyi Liu

Abstract: Adapting Deep Learning (DL) techniques to automate non-trivial coding activities, such as code documentation and defect detection, has been intensively studied recently. Learning to predict code changes is one of the popular and essential investigations. Prior studies have shown that DL techniques such as Neural Machine Translation (NMT) can benefit meaningful code changes, including bug fixing and code refactoring. However, NMT models may encounter a bottleneck when modeling long sequences, and thus are limited in accurately predicting code changes. In this work, we design a Transformer-based approach, considering that Transformer has proven effective in capturing long-term dependencies. Specifically, we propose a novel model named DTrans. For better incorporating the local structure of code, i.e., statement-level information in this paper, DTrans is designed with dynamically relative position encoding in the multi-head attention of Transformer. Experiments on benchmark datasets demonstrate that DTrans can more accurately generate patches than the state-of-the-art methods, increasing the performance by at least 5.45%-46.57% in terms of the exact match metric on different datasets. Moreover, DTrans can locate the lines to change with 1.75%-24.21% higher accuracy than the existing methods.

Submitted 31 July, 2022; v1 submitted 26 May, 2022; originally announced May 2022.

arXiv:2205.05248 [pdf, other]

Efficient Distributed Framework for Collaborative Multi-Agent Reinforcement Learning

Authors: Shuhan Qi, Shuhao Zhang, Xiaohan Hou, Jiajia Zhang, Xuan Wang, Jing Xiao

Abstract: Multi-agent reinforcement learning for incomplete information environments has attracted extensive attention from researchers. However, due to the slow sample collection and poor sample exploration, there are still some problems in multi-agent reinforcement learning, such as unstable model iteration and low training efficiency. Moreover, most of the existing distributed frameworks are proposed for single-agent reinforcement learning and are not suitable for the multi-agent setting. In this paper, we design a distributed MARL framework based on the actor-work-learner architecture. In this framework, multiple asynchronous environment interaction modules can be deployed simultaneously, which greatly improves the sample collection speed and sample diversity. Meanwhile, to make full use of computing resources, we decouple the model iteration from environment interaction, and thus accelerate the policy iteration. Finally, we verify the effectiveness of the proposed framework in the MaCA military simulation environment and the SMAC 3D realtime strategy gaming environment with incomplete information characteristics.

Submitted 10 May, 2022; originally announced May 2022.

Comments: 9 pages, 20 figures

arXiv:2204.10973 [pdf, other]

Detecting Recolored Image by Spatial Correlation

Authors: Yushu Zhang, Nuo Chen, Shuren Qi, Mingfu Xue, Xiaochun Cao

Abstract: Image forensics, aiming to ensure the authenticity of the image, has made great progress in dealing with common image manipulation such as copy-move, splicing, and inpainting in the past decades. However, only a few researchers pay attention to an emerging editing technique called image recoloring, which can manipulate the color values of an image to give it a new style. To prevent it from being used maliciously, the previous approaches address the conventional recoloring from the perspective of inter-channel correlation and illumination consistency. In this paper, we try to explore a solution from the perspective of the spatial correlation, which exhibits the generic detection capability for both conventional and deep learning-based recoloring. Through theoretical and numerical analysis, we find that the recoloring operation will inevitably destroy the spatial correlation between pixels, implying a new prior of statistical discriminability. Based on this fact, we generate a set of spatial correlation features and learn the informative representation from the set via a convolutional neural network. To train our network, we use three recoloring methods to generate a large-scale and high-quality data set. Extensive experimental results in two recoloring scenes demonstrate that the spatial correlation features are highly discriminative. Our method achieves the state-of-the-art detection accuracy on multiple benchmark datasets and generalizes well to unknown types of recoloring methods.

Submitted 22 April, 2022; originally announced April 2022.

Comments: 11 pages, 13 figures

arXiv:2204.08382 [pdf, ps, other]

Subspace Nonnegative Matrix Factorization for Feature Representation

Authors: Junhang Li, Jiao Wei, Can Tong, Tingting Shen, Yuchen Liu, Chen Li, Shouliang Qi, Yudong Yao, Yueyang Teng

Abstract: Traditional nonnegative matrix factorization (NMF) learns a new feature representation on the whole data space, which means treating all features equally. However, a subspace is often sufficient for accurate representation in practical applications, and redundant features can be invalid or even harmful. For example, if a camera has some sensors destroyed, then the corresponding pixels in the photos from this camera are not helpful to identify the content, which means only the subspace consisting of remaining pixels is worthy of attention. This paper proposes a new NMF method by introducing adaptive weights to identify key features in the original space so that only a subspace is involved in generating the new representation. Two strategies are proposed to achieve this: the fuzzier weighted technique and entropy regularized weighted technique, both of which result in an iterative solution with a simple form. Experimental results on several real-world datasets demonstrated that the proposed methods can generate a more accurate feature representation than existing methods. The code developed in this study is available at https://github.com/WNMF1/FWNMF-ERWNMF.

Submitted 18 April, 2022; originally announced April 2022.

arXiv:2203.14373 [pdf]

doi 10.1021/acsami.2c23251

Sustainable and effective antimicrobial surface based on cellulose thin films

Authors: Shaojun Qi, Ioannis Kiratzis, Pavan Adoni, Zania Stamataki, Aneesa Nabi, David Waugh, Javier Rodriguez Rodriguez, Stuart Clarke, Peter J Fryer, Zhenyu J Zhang

Abstract: In the present work, we developed a sustainable and effective antimicrobial surface film based on Micro-Fibrillated Cellulose. The resulting porous cellulose thin film is barely noticeable to human eyes due to its sub-micron thickness, of which the coverage, porosity and microstructure can be modulated by the formulations developed. Using goniometers and a quartz crystal microbalance (QCM), we observed a threefold reduction in water contact angles and accelerated (more than 50% faster) water evaporation kinetics on the cellulose film. The thin film exhibits not only a rapid inactivation effect against SARS-CoV-2 in 5 minutes, following deposition of the virus loaded droplets, but also an exceptional ability to reduce contact transfer of liquid, e.g. respiratory droplets, onto surfaces such as artificial skin by more than 90%. It also exhibits excellent antimicrobial performance in inhibiting the growth of both gram-negative and gram-positive bacteria (E. coli and S. epidermidis) due to the excellent porosity and hydrophilicity. Additionally, the cellulose film shows nearly 100% resistance to skin scraping in dry condition thanks to its strong attachment to the substrate, whilst good removability once wetted, suggesting its practical suitability for daily use. Importantly, the coating can be formed on solid substrates readily by spraying and requires solely a simple formulation of a plant-based cellulose material with no additives, rendering it a scalable, affordable and green solution for antimicrobial surfaces.
Implementing such cellulose films could thus play a significant role in controlling future pan- and epidemics, in particular during the first phase when appropriate medication needs to be developed.

Submitted 27 March, 2022; originally announced March 2022.

arXiv:2203.06879 [pdf, ps, other]

doi 10.1007/s11433-022-2017-6

Planckian Dissipation and non-Ginzburg-Landau Type Upper Critical Field in Bi2201

Authors: Qihao Zang, Zhengyan Zhu, Zuyu Xu, Shichao Qi, Haoran Ji, Yiwen Li, Jian Wang, Huiqian Luo, Hua-Bing Wang, Hai-Hu Wen

Abstract: Resistivity and Hall effect measurements have been carried out on a micro-fabricated bridge of Bi2201 single crystal at low temperatures down to 0.4 K under high magnetic fields. When superconductivity is crashed by a high magnetic field, the recovered "normal state" resistivity still shows a linear temperature dependence in the low temperature region. Combining with the effective mass and the charge carrier density, we get a linear scattering rate $1/\tau = \alpha k_{B} T/\hbar$ with $0.77<\alpha<1.16$, which gives strong evidence of the Planckian dissipation. Furthermore, our results reveal a new type of temperature dependence of upper critical field, $H_{c2}(T)=H^*\sqrt{(1-t)/(t+0.154)}$, which is totally different from the expectation of the Ginzburg-Landau theory, and suggests uncondensed Cooper pairs above the $H_{c2}(T)$ line.

Submitted 22 February, 2023; v1 submitted 14 March, 2022; originally announced March 2022.

Comments: 8 pages, 4 figures

Journal ref: Sci. China-Phys. Mech. Astron. 66, 237412 (2023)

arXiv:2203.03853 [pdf, other]

Where Does the Performance Improvement Come From? -- A Reproducibility Concern about Image-Text Retrieval

Authors: Jun Rao, Fei Wang, Liang Ding, Shuhan Qi, Yibing Zhan, Weifeng Liu, Dacheng Tao

Abstract: This article aims to provide the information retrieval community with some reflections on recent advances in retrieval learning by analyzing the reproducibility of image-text retrieval models. Due to the increase of multimodal data over the last decade, image-text retrieval has steadily become a major research direction in the field of information retrieval. Numerous researchers train and evaluate image-text retrieval algorithms using benchmark datasets such as MS-COCO and Flickr30k. Research in the past has mostly focused on performance, with multiple state-of-the-art methodologies being suggested in a variety of ways. According to their assertions, these techniques provide improved modality interactions and hence more precise multimodal representations. In contrast to previous works, we focus on the reproducibility of the approaches and the examination of the elements that lead to improved performance by pretrained and nonpretrained models in retrieving images and text. To be more specific, we first examine the related reproducibility concerns and explain why our focus is on image-text retrieval tasks. Second, we systematically summarize the current paradigm of image-text retrieval models and the stated contributions of those approaches. Third, we analyze various aspects of the reproduction of pretrained and nonpretrained retrieval models. To complete this, we conducted ablation experiments and obtained some influencing factors that affect retrieval recall more than the improvement claimed in the original paper.
Finally, we present some reflections and challenges that the retrieval community should consider in the future. Our source code is publicly available at https://github.com/WangFei-2019/Image-text-Retrieval.

Submitted 27 August, 2022; v1 submitted 8 March, 2022; originally announced March 2022.

Comments: SIGIR 2022

arXiv:2203.00913 [pdf, other]

doi 10.1109/TPAMI.2022.3204971

A Principled Design of Image Representation: Towards Forensic Tasks

Authors: Shuren Qi, Yushu Zhang, Chao Wang, Jiantao Zhou, Xiaochun Cao

Abstract: Image forensics is a rising topic as the trustworthy multimedia content is critical for modern society. Like other vision-related applications, forensic analysis relies heavily on the proper image representation. Despite the importance, current theoretical understanding for such representation remains limited, with varying degrees of neglect for its key role. For this gap, we attempt to investigate the forensic-oriented image representation as a distinct problem, from the perspectives of theory, implementation, and application. Our work starts from the abstraction of basic principles that the representation for forensics should satisfy, especially revealing the criticality of robustness, interpretability, and coverage. At the theoretical level, we propose a new representation framework for forensics, called Dense Invariant Representation (DIR), which is characterized by stable description with mathematical guarantees. At the implementation level, the discrete calculation problems of DIR are discussed, and the corresponding accurate and fast solutions are designed with generic nature and constant complexity. We demonstrate the above arguments on the dense-domain pattern detection and matching experiments, providing comparison results with state-of-the-art descriptors. Also, at the application level, the proposed DIR is initially explored in passive and active forensics, namely copy-move forgery detection and perceptual hashing, exhibiting the benefits in fulfilling the requirements of such forensic tasks.

Submitted 6 October, 2022; v1 submitted 2 March, 2022; originally announced March 2022.

Comments: Accepted by IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022, https://ieeexplore.ieee.org/document/9881995/

arXiv:2202.09538 [pdf, other]

Improving the Level of Autism Discrimination through GraphRNN Link Prediction

Authors: Haonan Sun, Qiang He, Shouliang Qi, Yudong Yao, Yueyang Teng

Abstract: The dataset is the key to deep learning in Autism disease research. However, due to the small quantity and heterogeneity of samples in current datasets, for example ABIDE (Autism Brain Imaging Data Exchange), the recognition research is not effective enough. Previous studies mostly focused on optimizing feature selection methods and data reinforcement to improve accuracy. This paper is based on the latter technique, which learns the edge distribution of the real brain network through GraphRNN, and generates synthetic data which has an incentive effect on the discriminant model. The experimental results show that the combination of original and synthetic data greatly improves the discrimination of the neural network. For instance, the most significant effect is the 50-layer ResNet, and the best generation model is GraphRNN, which improves the accuracy by 32.51% compared with the model reference experiment without generation data reinforcement. The generated data comes from the learned edge connection distribution of the functional connectivity of Autism patients and typical controls, yet it has a better effect than the original data, which has constructive significance for further understanding of the disease mechanism and development.

Submitted 19 February, 2022; originally announced February 2022.

arXiv:2201.12536 [pdf, ps, other]

doi 10.1103/PhysRevA.105.053710

Accelerated adiabatic passage in cavity magnomechanics

Authors: Shi-fan Qi, Jun Jing

Abstract: Cavity magnomechanics provides a readily-controllable hybrid system, consisting of cavity mode, magnon mode, and phonon mode, for quantum state manipulation. To implement a fast-and-robust state transfer between the hybrid photon-magnon mode and the phonon mode, we propose two accelerated adiabatic-passage protocols individually based on the counterdiabatic Hamiltonian for transitionless quantum driving and the Lewis-Riesenfeld invariant for inverse engineering. Both the counterdiabatic Hamiltonian and the Lewis-Riesenfeld invariant generally apply to the continuous-variable systems with arbitrary target states. It is interesting to find that our counterdiabatic Hamiltonian can be constructed in terms of the creation and annihilation operators rather than the system-eigenstates and their time-derivatives. Our protocol can be optimized with respect to the stability against the systematic errors of coupling strength and frequency detuning. It contributes to a quantum memory for photonic and magnonic quantum information. We also discuss the effects from dissipation and the counter-rotating interactions.

Submitted 29 January, 2022; originally announced January 2022.

Report number: Phys. Rev. A 105, 053710

Journal ref: Phys. Rev. A 105, 053710 (2022)

arXiv:2112.08232 [pdf]

doi 10.1088/1361-6560/ac7193

RA V-Net: Deep learning network for automated liver segmentation

Authors: Zhiqi Lee, Sumin Qi, Chongchong Fan, Ziwei Xie

Abstract: Accurate segmentation of the liver is a prerequisite for the diagnosis of disease. Automated segmentation is an important application of computer-aided detection and diagnosis of liver disease. In recent years, automated processing of medical images has gained breakthroughs. However, the low contrast of abdominal scan CT images and the complexity of liver morphology make accurate automatic segment… ▽ More Accurate segmentation of the liver is a prerequisite for the diagnosis of disease. Automated segmentation is an important application of computer-aided detection and diagnosis of liver disease. In recent years, automated processing of medical images has gained breakthroughs. However, the low contrast of abdominal scan CT images and the complexity of liver morphology make accurate automatic segmentation challenging. In this paper, we propose RA V-Net, which is an improved medical image automatic segmentation model based on U-Net. It has the following three main innovations. CofRes Module (Composite Original Feature Residual Module) is proposed. With more complex convolution layers and skip connections to make it obtain a higher level of image feature extraction capability and prevent gradient disappearance or explosion. AR Module (Attention Recovery Module) is proposed to reduce the computational effort of the model. In addition, the spatial features between the data pixels of the encoding and decoding modules are sensed by adjusting the channels and LSTM convolution. Finally, the image features are effectively retained. CA Module (Channel Attention Module) is introduced, which used to extract relevant channels with dependencies and strengthen them by matrix dot product, while weakening irrelevant channels without dependencies. The purpose of channel attention is achieved. The attention mechanism provided by LSTM convolution and CA Module are strong guarantees for the performance of the neural network. 
U-Net achieves accuracy 0.9862, precision 0.9118, DSC 0.8547, and JSC 0.82. RA V-Net achieves accuracy 0.9968, precision 0.9597, DSC 0.9654, and JSC 0.9414. DSC, the most representative metric for segmentation quality, improves by 0.1107 over U-Net, and JSC improves by 0.1214.
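For readers unfamiliar with the two overlap metrics quoted in this abstract, the following is a minimal sketch of how the Dice similarity coefficient (DSC) and Jaccard similarity coefficient (JSC) are commonly computed for binary masks, followed by a check of the reported gains. The function name `dice_and_jaccard` and the toy masks are assumptions made for illustration; they are not taken from the paper, and the paper's own evaluation code may differ.

```python
def dice_and_jaccard(pred, truth):
    """Return (DSC, JSC) for two equal-length binary masks given as flat 0/1 sequences."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same length")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    pred_size = sum(1 for p in pred if p)
    truth_size = sum(1 for t in truth if t)
    union = pred_size + truth_size - intersection
    if union == 0:
        # Both masks are empty: treat as perfect agreement (a convention, not a rule).
        return 1.0, 1.0
    dsc = 2 * intersection / (pred_size + truth_size)
    jsc = intersection / union
    return dsc, jsc


# Toy masks, made up for illustration (not data from the paper).
pred = [1, 1, 1, 0, 0, 0]
truth = [0, 1, 1, 1, 0, 0]
dsc, jsc = dice_and_jaccard(pred, truth)

# Scores copied from the abstract; the differences reproduce the stated gains.
unet = {"DSC": 0.8547, "JSC": 0.82}
ra_vnet = {"DSC": 0.9654, "JSC": 0.9414}
gains = {name: round(ra_vnet[name] - unet[name], 4) for name in unet}
print(dsc, jsc, gains)
```

The subtraction at the end confirms that the improvements stated in the abstract follow directly from the listed scores.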

Submitted 15 December, 2021; v1 submitted 15 December, 2021; originally announced December 2021.

arXiv:2111.15488 [pdf]

doi 10.1103/PhysRevLett.132.226003

High-Temperature Anomalous Metal States in Iron-Based Interface Superconductors

Authors: Yanan Li, Haiwen Liu, Haoran Ji, Chengcheng Ji, Shichao Qi, Xiaotong Jiao, Wenfeng Dong, Yi Sun, Wenhao Zhang, Zihan Cui, Minghu Pan, Nitin Samarth, Lili Wang, X. C. Xie, Qi-Kun Xue, Yi Liu, Jian Wang

Abstract: The nature of the anomalous metal state has been a major puzzle in condensed matter physics for more than three decades. Here, we report systematic investigation and modulation of the anomalous metal states in high-temperature interface superconductor FeSe films on SrTiO3 substrate. Remarkably, under zero magnetic field, the anomalous metal state persists up to 20 K in pristine FeSe films, an exceptionally high temperature standing out from previous observations. In stark contrast, for the FeSe films with nano-hole arrays, the characteristic temperature of the anomalous metal state is considerably reduced. We demonstrate that the observed anomalous metal states originate from the quantum tunneling of vortices adjusted by the Ohmic dissipation. Our work offers a perspective for understanding the origin and modulation of the anomalous metal states in two-dimensional bosonic systems.

Submitted 4 June, 2024; v1 submitted 30 November, 2021; originally announced November 2021.

Journal ref: Physical Review Letters 132, 226003 (2024)

arXiv:2111.14007 [pdf, other]

An Entropy Weighted Nonnegative Matrix Factorization Algorithm for Feature Representation

Authors: Jiao Wei, Can Tong, Bingxue Wu, Qiang He, Shouliang Qi, Yudong Yao, Yueyang Teng

Abstract: Nonnegative matrix factorization (NMF) has been widely used to learn low-dimensional representations of data. However, NMF pays the same attention to all attributes of a data point, which inevitably leads to inaccurate representation. For example, in a human-face data set, if an image contains a hat on the head, the hat should be removed or the importance of its corresponding attributes should be decreased during matrix factorizing. This paper proposes a new type of NMF called entropy weighted NMF (EWNMF), which uses an optimizable weight for each attribute of each data point to emphasize their importance. This process is achieved by adding an entropy regularizer to the cost function and then using the Lagrange multiplier method to solve the problem. Experimental results with several data sets demonstrate the feasibility and effectiveness of the proposed method. We make our code available at https://github.com/Poisson-EM/Entropy-weighted-NMF.

Submitted 27 November, 2021; originally announced November 2021.

Report number: 40020211125

arXiv:2110.07723 [pdf, other]

EMDS-7: Environmental Microorganism Image Dataset Seventh Version for Multiple Object Detection Evaluation

Authors: Hechen Yang, Chen Li, Xin Zhao, Bencheng Cai, Jiawei Zhang, **li Ma, Peng Zhao, Ao Chen, Tao Jiang, Hongzan Sun, Yueyang Teng, Shouliang Qi, Tao Jiang, Marcin Grzegorzek

Abstract: The Environmental Microorganism Image Dataset Seventh Version (EMDS-7) is a microscopic image data set that includes the original Environmental Microorganism (EM) images and the corresponding object labeling files in ".XML" format. The EMDS-7 data set consists of 41 types of EMs, with a total of 2365 images and 13216 labeled objects. The EMDS-7 database mainly focuses on object detection. To demonstrate the effectiveness of EMDS-7, we select the most commonly used deep learning methods (Faster-RCNN, YOLOv3, YOLOv4, SSD and RetinaNet) and evaluation indices for testing and evaluation. EMDS-7 is freely published for non-commercial purposes at: https://figshare.com/articles/dataset/EMDS-7_DataSet/16869571

Submitted 28 October, 2021; v1 submitted 10 October, 2021; originally announced October 2021.

arXiv:2110.06531 [pdf, ps, other]

doi 10.1103/PhysRevA.105.022624

Generation of Bell and GHZ states from a hybrid qubit-photon-magnon system

Authors: Shi-fan Qi, Jun **g

Abstract: We propose a level-resolved protocol in a hybrid architecture for connecting a superconducting qubit and a magnon mode contained within a microwave cavity (resonator) to generate the local and global entangled states, enabling a wide range of applications in quantum communication, quantum metrology, and quantum information processing. Exploiting the high-degree of controllability in such a hybrid qubit-photon-magnon system, we derive effective Hamiltonians at the second- or the third-order resonant points by virtue of the strong counter-rotating interactions between the resonator and the qubit and between the resonator and the magnon. Consequently, we can efficiently generate the Bell states of the photon-magnon and the qubit-magnon subsystems and the Greenberger-Horne-Zeilinger state of the whole hybrid system. We also check the robustness of our protocol against the environmental noise by the Lindblad master equation. Our work makes this hybrid platform of high-degree of controllability a high-fidelity candidate for the realization of the maximally-entangled multiple states.
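As a reminder of what the target states in this abstract look like, here is a small sketch that writes down a Bell state and a three-party Greenberger-Horne-Zeilinger (GHZ) state as amplitude vectors and checks that one subsystem is maximally mixed. It assumes each subsystem (qubit, photon, magnon) is truncated to two levels, which is a simplification made for illustration; the helper names `ghz_state` and `first_subsystem_purity` are hypothetical, and nothing here reproduces the paper's protocol or Hamiltonians.

```python
import math


def ghz_state(n):
    """Amplitudes of (|0...0> + |1...1>)/sqrt(2) for n two-level systems; n=2 is a Bell state."""
    amps = [0.0] * (2 ** n)
    amps[0] = amps[-1] = 1 / math.sqrt(2)
    return amps


def first_subsystem_purity(amps):
    """Purity Tr(rho_A^2) of the first two-level subsystem, for real amplitudes."""
    half = len(amps) // 2
    # Row index: state of the first subsystem; column index: state of the rest.
    rows = [amps[:half], amps[half:]]
    # Reduced density matrix rho_A[i][j] = sum_k rows[i][k] * rows[j][k].
    rho = [[sum(a * b for a, b in zip(rows[i], rows[j])) for j in range(2)]
           for i in range(2)]
    return sum(rho[i][j] * rho[j][i] for i in range(2) for j in range(2))


bell = ghz_state(2)            # (|00> + |11>)/sqrt(2)
ghz = ghz_state(3)             # (|000> + |111>)/sqrt(2)
product = [1.0, 0.0, 0.0, 0.0]  # |00>, an unentangled reference state

print(first_subsystem_purity(bell))
print(first_subsystem_purity(ghz))
print(first_subsystem_purity(product))
```

A purity of 1/2 for a two-level subsystem means it is maximally mixed, which is the signature of maximal entanglement with the rest of the system; the product state gives purity 1 for comparison.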

Submitted 13 October, 2021; originally announced October 2021.

Journal ref: Phys. Rev. A.105.022624 (2022)

Showing 51–100 of 190 results for author: Qi, S