-
Few-shot Joint Multimodal Aspect-Sentiment Analysis Based on Generative Multimodal Prompt
Authors:
Xiaocui Yang,
Shi Feng,
Daling Wang,
Sun Qi,
Wenfang Wu,
Yifei Zhang,
Pengfei Hong,
Soujanya Poria
Abstract:
We have witnessed the rapid proliferation of multimodal data on numerous social media platforms. Conventional studies typically require massive labeled data to train models for Multimodal Aspect-Based Sentiment Analysis (MABSA). However, collecting and annotating fine-grained multimodal data for MABSA is tough. To alleviate the above issue, we perform three MABSA-related tasks with quite a small number of labeled multimodal samples. We first build diverse and comprehensive multimodal few-shot datasets according to the data distribution. To capture the specific prompt for each aspect term in a few-shot scenario, we propose a novel Generative Multimodal Prompt (GMP) model for MABSA, which includes the Multimodal Encoder module and the N-Stream Decoders module. We further introduce a subtask to predict the number of aspect terms in each instance to construct the multimodal prompt. Extensive experiments on two datasets demonstrate that our approach outperforms strong baselines on two MABSA-related tasks in the few-shot setting.
Submitted 18 May, 2023; v1 submitted 17 May, 2023;
originally announced May 2023.
-
BadCS: A Backdoor Attack Framework for Code search
Authors:
Shiyi Qi,
Yuanhang Yang,
Shuzhzeng Gao,
Cuiyun Gao,
Zenglin Xu
Abstract:
With the development of deep learning (DL), DL-based code search models have achieved state-of-the-art performance and have been widely used by developers during software development. However, the security issue, e.g., recommending vulnerable code, has not received sufficient attention, which will bring potential harm to software development. Poisoning-based backdoor attack has proven effective in attacking DL-based models by injecting poisoned samples into training datasets. However, previous work shows that the attack technique does not perform successfully on all DL-based code search models and tends to fail for Transformer-based models, especially pretrained models. Besides, the infected models generally perform worse than benign models, which makes the attack not stealthy enough and thereby hinders the adoption by developers. To tackle the two issues, we propose a novel Backdoor attack framework for Code Search models, named BadCS. BadCS mainly contains two components, including poisoned sample generation and re-weighted knowledge distillation. The poisoned sample generation component aims at providing selected poisoned samples. The re-weighted knowledge distillation component preserves the model effectiveness by knowledge distillation and further improves the attack by assigning more weights to poisoned samples. Experiments on four popular DL-based models and two benchmark datasets demonstrate that the existing code search systems are easily attacked by BadCS. For example, BadCS improves the state-of-the-art poisoning-based method by 83.03%-99.98% and 75.98%-99.90% on Python and Java datasets, respectively. Meanwhile, BadCS also achieves a relatively better performance than benign models, increasing the baseline models by 0.49% and 0.46% on average, respectively.
Submitted 9 May, 2023;
originally announced May 2023.
-
NL-CS Net: Deep Learning with Non-Local Prior for Image Compressive Sensing
Authors:
Shuai Bian,
Shouliang Qi,
Chen Li,
Yudong Yao,
Yueyang Teng
Abstract:
Deep learning has been applied to compressive sensing (CS) of images successfully in recent years. However, existing network-based methods are often trained as the black box, in which the lack of prior knowledge is often the bottleneck for further performance improvement. To overcome this drawback, this paper proposes a novel CS method using non-local prior which combines the interpretability of the traditional optimization methods with the speed of network-based methods, called NL-CS Net. We unroll each phase from iteration of the augmented Lagrangian method solving non-local and sparse regularized optimization problem by a network. NL-CS Net is composed of the up-sampling module and the recovery module. In the up-sampling module, we use learnable up-sampling matrix instead of a predefined one. In the recovery module, patch-wise non-local network is employed to capture long-range feature correspondences. Important parameters involved (e.g., sampling matrix, nonlinear transforms, shrinkage thresholds, step size, etc.) are learned end-to-end, rather than hand-crafted. Furthermore, to facilitate practical implementation, orthogonal and binary constraints on the sampling matrix are simultaneously adopted. Extensive experiments on natural images and magnetic resonance imaging (MRI) demonstrate that the proposed method outperforms the state-of-the-art methods while maintaining great interpretability and speed.
Submitted 5 May, 2023;
originally announced May 2023.
-
Automated Formation Control Synthesis from Temporal Logic Specifications
Authors:
Shuhao Qi,
Zengjie Zhang,
Sofie Haesaert,
Zhiyong Sun
Abstract:
In many practical scenarios, multi-robot systems are envisioned to support humans in executing complicated tasks within structured environments, such as search-and-rescue tasks. We propose a framework for a multi-robot swarm to fulfill complex tasks represented by temporal logic specifications. Given temporal logic specifications on the swarm formation and navigation, we develop a controller with runtime safety and convergence guarantees that drive the swarm to formally satisfy the specification. In addition, the synthesized controller will autonomously switch formations as necessary and react to uncontrollable events from the environment. The efficacy of the proposed framework is validated with a simulation study on the navigation of multiple quadrotor robots.
Submitted 15 September, 2023; v1 submitted 1 April, 2023;
originally announced April 2023.
-
SVDE: Scalable Value-Decomposition Exploration for Cooperative Multi-Agent Reinforcement Learning
Authors:
Shuhan Qi,
Shuhao Zhang,
Qiang Wang,
Jiajia Zhang,
Jing Xiao,
Xuan Wang
Abstract:
Value-decomposition methods, which reduce the difficulty of a multi-agent system by decomposing the joint state-action space into local observation-action spaces, have become popular in cooperative multi-agent reinforcement learning (MARL). However, value-decomposition methods still have the problems of tremendous sample consumption for training and lack of active exploration. In this paper, we propose a scalable value-decomposition exploration (SVDE) method, which includes a scalable training mechanism, intrinsic reward design, and explorative experience replay. The scalable training mechanism asynchronously decouples strategy learning with environmental interaction, so as to accelerate sample generation in a MapReduce manner. For the problem of lack of exploration, an intrinsic reward design and explorative experience replay are proposed, so as to enhance exploration to produce diverse samples and filter non-novel samples, respectively. Empirically, our method achieves the best performance on almost all maps compared to other popular algorithms in a set of StarCraft II micromanagement games. A data-efficiency experiment also shows the acceleration of SVDE for sample collection and policy convergence, and we demonstrate the effectiveness of factors in SVDE through a set of ablation experiments.
Submitted 15 March, 2023;
originally announced March 2023.
-
MOELA: A Multi-Objective Evolutionary/Learning Design Space Exploration Framework for 3D Heterogeneous Manycore Platforms
Authors:
Sirui Qi,
Yingheng Li,
Sudeep Pasricha,
Ryan Gary Kim
Abstract:
To enable emerging applications such as deep machine learning and graph processing, 3D network-on-chip (NoC) enabled heterogeneous manycore platforms that can integrate many processing elements (PEs) are needed. However, designing such complex systems with multiple objectives can be challenging due to the huge associated design space and long evaluation times. To optimize such systems, we propose a new multi-objective design space exploration framework called MOELA that combines the benefits of evolutionary-based search with a learning-based local search to quickly determine PE and communication link placement to optimize multiple objectives (e.g., latency, throughput, and energy) in 3D NoC enabled heterogeneous manycore systems. Compared to state-of-the-art approaches, MOELA increases the speed of finding solutions by up to 128x, leads to a better Pareto Hypervolume (PHV) by up to 12.14x and improves energy-delay-product (EDP) by up to 7.7% in a 5-objective scenario.
Submitted 10 March, 2023;
originally announced March 2023.
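The MOELA abstract above reports results in terms of Pareto Hypervolume (PHV). As a rough illustration of what that metric measures, the sketch below computes the hypervolume of a two-objective minimization front relative to a reference point. The function name, the sample points, and the restriction to two objectives are assumptions made for illustration; the paper's 5-objective setting and its actual implementation are not described in the abstract.

```python
def hypervolume_2d(points, reference):
    """Area dominated by `points` and bounded by `reference` (both objectives minimized).

    Illustrative helper, not taken from the MOELA paper.
    """
    ref_x, ref_y = reference
    # Keep only points that are strictly better than the reference in both objectives.
    candidates = sorted(p for p in points if p[0] < ref_x and p[1] < ref_y)
    area = 0.0
    best_y = ref_y
    for x, y in candidates:
        # A point adds area only if it improves on the best second objective so far.
        if y < best_y:
            area += (ref_x - x) * (best_y - y)
            best_y = y
    return area


# Hypothetical front of (latency, energy) pairs and a hypothetical reference point.
front = [(1.0, 3.0), (2.0, 2.0), (3.0, 1.0)]
print(hypervolume_2d(front, reference=(4.0, 4.0)))
```

A larger hypervolume means the front covers more of the objective space below the reference point, which is why it is commonly used to compare design space exploration methods.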
-
Searching for Compact Object Candidates from LAMOST Time-Domain Survey of Four K2 Plates
Authors:
Senyu Qi,
Wei-Min Gu,
Tuan Yi,
Zhi-Xiang Zhang,
Song Wang,
Jifeng Liu
Abstract:
The time-domain (TD) surveys of the Large Sky Area Multi-Object Fiber Spectroscopic Telescope (LAMOST) yield high-cadence radial velocities, paving a new avenue to study binary systems including compact objects. In this work, we explore LAMOST TD spectroscopic data of four K2 plates and present a sample of six single-lined spectroscopic binaries that may contain compact objects. We conduct analyses using phase-resolved radial velocity measurements of the visible star, to characterize each source and to infer the properties of invisible companion. By fitting the radial velocity curves for the six targets, we obtain accurate orbital periods, ranging from $\sim$ (0.6-6) days, and radial velocity semi-amplitudes, ranging from $\sim$ (50-130) km s$^{-1}$. We calculate the mass function of the unseen companions to be between 0.08 and 0.17 $M_{\odot}$. Based on the mass function and the estimated stellar parameters of the visible star, we determine the minimum mass of the hidden star. Three targets, J034813, J063350, and J064850, show ellipsoidal variability in the light curves from K2, ZTF, and TESS surveys. Therefore, we can put constraints on the mass of the invisible star using the ellipsoidal variability. We identify no X-ray counterparts for these targets except for J085120, of which the X-ray emission can be ascribed to stellar activity. We note that the nature of these six candidates is worth further characterization utilizing multi-wavelength follow-up observations.
Submitted 10 March, 2023;
originally announced March 2023.
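The LAMOST abstract above derives a mass function from the fitted orbital period and radial velocity semi-amplitude, then a minimum mass for the unseen companion. A minimal sketch of that standard calculation follows, using the textbook relation f(M) = P K^3 (1 - e^2)^{3/2} / (2 pi G) = M2^3 sin^3(i) / (M1 + M2)^2. The function names and the input values are illustrative assumptions, not figures from the paper, and the minimum mass is obtained by setting sin(i) = 1.

```python
import math

GM_SUN = 1.32712440018e20  # solar gravitational parameter G*M_sun, m^3 s^-2
SECONDS_PER_DAY = 86400.0


def mass_function(period_days, k_km_s, ecc=0.0):
    """Binary mass function in solar masses from period (days) and semi-amplitude (km/s)."""
    period_s = period_days * SECONDS_PER_DAY
    k_m_s = k_km_s * 1.0e3
    return period_s * k_m_s**3 * (1.0 - ecc**2) ** 1.5 / (2.0 * math.pi * GM_SUN)


def min_companion_mass(f_m, m_visible, tol=1e-10):
    """Solve m2^3 / (m1 + m2)^2 = f_m for m2 (edge-on orbit, sin i = 1) by bisection."""
    def lhs(m2):
        return m2**3 / (m_visible + m2) ** 2

    lo, hi = 0.0, 1.0
    while lhs(hi) < f_m:  # expand the bracket; lhs is increasing in m2
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if lhs(mid) < f_m:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical inputs: P = 1 day, K = 100 km/s, visible star of 0.8 solar masses.
f_m = mass_function(1.0, 100.0)
print(f_m, min_companion_mass(f_m, 0.8))
```

Because the inclination is generally unknown, the value returned by `min_companion_mass` is only a lower bound on the companion mass.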
-
MiddleNet: A Unified, High-Performance NFV and Middlebox Framework with eBPF and DPDK
Authors:
Shixiong Qi,
Ziteng Zeng,
Leslie Monis,
K. K. Ramakrishnan
Abstract:
Traditional network resident functions (e.g., firewalls, network address translation) and middleboxes (caches, load balancers) have moved from purpose-built appliances to software-based components. However, L2/L3 network functions (NFs) are being implemented on Network Function Virtualization (NFV) platforms that extensively exploit kernel-bypass technology. They often use DPDK for zero-copy delivery and high performance. On the other hand, L4/L7 middleboxes, which have a greater emphasis on functionality, take advantage of a full-fledged kernel-based system.
L2/L3 NFs and L4/L7 middleboxes continue to be handled by distinct platforms on different nodes. This paper proposes MiddleNet that develops a unified network resident function framework that supports L2/L3 NFs and L4/L7 middleboxes. MiddleNet supports function chains that are essential in both NFV and middlebox environments. MiddleNet uses the Data Plane Development Kit (DPDK) library for zero-copy packet delivery without interrupt-based processing, to enable the "bump-in-the-wire" L2/L3 processing performance required of NFV. To support L4/L7 middlebox functionality, MiddleNet utilizes a consolidated, kernel-based protocol stack for processing, avoiding a dedicated protocol stack for each function. MiddleNet fully exploits the event-driven capabilities of the extended Berkeley Packet Filter (eBPF) and seamlessly integrates it with shared memory for high-performance communication in L4/L7 middlebox function chains. The overheads for MiddleNet in L4/L7 are strictly load-proportional, without needing the dedicated CPU cores of DPDK-based approaches. MiddleNet supports flow-dependent packet processing by leveraging Single Root I/O Virtualization (SR-IOV) to dynamically select the packet processing needed (Layers 2 - 7). Our experimental results show that MiddleNet achieves high performance in such a unified environment.
Submitted 30 March, 2023; v1 submitted 8 March, 2023;
originally announced March 2023.
-
Indescribable Multi-modal Spatial Evaluator
Authors:
Lingke Kong,
X. Sharon Qi,
Qijin Shen,
Jiacheng Wang,
Jingyi Zhang,
Yanle Hu,
Qichao Zhou
Abstract:
Multi-modal image registration spatially aligns two images with different distributions. One of its major challenges is that images acquired from different imaging machines have different imaging distributions, making it difficult to focus only on the spatial aspect of the images and ignore differences in distributions. In this study, we developed a self-supervised approach, Indescribable Multi-modal Spatial Evaluator (IMSE), to address multi-modal image registration. IMSE creates an accurate multi-modal spatial evaluator to measure spatial differences between two images, and then optimizes registration by minimizing the error predicted by the evaluator. To optimize IMSE performance, we also proposed a new style enhancement method called Shuffle Remap which randomizes the image distribution into multiple segments, and then randomly disorders and remaps these segments, so that the distribution of the original image is changed. Shuffle Remap can help IMSE to predict the difference in spatial location from unseen target distributions. Our results show that IMSE outperformed the existing methods for registration using T1-T2 and CT-MRI datasets. IMSE also can be easily integrated into the traditional registration process, and can provide a convenient way to evaluate and visualize registration results. IMSE also has the potential to be used as a new paradigm for image-to-image translation. Our code is available at https://github.com/Kid-Liet/IMSE.
Submitted 1 March, 2023; v1 submitted 1 March, 2023;
originally announced March 2023.
-
Theoretical Evaluation of the Capacity-Achieving Distribution for IM-DD Fiber-Optic Channels
Authors:
Dongdong Zou,
Wei Wang,
Sui Qi,
Fan Li,
Zhaohui Li
Abstract:
The capacity and capacity-achieving distribution for intensity-modulation and direct-detection (IM-DD) fiber-optic channels are theoretically investigated. Different from coherent fiber-optic channels, we indicate that the capacity-achieving distribution of IM-DD systems should be discussed separately in two cases: 1) IM-DD systems without optical amplifier, which are constrained in peak power; 2) IM-DD systems with optical amplifier, which are the average power constraint (APC) system. For the two models, the maximum mutual information achieving distribution, instead of the maximum input entropy achieving distribution, is numerically computed by the iterative Blahut-Arimoto (BA) algorithm. For the IM-DD system under peak power constraint (PPC), a dynamic-assignment BA algorithm is applied to find the capacity-achieving distribution with minimum cardinality. It is observed that the maximum difference between the minimum input cardinality and capacity is around 0.8 bits. For a fixed support input cardinality, although the observed shaping gain is small and only appears in low peak-signal-to-noise ratio (PSNR) regions in the PPC IM-DD system, the probabilistic shaping technique can also be used to introduce rate adaptation to the system by adjusting the shaping and FEC overheads since the capacity-achieving distribution is symmetric. In the IM-DD system under APC, a modified BA algorithm is investigated to solve for the capacity and capacity-achieving distribution, and a significant shaping gain is observed. For PAM8 and PAM16 modulation formats, 0.294 bits/symbol and 0.531 bits/symbol shaping gain can be obtained at the SNR of 20dB. Furthermore, since the capacity-achieving distribution is asymmetric in this case, a practical discussion of the PS technique is also presented.
Submitted 23 February, 2023;
originally announced February 2023.
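The IM-DD abstract above relies on the iterative Blahut-Arimoto (BA) algorithm to find the input distribution that maximizes mutual information. The sketch below shows the textbook, unconstrained BA iteration for a discrete memoryless channel. It does not include the peak-power or average-power constraints, the dynamic-assignment variant, or the modified BA algorithm the abstract mentions; the function name and the binary symmetric channel used as input are assumptions chosen only to make the example checkable.

```python
import math


def blahut_arimoto(channel, tol=1e-9, max_iter=10_000):
    """Capacity (bits) and optimal input distribution of a discrete memoryless channel.

    channel[x][y] is the transition probability p(y | x); each row must sum to 1.
    """
    n_in, n_out = len(channel), len(channel[0])
    p = [1.0 / n_in] * n_in  # start from the uniform input distribution
    lower = 0.0
    for _ in range(max_iter):
        # Output distribution induced by the current input distribution.
        q = [sum(p[x] * channel[x][y] for x in range(n_in)) for y in range(n_out)]
        # d[x] = KL divergence (nats) between row x of the channel and q.
        d = [
            sum(
                channel[x][y] * math.log(channel[x][y] / q[y])
                for y in range(n_out)
                if channel[x][y] > 0.0
            )
            for x in range(n_in)
        ]
        weights = [p[x] * math.exp(d[x]) for x in range(n_in)]
        total = sum(weights)
        lower = math.log(total)  # lower bound on capacity (nats)
        upper = max(d)           # upper bound on capacity (nats)
        p = [w / total for w in weights]
        if upper - lower < tol:
            break
    return lower / math.log(2.0), p


# Binary symmetric channel with crossover probability 0.1 (illustrative input).
capacity, p_opt = blahut_arimoto([[0.9, 0.1], [0.1, 0.9]])
print(capacity, p_opt)
```

The gap between the two bounds gives a natural stopping rule; constrained variants typically add a Lagrange multiplier term to the exponent in the weight update.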
-
Representing Noisy Image Without Denoising
Authors:
Shuren Qi,
Yushu Zhang,
Chao Wang,
Tao Xiang,
Xiaochun Cao,
Yong Xiang
Abstract:
A long-standing topic in artificial intelligence is the effective recognition of patterns from noisy images. In this regard, the recent data-driven paradigm considers 1) improving the representation robustness by adding noisy samples in training phase (i.e., data augmentation) or 2) pre-processing the noisy image by learning to solve the inverse problem (i.e., image denoising). However, such methods generally exhibit inefficient processing and unstable results, limiting their practical applications. In this paper, we explore a non-learning paradigm that aims to derive robust representation directly from noisy images, without denoising as pre-processing. Here, the noise-robust representation is designed as Fractional-order Moments in Radon space (FMR), which also has the beneficial properties of orthogonality and rotation invariance. Unlike earlier integer-order methods, our work is a more generic design taking such classical methods as special cases, and the introduced fractional-order parameter offers time-frequency analysis capability that is not available in classical methods. Formally, both implicit and explicit paths for constructing the FMR are discussed in detail. Extensive simulation experiments and an image security application are provided to demonstrate the uniqueness and usefulness of our FMR, especially for noise robustness, rotation invariance, and time-frequency discriminability.
Submitted 19 June, 2024; v1 submitted 18 January, 2023;
originally announced January 2023.
-
Learning Personalized Brain Functional Connectivity of MDD Patients from Multiple Sites via Federated Bayesian Networks
Authors:
Shuai Liu,
Xiao Guo,
Shun Qi,
Huaning Wang,
Xiangyu Chang
Abstract:
Identifying functional connectivity biomarkers of major depressive disorder (MDD) patients is essential to advance understanding of the disorder mechanisms and early intervention. However, due to the small sample size and the high dimension of available neuroimaging data, the performance of existing methods is often limited. Multi-site data could enhance the statistical power and sample size, while they are often subject to inter-site heterogeneity and data-sharing policies. In this paper, we propose a federated joint estimator, NOTEARS-PFL, for simultaneous learning of multiple Bayesian networks (BNs) with continuous optimization, to identify disease-induced alterations in MDD patients. We incorporate information shared between sites and site-specific information into the proposed federated learning framework to learn personalized BN structures by introducing the group fused lasso penalty. We develop the alternating direction method of multipliers, where in the local update step, the neuroimaging data is processed at each local site. Then the learned network structures are transmitted to the center for the global update. In particular, we derive a closed-form expression for the local update step and use the iterative proximal projection method to deal with the group fused lasso penalty in the global update step. We evaluate the performance of the proposed method on both synthetic and real-world multi-site rs-fMRI datasets. The results suggest that the proposed NOTEARS-PFL yields better effectiveness and accuracy than the comparable methods.
Submitted 6 January, 2023;
originally announced January 2023.
-
Generating entangled states from coherent states in circuit-QED
Authors:
Shi-fan Qi,
Jun Jing
Abstract:
Entangled states are self-evidently important to a wide range of applications in quantum communication and quantum information processing. We propose an efficient and convenient two-step protocol for generating Bell states and NOON states of two microwave resonators from merely coherent states. In particular, we derive an effective Hamiltonian for resonators coupled to a superconducting $Λ$-type qutrit in the dispersive regime. By the excitation-number-dependent Stark shifts of the qutrit transition frequencies, we are able to individually control the amplitudes of specified Fock states of the resonators associated with relevant qutrit transition, using carefully tailored microwave drive signals. Thereby an arbitrary bipartite entangled state in Fock space can be generated by a typical evolution-and-measurement procedure. We analyze the undesired state transitions and the robustness of our protocol against the systematic errors from the microwave driving intensity and frequency, the quantum decoherence of all components, and the crosstalk of two resonators. In addition, we demonstrate that our protocol can be extended to a similar scenario with a $Ξ$-type qutrit.
Submitted 10 April, 2023; v1 submitted 29 December, 2022;
originally announced December 2022.
-
Two-stage Contextual Transformer-based Convolutional Neural Network for Airway Extraction from CT Images
Authors:
Yanan Wu,
Shuiqing Zhao,
Shouliang Qi,
Jie Feng,
Haowen Pang,
Runsheng Chang,
Long Bai,
Mengqi Li,
Shuyue Xia,
Wei Qian,
Hongliang Ren
Abstract:
Accurate airway extraction from computed tomography (CT) images is a critical step for planning navigation bronchoscopy and quantitative assessment of airway-related chronic obstructive pulmonary disease (COPD). The existing methods are challenging to sufficiently segment the airway, especially the high-generation airway, with the constraint of the limited label and cannot meet the clinical use in COPD. We propose a novel two-stage 3D contextual transformer-based U-Net for airway segmentation using CT images. The method consists of two stages, performing initial and refined airway segmentation. The two-stage model shares the same subnetwork with different airway masks as input. Contextual transformer block is performed both in the encoder and decoder path of the subnetwork to finish high-quality airway segmentation effectively. In the first stage, the total airway mask and CT images are provided to the subnetwork, and the intrapulmonary airway mask and corresponding CT scans to the subnetwork in the second stage. Then the predictions of the two-stage method are merged as the final prediction. Extensive experiments were performed on in-house and multiple public datasets. Quantitative and qualitative analysis demonstrate that our proposed method extracted much more branches and lengths of the tree while accomplishing state-of-the-art airway segmentation performance. The code is available at https://github.com/zhaozsq/airway_segmentation.
Submitted 15 December, 2022;
originally announced December 2022.
-
BEV-MAE: Bird's Eye View Masked Autoencoders for Point Cloud Pre-training in Autonomous Driving Scenarios
Authors:
Zhiwei Lin,
Yongtao Wang,
Shengxiang Qi,
Nan Dong,
Ming-Hsuan Yang
Abstract:
Existing LiDAR-based 3D object detection methods for autonomous driving scenarios mainly adopt the training-from-scratch paradigm. Unfortunately, this paradigm heavily relies on large-scale labeled data, whose collection can be expensive and time-consuming. Self-supervised pre-training is an effective and desirable way to alleviate this dependence on extensive annotated data. In this work, we present BEV-MAE, an efficient masked autoencoder pre-training framework for LiDAR-based 3D object detection in autonomous driving. Specifically, we propose a bird's eye view (BEV) guided masking strategy to guide the 3D encoder learning feature representation in a BEV perspective and avoid complex decoder design during pre-training. Furthermore, we introduce a learnable point token to maintain a consistent receptive field size of the 3D encoder with fine-tuning for masked point cloud inputs. Based on the property of outdoor point clouds in autonomous driving scenarios, i.e., the point clouds of distant objects are more sparse, we propose point density prediction to enable the 3D encoder to learn location information, which is essential for object detection. Experimental results show that BEV-MAE surpasses prior state-of-the-art self-supervised methods and achieves favorable pre-training efficiency. Furthermore, based on TransFusion-L, BEV-MAE achieves new state-of-the-art LiDAR-based 3D object detection results, with 73.6 NDS and 69.6 mAP on the nuScenes benchmark. The source code will be released at https://github.com/VDIGPKU/BEV-MAE
Submitted 20 January, 2024; v1 submitted 12 December, 2022;
originally announced December 2022.
-
EBHI-Seg: A Novel Enteroscope Biopsy Histopathological Haematoxylin and Eosin Image Dataset for Image Segmentation Tasks
Authors:
Liyu Shi,
Xiaoyan Li,
Weiming Hu,
Haoyuan Chen,
Jing Chen,
Zizhen Fan,
Minghe Gao,
Yujie Jing,
Guotao Lu,
Deguo Ma,
Zhiyu Ma,
Qingtao Meng,
Dechao Tang,
Hongzan Sun,
Marcin Grzegorzek,
Shouliang Qi,
Yueyang Teng,
Chen Li
Abstract:
Background and Purpose: Colorectal cancer is a common fatal malignancy, the fourth most common cancer in men, and the third most common cancer in women worldwide. Timely detection of cancer in its early stages is essential for treating the disease. Currently, there is a lack of datasets for histopathological image segmentation of rectal cancer, which often hampers the assessment accuracy when computer technology is used to aid in diagnosis. Methods: This present study provided a new publicly available Enteroscope Biopsy Histopathological Hematoxylin and Eosin Image Dataset for Image Segmentation Tasks (EBHI-Seg). To demonstrate the validity and extensiveness of EBHI-Seg, the experimental results for EBHI-Seg are evaluated using classical machine learning methods and deep learning methods. Results: The experimental results showed that deep learning methods had a better image segmentation performance when utilizing EBHI-Seg. The maximum accuracy of the Dice evaluation metric for the classical machine learning method is 0.948, while the Dice evaluation metric for the deep learning method is 0.965. Conclusion: This publicly available dataset contained 5,170 images of six types of tumor differentiation stages and the corresponding ground truth images. The dataset can provide researchers with new segmentation algorithms for medical diagnosis of colorectal cancer, which can be used in the clinical setting to help doctors and patients.
Submitted 6 December, 2022; v1 submitted 1 December, 2022;
originally announced December 2022.
-
An Efficient Split Fine-tuning Framework for Edge and Cloud Collaborative Learning
Authors:
Shaohuai Shi,
Qing Yang,
Yang Xiang,
Shuhan Qi,
Xuan Wang
Abstract:
To enable the pre-trained models to be fine-tuned with local data on edge devices without sharing data with the cloud, we design an efficient split fine-tuning (SFT) framework for edge and cloud collaborative learning. We propose three novel techniques in this framework. First, we propose a matrix decomposition-based method to compress the intermediate output of a neural network to reduce the communication volume between the edge device and the cloud server. Second, we eliminate particular links in the model without affecting the convergence performance in fine-tuning. Third, we implement our system atop PyTorch to allow users to easily extend their existing training scripts to enjoy the efficient edge and cloud collaborative learning. Experimental results on 9 NLP datasets show that our framework can reduce the communication traffic by 96 times with little impact on the model accuracy.
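The abstract does not say which matrix decomposition is used. Purely as an illustration of why a decomposition can cut communication volume, the sketch below assumes a rank-r factorization of an m×n intermediate output into an m×r and an r×n factor, and counts the values that would be transmitted in each case. The shapes and the rank are hypothetical, and the helper names are not from the paper.

```python
def dense_cost(m, n):
    """Number of values sent when the full m x n intermediate output is transmitted."""
    return m * n


def low_rank_cost(m, n, r):
    """Number of values sent when an m x r factor and an r x n factor are transmitted instead."""
    return r * (m + n)


def compression_ratio(m, n, r):
    """How many times fewer values the factorized form needs (assumed rank-r factorization)."""
    return dense_cost(m, n) / low_rank_cost(m, n, r)


# Hypothetical shapes: 512 tokens, hidden size 768, rank 8.
m, n, r = 512, 768, 8
print(dense_cost(m, n), low_rank_cost(m, n, r), compression_ratio(m, n, r))
```

The ratio grows as the rank shrinks, so the practical question in such a scheme is how small the rank can be before fine-tuning accuracy degrades.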
Submitted 29 November, 2022;
originally announced November 2022.
-
Interlayer ferromagnetism and insulator-metal transition in element-doped CrI3 thin films
Authors:
Shiyang Sun,
Xuyan Chen,
Xuqi Li,
Huihui Zhang,
Haidan Sang,
Shifei Qi,
Zhenhua Qiao
Abstract:
The exploration of magnetism in two-dimensional layered materials has attracted extensive research interest. For the monoclinic phase CrI3 with interlayer antiferromagnetism, finding a static and robust way of realizing the intrinsic interlayer ferromagnetic coupling is desirable. In this Letter, we study the electronic structure and magnetic properties of the nonmagnetic element (e.g., O, S, Se, N, P, As and C) doped bi- and triple-layer CrI3 systems via first-principles calculations. Our results demonstrate that O, P, S, As, and Se doped CrI3 bilayer can realize interlayer ferromagnetism. Further analysis shows that the interlayer ferromagnetic coupling in the doped few-layer CrI3 is closely related to the formation of localized spin-polarized state. This finding indicates that insulated interlayer ferromagnetism can be realized at high doping concentration (larger than 8.33%). When the doping concentration is less than 8.33%, but larger than 2.08%, an insulator-metal phase transition can occur since the localized spin-polarized states percolate to form contiguous grids in few-layer CrI3.
Submitted 29 November, 2022;
originally announced November 2022.
-
The activated scaling behavior of quantum Griffiths singularity in two-dimensional superconductors
Authors:
Zihan Cui,
Longxin Pan,
Jingchao Fang,
Shichao Qi,
Ying Xing,
Haiwen Liu,
Yi Liu,
Jian Wang
Abstract:
Quantum Griffiths singularity is characterized by the divergence of the dynamical critical exponent with the activated scaling law and has been widely observed in various two-dimensional superconductors. Recently, the direct activated scaling analysis with the irrelevant correction has been proposed and successfully used to analyze the experimental data of crystalline PdTe2 and polycrystalline β-W films, which provides new evidence of quantum Griffiths singularity. Here we show that the direct activated scaling analysis is applicable to the experimental data in different superconducting films, including tri-layer Ga films and LaAlO3/SrTiO3 interface superconductor. When taking the irrelevant correction into account, we calculate the corrected sheet resistance at ultralow temperatures. The scaling behavior of the corrected resistance in a comparably large temperature regime and the theoretical fitting of the phase boundary give unambiguous evidence of quantum Griffiths singularity. Compared to the previous method based on the finite size scaling, the direct activated scaling analysis represents a more direct and precise way to analyze the experimental data of quantum Griffiths singularity in diverse two-dimensional superconductors.
Submitted 20 November, 2022;
originally announced November 2022.
-
Dual Class-Aware Contrastive Federated Semi-Supervised Learning
Authors:
Qi Guo,
Yong Qi,
Saiyu Qi,
Di Wu
Abstract:
Federated semi-supervised learning (FSSL) facilitates labeled clients and unlabeled clients jointly training a global model without sharing private data. Existing FSSL methods predominantly employ pseudo-labeling and consistency regularization to exploit the knowledge of unlabeled data, achieving notable success in raw data utilization. However, these training processes are hindered by large deviations between uploaded local models of labeled and unlabeled clients, as well as confirmation bias introduced by noisy pseudo-labels, both of which negatively affect the global model's performance. In this paper, we present a novel FSSL method called Dual Class-aware Contrastive Federated Semi-Supervised Learning (DCCFSSL). This method accounts for both the local class-aware distribution of each client's data and the global class-aware distribution of all clients' data within the feature space. By implementing a dual class-aware contrastive module, DCCFSSL establishes a unified training objective for different clients to tackle large deviations and incorporates contrastive information in the feature space to mitigate confirmation bias. Moreover, DCCFSSL introduces an authentication-reweighted aggregation technique to improve the server's aggregation robustness. Our comprehensive experiments show that DCCFSSL outperforms current state-of-the-art methods on three benchmark datasets and surpasses the FedAvg with relabeled unlabeled clients on CIFAR-10, CIFAR-100, and STL-10 datasets. To our knowledge, we are the first to present an FSSL method that utilizes only 10% labeled clients, while still achieving superior performance compared to standard federated supervised learning, which uses all clients with labeled data.
Submitted 7 May, 2023; v1 submitted 16 November, 2022;
originally announced November 2022.
-
Rotational symmetry breaking in superconducting nickelate Nd0.8Sr0.2NiO2 films
Authors:
Haoran Ji,
Yanan Li,
Yi Liu,
Xiang Ding,
Zheyuan Xie,
Shichao Qi,
Liang Qiao,
Yi-feng Yang,
Guang-Ming Zhang,
Jian Wang
Abstract:
The infinite-layer nickelates, isostructural to the high-Tc superconductor cuprates, have risen as a promising platform to host unconventional superconductivity and stimulated growing interest in the condensed matter community. Despite numerous studies, the superconducting pairing symmetry of the nickelate superconductors, the fundamental characteristic of a superconducting state, is still under debate. Moreover, the strong electronic correlation in the nickelates may give rise to a rich phase diagram, where the underlying interplay between the superconductivity and other emerging quantum states with broken symmetry is awaiting exploration. Here, we study the angular dependence of the transport properties on the infinite-layer nickelate Nd0.8Sr0.2NiO2 superconducting films with Corbino-disk configuration. The azimuthal angular dependence of the magnetoresistance (R(φ)) manifests the rotational symmetry breaking from isotropy to four-fold (C4) anisotropy with increasing magnetic field, revealing a symmetry breaking phase transition. Approaching the low temperature and large magnetic field regime, an additional two-fold (C2) symmetric component in the R(φ) curves and an anomalous upturn of the temperature-dependent critical field are observed simultaneously, suggesting the emergence of an exotic electronic phase. Our work uncovers the evolution of the quantum states with different rotational symmetries and provides deep insight into the global phase diagram of the nickelate superconductors.
Submitted 30 October, 2022;
originally announced October 2022.
-
Listening to Users' Voice: Automatic Summarization of Helpful App Reviews
Authors:
Cuiyun Gao,
Yaoxian Li,
Shuhan Qi,
Yang Liu,
Xuan Wang,
Zibin Zheng,
Qing Liao
Abstract:
App reviews are crowdsourcing knowledge of user experience with the apps, providing valuable information for app release planning, such as major bugs to fix and important features to add. There exist prior explorations on app review mining for release planning; however, most of the studies strongly rely on pre-defined classes or manually-annotated reviews. Also, the new review characteristic, i.e., the number of users who rated the review as helpful, which can help capture important reviews, has not been considered previously.
In this paper, we propose a novel framework, named SOLAR, aiming at accurately summarizing helpful user reviews for developers. The framework mainly contains three modules: the review helpfulness prediction module, the topic-sentiment modeling module, and the multi-factor ranking module. The review helpfulness prediction module assesses the helpfulness of reviews, i.e., whether the review is useful for developers. The topic-sentiment modeling module groups the topics of the helpful reviews and also predicts the associated sentiment, and the multi-factor ranking module aims at prioritizing semantically representative reviews for each topic as the review summary. Experiments on five popular apps indicate that SOLAR is effective for review summarization and promising for facilitating app release planning.
Submitted 12 October, 2022;
originally announced October 2022.
-
Once is Enough: A Light-Weight Cross-Attention for Fast Sentence Pair Modeling
Authors:
Yuanhang Yang,
Shiyi Qi,
Chuanyi Liu,
Qifan Wang,
Cuiyun Gao,
Zenglin Xu
Abstract:
Transformer-based models have achieved great success on sentence pair modeling tasks, such as answer selection and natural language inference (NLI). These models generally perform cross-attention over input pairs, leading to prohibitive computational costs. Recent studies propose dual-encoder and late interaction architectures for faster computation. However, the balance between the expressiveness of cross-attention and computation speedup still needs to be better coordinated. To this end, this paper introduces a novel paradigm MixEncoder for efficient sentence pair modeling. MixEncoder involves a light-weight cross-attention mechanism. It conducts query encoding only once while modeling the query-candidate interaction in parallel. Extensive experiments conducted on four tasks demonstrate that our MixEncoder can speed up sentence pairing by over 113x while achieving comparable performance to the more expensive cross-attention models.
Submitted 22 October, 2023; v1 submitted 11 October, 2022;
originally announced October 2022.
-
The Nearest Neutron Star Candidate in a Binary Revealed by Optical Time-domain Surveys
Authors:
Ling-Lin Zheng,
Mouyuan Sun,
Wei-Min Gu,
Tuan Yi,
Zhi-Xiang Zhang,
Pei Wang,
Junfeng Wang,
Jianfeng Wu,
Shan-Shan Weng,
Song Wang,
Sen-Yu Qi,
Jia Zhang,
Chun-Qian Li,
Jian-Rong Shi,
Yong Shao,
Xiang-Dong Li,
Jin-Bo Fu,
Fan Yang,
Zhongrui Bai,
Yu Bai,
Haotong Zhang,
Jifeng Liu
Abstract:
The near-Earth (within $\sim 100$ pc) supernova explosions in the past several million years can cause the global deposition of radioactive elements (e.g., $^{60}$Fe) on Earth. The remnants of such supernovae are too old to be easily identified. It is therefore of great interest to search for million-year-old near-Earth neutron stars or black holes, the products of supernovae. However, neutron stars and black holes are challenging to find even in our Solar neighbourhood if they are not radio pulsars or X-ray/$γ$-ray emitters. Here we report the discovery of one of the nearest ($127.7 \pm 0.3$ pc) neutron star candidates in a detached single-lined spectroscopic binary LAMOST J235456.73+335625.9 (hereafter J2354). Utilizing the time-resolved ground-based spectroscopy and space photometry, we find that J2354 hosts an unseen compact object with $M_{\mathrm{inv}}$ being $1.4 \sim 1.6\ M_{\odot}$. The follow-up Swift ultraviolet (UV) and X-ray observations suggest that the UV and X-ray emission is produced by the visible star rather than the compact object. Hence, J2354 probably harbours a neutron star rather than a hot ultramassive white dwarf. Two-hour exceptionally sensitive radio follow-up observations with Five-hundred-meter Aperture Spherical radio Telescope fail to reveal any pulsating radio signals at the $6σ$ flux upper limit of $12.5\ μ\mathrm{Jy}$. Therefore, the neutron star candidate in J2354 can only be revealed via our time-resolved observations. Interestingly, the distance between J2354 and our Earth can be as close as $\sim 50$ pc around $2.5$ Myrs ago, as revealed by the Gaia kinematics. Our discovery demonstrates a promising way to unveil the hidden near-Earth neutron stars in binaries by exploring the optical time domain, thereby facilitating understanding of the metal-enrichment history in our Solar neighbourhood.
Submitted 27 November, 2023; v1 submitted 7 October, 2022;
originally announced October 2022.
-
Intrinsic motivation, Need for cognition, Grit, Growth Mindset and Academic Achievement in High School Students: Latent Profiles and Its Predictive Effects
Authors:
Jun Wu,
Shuoli Qi,
Yueshan Zhong
Abstract:
Recent efforts to identify non-cognitive predictors of academic achievement have especially focused on self-constructs, whose measurement is concerned with a specific domain (e.g., mathematics). However, other important factors, such as character and motivation, have received less attention. Additionally, the predictive accuracy of non-cognitive factors lacks evidence from subjects including English and Science. In this study, we take a person-centered approach and focus on students' intrinsic motivation, need for cognition, grit, and growth mindset. We mainly focus on how these factors predict students' mathematics, English, and science grades between 9th grade and 12th grade. The sample comprised 2,308 high school students in Boston (Female = 1,237; aged from 13 to 17). The research results indicated that: (1) four latent profiles of students emerged: High in grit students (n = 997, 43.2%, higher scores of grit); Moderate students (n = 905, 38.3%, moderate in all scores); High in intrinsic motivation students (n = 252, 11.8%, higher scores of intrinsic motivation); Low in grit students (n = 154, 6.7%, lower scores of grit); (2) students' gender, race, maternal education level, and social-economic ranking predicted the profiles; and (3) four profiles of students had a significant predictive effect on Mathematics, Science and English scores in both 9th grade and 12th grade. We discussed the importance of character education for adolescents and motivation for learning in high school.
Submitted 10 October, 2022;
originally announced October 2022.
-
A White Dwarf-Main Sequence Binary Unveiled by Time-Domain Observations from LAMOST and TESS
Authors:
Ling-Lin Zheng,
Wei-Min Gu,
Mouyuan Sun,
Zhixiang Zhang,
Tuan Yi,
Jianfeng Wu,
Junfeng Wang,
Jin-Bo Fu,
Sen-Yu Qi,
Fan Yang,
Song Wang,
Liang Wang,
Zhongrui Bai,
Haotong Zhang,
Chun-Qian Li,
Jian-Rong Shi,
Weikai Zong,
Yu Bai,
Jifeng Liu
Abstract:
We report a single-lined white dwarf-main sequence binary system, LAMOST J172900.17+652952.8, which is discovered by LAMOST's medium resolution time-domain surveys. The radial velocity semi-amplitude and orbital period of the optical visible star are measured by using the Palomar 200-inch telescope follow-up observations and the light curves from TESS. Thus the mass function of the invisible candidate white dwarf is derived, $f(M_{\rm{2}}) = 0.120\,\pm\,0.003\,M_{\odot}$. The mass of the visible star is measured based on the spectral energy distribution fitting, $M_{\mathrm{1}}$ = $0.81^{+0.07}_{-0.06}\,M_{\odot}$. Hence, the mass of its invisible companion is $M_{\rm{2}}\,\gtrsim\,0.63\,M_{\odot}$. The companion ought to be a compact object rather than a main-sequence star owing to the mass ratio $q = M_{\rm{2}} / M_{\rm 1} \gtrsim 0.78$ and the single-lined spectra. The compact object is likely to be a white dwarf except for small inclination angle, $i\,\lesssim\,40^{\circ}$. By using the GALEX NUV flux, the effective temperature of the white dwarf candidate is constrained as $T_{\rm eff}^{\rm WD}\,\lesssim\,12000-13500$ K. It is difficult to detect white dwarfs which are outshone by their bright companions via single-epoch optical spectroscopic surveys. Therefore, the optical time-domain surveys can play an important role in unveiling invisible white dwarfs and other compact objects in binaries.
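The step from the mass function to the lower limit on the companion mass can be checked numerically. The sketch below assumes the standard binary mass function, f(M2) = (M2 sin i)^3 / (M1 + M2)^2, and solves it by bisection for an edge-on orbit (i = 90 degrees), which gives the minimum M2. It uses only the central values quoted in the abstract and ignores their uncertainties; the function names are illustrative.

```python
import math


def mass_function(m1, m2, incl_deg=90.0):
    """Standard binary mass function in solar masses: (M2 sin i)^3 / (M1 + M2)^2."""
    s = math.sin(math.radians(incl_deg))
    return (m2 * s) ** 3 / (m1 + m2) ** 2


def companion_mass(f, m1, incl_deg=90.0, lo=0.0, hi=100.0, tol=1e-9):
    """Solve mass_function(m1, M2, i) = f for M2 by bisection.

    The mass function increases monotonically with M2, so bisection converges.
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mass_function(m1, mid, incl_deg) < f:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Central values from the abstract: f(M2) = 0.120 Msun, M1 = 0.81 Msun.
m2_min = companion_mass(0.120, 0.81)
q_min = m2_min / 0.81
print(round(m2_min, 2), round(q_min, 2))
```

With these inputs the minimum companion mass comes out near 0.63 solar masses and the minimum mass ratio near 0.78, consistent with the limits quoted above. Smaller inclinations can be explored through the `incl_deg` argument and give larger companion masses.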
Submitted 28 September, 2022;
originally announced September 2022.
-
A dynamically discovered and characterized non-accreting neutron star -- M dwarf binary candidate
Authors:
Tuan Yi,
Wei-Min Gu,
Zhi-Xiang Zhang,
Ling-Lin Zheng,
Mouyuan Sun,
Junfeng Wang,
Zhongrui Bai,
Pei Wang,
Jianfeng Wu,
Yu Bai,
Song Wang,
Haotong Zhang,
Yize Dong,
Yong Shao,
Xiang-Dong Li,
Jia Zhang,
Yang Huang,
Fan Yang,
Qingzheng Yu,
Hui-Jun Mu,
Jin-Bo Fu,
Senyu Qi,
Jing Guo,
Xuan Fang,
Chuanjie Zheng
, et al. (4 additional authors not shown)
Abstract:
Optical time-domain surveys can unveil and characterize exciting but less-explored non-accreting and/or non-beaming neutron stars (NS) in binaries. Here we report the discovery of such a NS candidate using the LAMOST spectroscopic survey. The candidate, designated LAMOST J112306.9+400736 (hereafter J1123), is in a single-lined spectroscopic binary containing an optically visible M star. The star's large radial velocity variation and ellipsoidal variations indicate a relatively massive unseen companion. Utilizing follow-up spectroscopy from the Palomar 200-inch telescope and high-precision photometry from TESS, we measure a companion mass of $1.24_{-0.03}^{+0.03}~M_{\odot}$. Main-sequence stars with this mass are ruled out, leaving a NS or a massive white dwarf (WD). Although a massive WD cannot be ruled out, the lack of UV excess radiation from the companion supports the NS hypothesis. Deep radio observations with FAST yielded no detections of either pulsed or persistent emission. J1123 is not detected in numerous X-ray and gamma-ray surveys. These non-detections suggest that the NS candidate is not presently accreting and pulsing. Our work exemplifies the capability of discovering compact objects in non-accreting close binaries by synergizing the optical time-domain spectroscopy and high-cadence photometry.
Submitted 25 September, 2022;
originally announced September 2022.
-
FedMCSA: Personalized Federated Learning via Model Components Self-Attention
Authors:
Qi Guo,
Yong Qi,
Saiyu Qi,
Di Wu,
Qian Li
Abstract:
Federated learning (FL) facilitates multiple clients to jointly train a machine learning model without sharing their private data. However, Non-IID data of clients presents a tough challenge for FL. Existing personalized FL approaches rely heavily on the default treatment of one complete model as a basic unit and ignore the significance of different layers on Non-IID data of clients. In this work, we propose a new framework, federated model components self-attention (FedMCSA), to handle Non-IID data in FL, which employs model components self-attention mechanism to granularly promote cooperation between different clients. This mechanism facilitates collaboration between similar model components while reducing interference between model components with large differences. We conduct extensive experiments to demonstrate that FedMCSA outperforms the previous methods on four benchmark datasets. Furthermore, we empirically show the effectiveness of the model components self-attention mechanism, which is complementary to existing personalized FL and can significantly improve the performance of FL.
Submitted 23 August, 2022;
originally announced August 2022.
-
Floquet generation of magnonic NOON state
Authors:
Shi-fan Qi,
Jun Jing
Abstract:
We propose a concise and deterministic protocol to generate NOON states in a hybrid system consisting of a superconducting qubit, a circuit resonator mode, and two magnonic modes, based on Floquet engineering. In particular, we construct a time-reversal-symmetry broken Hamiltonian for chiral state propagation of the three continuous-variable modes depending on qubit state, by the time modulation over qubit-resonator interaction and magnon frequency. Then an arbitrary magnonic NOON state can be generated by a typical preparing-and-measurement procedure. We analyze the robustness of our protocol against the systematic errors in the qubit-magnon coupling strength, the Floquet-driving intensity, the frequency mismatch of the magnons, and the counter-rotating interactions. We can obtain a high-fidelity NOON state in the presence of the quantum dissipation on all components.
Submitted 4 January, 2023; v1 submitted 4 August, 2022;
originally announced August 2022.
-
Shrinking the Semantic Gap: Spatial Pooling of Local Moment Invariants for Copy-Move Forgery Detection
Authors:
Chao Wang,
Zhiqiu Huang,
Shuren Qi,
Yaoshen Yu,
Guohua Shen,
Yushu Zhang
Abstract:
Copy-move forgery is a manipulation of copying and pasting specific patches from and to an image, with potentially illegal or unethical uses. Recent advances in the forensic methods for copy-move forgery have shown increasing success in detection accuracy and robustness. However, for images with high self-similarity or strong signal corruption, the existing algorithms often exhibit inefficient processes and unreliable results. This is mainly due to the inherent semantic gap between low-level visual representation and high-level semantic concept. In this paper, we present a first study that tries to mitigate the semantic gap problem in copy-move forgery detection, with spatial pooling of local moment invariants for midlevel image representation. Our detection method expands the traditional works in two aspects: 1) we introduce the bag-of-visual-words model into this field for the first time, which may offer a new perspective for forensic study; 2) we propose a word-to-phrase feature description and matching pipeline, covering the spatial structure and visual saliency information of digital images. Extensive experimental results show the superior performance of our framework over state-of-the-art algorithms in overcoming the related problems caused by the semantic gap.
Submitted 17 January, 2023; v1 submitted 19 July, 2022;
originally announced July 2022.
-
Searching for compact objects in binaries with Gaia DR3
Authors:
Jin-Bo Fu,
Wei-Min Gu,
Zhi-Xiang Zhang,
Tuan Yi,
Sen-Yu Qi,
Ling-Lin Zheng,
Jifeng Liu
Abstract:
We search for compact objects in binaries based on Gaia DR3. A sample of ten targets is derived under the conditions: radial velocity variable, low temperature ($T_{\rm eff} < 6000$ K), high mass function ($f(M_2) > 1 M_\odot$), and ellipsoidal-like light curves. Two targets have LAMOST spectroscopic observations, one of which is a double-lined spectroscopic binary. The observational data of seven targets are not self-consistent, since their photometric periods are even shorter than the theoretical minimum orbital periods calculated by the stellar parameters from Gaia DR3. After excluding these seven inconsistent targets and another target contaminated by a near-bright star, the remaining two targets may contain compact objects worth follow-up observations. This work may serve as an example to demonstrate the feasibility of searching for compact objects in the massive Gaia data.
Submitted 18 October, 2022; v1 submitted 12 July, 2022;
originally announced July 2022.
-
A Closer Look into Transformer-Based Code Intelligence Through Code Transformation: Challenges and Opportunities
Authors:
Yaoxian Li,
Shiyi Qi,
Cuiyun Gao,
Yun Peng,
David Lo,
Zenglin Xu,
Michael R. Lyu
Abstract:
Transformer-based models have demonstrated state-of-the-art performance in many intelligent coding tasks such as code comment generation and code completion. Previous studies show that deep learning models are sensitive to the input variations, but few studies have systematically studied the robustness of Transformer under perturbed input code. In this work, we empirically study the effect of semantic-preserving code transformation on the performance of Transformer. Specifically, 24 and 27 code transformation strategies are implemented for two popular programming languages, Java and Python, respectively. For facilitating analysis, the strategies are grouped into five categories: block transformation, insertion/deletion transformation, grammatical statement transformation, grammatical token transformation, and identifier transformation. Experiments on three popular code intelligence tasks, including code completion, code summarization and code search, demonstrate insertion/deletion transformation and identifier transformation show the greatest impact on the performance of Transformer. Our results also suggest that Transformer based on abstract syntax trees (ASTs) shows more robust performance than the model based on only code sequence under most code transformations. Besides, the design of positional encoding can impact the robustness of Transformer under code transformation. Based on our findings, we distill some insights about the challenges and opportunities for Transformer-based code intelligence.
Submitted 9 July, 2022;
originally announced July 2022.
-
Dynamic Contrastive Distillation for Image-Text Retrieval
Authors:
Jun Rao,
Liang Ding,
Shuhan Qi,
Meng Fang,
Yang Liu,
Li Shen,
Dacheng Tao
Abstract:
Although the vision-and-language pretraining (VLP) equipped cross-modal image-text retrieval (ITR) has achieved remarkable progress in the past two years, it suffers from a major drawback: the ever-increasing size of VLP models restricts its deployment to real-world search scenarios (where the high latency is unacceptable). To alleviate this problem, we present a novel plug-in dynamic contrastive distillation (DCD) framework to compress the large VLP models for the ITR task. Technically, we face the following two challenges: 1) the typical uni-modal metric learning approach is difficult to apply directly to cross-modal tasks, because limited GPU memory prevents optimizing over many negative samples when handling cross-modal fusion features; 2) it is inefficient to statically optimize the student network on hard samples of differing difficulty, which have different effects on distillation learning and student network optimization. We address these challenges in two ways. First, to achieve multi-modal contrastive learning and balance training cost against effect, we propose to use a teacher network to estimate the difficult samples for students, so that the students absorb the powerful knowledge of pre-trained teachers and master the knowledge from hard samples. Second, to learn dynamically from hard sample pairs, we propose dynamic distillation, which learns from samples of different difficulties so as to better balance the difficulty of knowledge against the students' self-learning ability. We successfully apply our proposed DCD strategy to two state-of-the-art vision-language pretrained models, i.e., ViLT and METER. Extensive experiments on MS-COCO and Flickr30K benchmarks show the effectiveness and efficiency of our DCD framework. Encouragingly, we can speed up inference by at least 129$\times$ compared to the existing ITR models.
Submitted 4 July, 2022;
originally announced July 2022.
-
Chiral current in Floquet cavity-magnonics
Authors:
Shi-fan Qi,
Jun Jing
Abstract:
Floquet engineering can induce complex collective behaviour and interesting synthetic gauge-field in quantum systems through temporal modulation of system parameters by periodic drives. Using a Floquet drive on frequencies of the magnon modes, we realize a chiral state-transfer in a cavity-magnonic system. The time-reversal symmetry is broken in such a promising platform for coherent information processing. In particular, the photon mode is adiabatically eliminated in the large-detuning regime and the magnon modes under conditional longitudinal drives can be indirectly coupled to each other with a phase-modulated interaction. The effective Hamiltonian is then used to generate chiral currents in a circular loop, whose dynamics is evaluated to measure the symmetry of the system Hamiltonian. Beyond the dynamics limited in the manifold with a fixed number of excitations, our protocol applies to the continuous-variable systems with arbitrary states. Also it is found to be robust against the systematic errors in the photon-magnon coupling strength and Kerr nonlinearity.
Submitted 10 October, 2022; v1 submitted 19 June, 2022;
originally announced June 2022.
-
Parameter-Efficient and Student-Friendly Knowledge Distillation
Authors:
Jun Rao,
Xv Meng,
Liang Ding,
Shuhan Qi,
Dacheng Tao
Abstract:
Knowledge distillation (KD) has been extensively employed to transfer the knowledge from a large teacher model to the smaller students, where the parameters of the teacher are fixed (or partially) during training. Recent studies show that this mode may cause difficulties in knowledge transfer due to the mismatched model capacities. To alleviate the mismatch problem, teacher-student joint training methods, e.g., online distillation, have been proposed, but it always requires expensive computational cost. In this paper, we present a parameter-efficient and student-friendly knowledge distillation method, namely PESF-KD, to achieve efficient and sufficient knowledge transfer by updating relatively few partial parameters. Technically, we first mathematically formulate the mismatch as the sharpness gap between their predictive distributions, where we show such a gap can be narrowed with the appropriate smoothness of the soft label. Then, we introduce an adapter module for the teacher and only update the adapter to obtain soft labels with appropriate smoothness. Experiments on a variety of benchmarks show that PESF-KD can significantly reduce the training cost while obtaining competitive results compared to advanced online distillation methods. Code will be released upon acceptance.
Submitted 28 May, 2022;
originally announced May 2022.
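
The abstract above ties the teacher-student mismatch to how smooth the teacher's soft labels are. As a rough illustration only, the sketch below uses the conventional temperature-scaled softmax from standard knowledge distillation to show how a single knob changes soft-label smoothness. It is not the PESF-KD adapter described in the abstract; the function names, logits, and temperatures are hypothetical.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; higher temperature gives smoother labels."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    """Shannon entropy in nats, used here as a simple smoothness measure."""
    return -sum(p * math.log(p) for p in probs if p > 0)

if __name__ == "__main__":
    teacher_logits = [6.0, 2.0, 1.0]  # hypothetical teacher outputs
    for temp in (1.0, 4.0):
        soft = softmax(teacher_logits, temp)
        print(f"T={temp}: soft labels={soft}, entropy={entropy(soft):.3f}")
```

Raising the temperature flattens the distribution (higher entropy), which is the sense of "smoothness" assumed in this sketch.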
-
Dynamically Relative Position Encoding-Based Transformer for Automatic Code Edit
Authors:
Shiyi Qi,
Yaoxian Li,
Cuiyun Gao,
Xiaohong Su,
Shuzheng Gao,
Zibin Zheng,
Chuanyi Liu
Abstract:
Adapting Deep Learning (DL) techniques to automate non-trivial coding activities, such as code documentation and defect detection, has been intensively studied recently. Learning to predict code changes is one of the popular and essential investigations. Prior studies have shown that DL techniques such as Neural Machine Translation (NMT) can benefit meaningful code changes, including bug fixing and code refactoring. However, NMT models may encounter bottleneck when modeling long sequences, thus are limited in accurately predicting code changes. In this work, we design a Transformer-based approach, considering that Transformer has proven effective in capturing long-term dependencies. Specifically, we propose a novel model named DTrans. For better incorporating the local structure of code, i.e., statement-level information in this paper, DTrans is designed with dynamically relative position encoding in the multi-head attention of Transformer. Experiments on benchmark datasets demonstrate that DTrans can more accurately generate patches than the state-of-the-art methods, increasing the performance by at least 5.45\%-46.57\% in terms of the exact match metric on different datasets. Moreover, DTrans can locate the lines to change with 1.75\%-24.21\% higher accuracy than the existing methods.
Submitted 31 July, 2022; v1 submitted 26 May, 2022;
originally announced May 2022.
-
Efficient Distributed Framework for Collaborative Multi-Agent Reinforcement Learning
Authors:
Shuhan Qi,
Shuhao Zhang,
Xiaohan Hou,
Jiajia Zhang,
Xuan Wang,
Jing Xiao
Abstract:
Multi-agent reinforcement learning for incomplete information environments has attracted extensive attention from researchers. However, due to slow sample collection and poor sample exploration, there are still some problems in multi-agent reinforcement learning, such as unstable model iteration and low training efficiency. Moreover, most existing distributed frameworks are proposed for single-agent reinforcement learning and are not suitable for the multi-agent setting. In this paper, we design a distributed MARL framework based on the actor-work-learner architecture. In this framework, multiple asynchronous environment interaction modules can be deployed simultaneously, which greatly improves the sample collection speed and sample diversity. Meanwhile, to make full use of computing resources, we decouple the model iteration from environment interaction, and thus accelerate the policy iteration. Finally, we verify the effectiveness of the proposed framework in the MaCA military simulation environment and the SMAC 3D real-time strategy gaming environment with incomplete information characteristics.
Submitted 10 May, 2022;
originally announced May 2022.
-
Detecting Recolored Image by Spatial Correlation
Authors:
Yushu Zhang,
Nuo Chen,
Shuren Qi,
Mingfu Xue,
Xiaochun Cao
Abstract:
Image forensics, aiming to ensure the authenticity of the image, has made great progress in dealing with common image manipulation such as copy-move, splicing, and inpainting in the past decades. However, only a few researchers pay attention to an emerging editing technique called image recoloring, which can manipulate the color values of an image to give it a new style. To prevent it from being used maliciously, previous approaches address conventional recoloring from the perspective of inter-channel correlation and illumination consistency. In this paper, we explore a solution from the perspective of spatial correlation, which exhibits a generic detection capability for both conventional and deep learning-based recoloring. Through theoretical and numerical analysis, we find that the recoloring operation will inevitably destroy the spatial correlation between pixels, implying a new prior of statistical discriminability. Based on this fact, we generate a set of spatial correlation features and learn the informative representation from the set via a convolutional neural network. To train our network, we use three recoloring methods to generate a large-scale and high-quality data set. Extensive experimental results in two recoloring scenes demonstrate that the spatial correlation features are highly discriminative. Our method achieves state-of-the-art detection accuracy on multiple benchmark datasets and generalizes well to unknown types of recoloring methods.
Submitted 22 April, 2022;
originally announced April 2022.
-
Subspace Nonnegative Matrix Factorization for Feature Representation
Authors:
Junhang Li,
Jiao Wei,
Can Tong,
Tingting Shen,
Yuchen Liu,
Chen Li,
Shouliang Qi,
Yudong Yao,
Yueyang Teng
Abstract:
Traditional nonnegative matrix factorization (NMF) learns a new feature representation on the whole data space, which means treating all features equally. However, a subspace is often sufficient for accurate representation in practical applications, and redundant features can be invalid or even harmful. For example, if a camera has some sensors destroyed, then the corresponding pixels in the photos from this camera are not helpful to identify the content, which means only the subspace consisting of remaining pixels is worthy of attention. This paper proposes a new NMF method by introducing adaptive weights to identify key features in the original space so that only a subspace involves generating the new representation. Two strategies are proposed to achieve this: the fuzzier weighted technique and entropy regularized weighted technique, both of which result in an iterative solution with a simple form. Experimental results on several real-world datasets demonstrated that the proposed methods can generate a more accurate feature representation than existing methods. The code developed in this study is available at https://github.com/WNMF1/FWNMF-ERWNMF.
Submitted 18 April, 2022;
originally announced April 2022.
-
Sustainable and effective antimicrobial surface based on cellulose thin films
Authors:
Shaojun Qi,
Ioannis Kiratzis,
Pavan Adoni,
Zania Stamataki,
Aneesa Nabi,
David Waugh,
Javier Rodriguez Rodriguez,
Stuart Clarke,
Peter J Fryer,
Zhenyu J Zhang
Abstract:
In the present work, we developed a sustainable and effective antimicrobial surface film based on Micro-Fibrillated Cellulose. The resulting porous cellulose thin film is barely noticeable to human eyes due to its sub-micron thickness, of which the coverage, porosity and microstructure can be modulated by the formulations developed. Using goniometers and a quartz crystal microbalance (QCM), we observed a threefold reduction in water contact angles and accelerated (more than 50% faster) water evaporation kinetics on the cellulose film. The thin film exhibits not only a rapid inactivation effect against SARS-CoV-2 in 5 minutes, following deposition of the virus-loaded droplets, but also an exceptional ability to reduce contact transfer of liquid, e.g. respiratory droplets, onto surfaces such as artificial skin by more than 90%. It also exhibits excellent antimicrobial performance in inhibiting the growth of both gram-negative and gram-positive bacteria (E. coli and S. epidermidis) due to the excellent porosity and hydrophilicity. Additionally, the cellulose film shows nearly 100% resistance to skin scraping in dry conditions thanks to its strong attachment to the substrate, whilst showing good removability once wetted, suggesting its practical suitability for daily use. Importantly, the coating can be formed on solid substrates readily by spraying and requires solely a simple formulation of a plant-based cellulose material with no additives, rendering it a scalable, affordable and green solution for antimicrobial surfaces. Implementing such cellulose films could thus play a significant role in controlling future pan- and epidemics, particularly during the first phase when appropriate medication needs to be developed.
Submitted 27 March, 2022;
originally announced March 2022.
-
Planckian Dissipation and non-Ginzburg-Landau Type Upper Critical Field in Bi2201
Authors:
Qihao Zang,
Zhengyan Zhu,
Zuyu Xu,
Shichao Qi,
Haoran Ji,
Yiwen Li,
Jian Wang,
Huiqian Luo,
Hua-Bing Wang,
Hai-Hu Wen
Abstract:
Resistivity and Hall effect measurements have been carried out on a micro-fabricated bridge of Bi2201 single crystal at low temperatures down to 0.4 K under high magnetic fields. When superconductivity is suppressed by a high magnetic field, the recovered "normal state" resistivity still shows a linear temperature dependence in the low-temperature region. Combining this with the effective mass and the charge carrier density, we get a linear scattering rate $1/\tau = \alpha k_{B} T/\hbar$ with $0.77<\alpha<1.16$, which gives strong evidence of Planckian dissipation. Furthermore, our results reveal a new type of temperature dependence of the upper critical field, $H_{c2}(T)=H^*\sqrt{(1-t)/(t+0.154)}$, which is totally different from the expectation of the Ginzburg-Landau theory, and suggests uncondensed Cooper pairs above the $H_{c2}(T)$ line.
Submitted 22 February, 2023; v1 submitted 14 March, 2022;
originally announced March 2022.
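
The two formulas quoted in the abstract above can be evaluated numerically. The following is a minimal sketch, not the authors' analysis code: it assumes that $t$ denotes the reduced temperature $T/T_c$ (the abstract does not define it), and the function names are hypothetical.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant in J/K (exact SI value)
HBAR = 1.054571817e-34  # reduced Planck constant in J*s

def planckian_rate(temperature_k, alpha=1.0):
    """Scattering rate 1/tau = alpha * k_B * T / hbar, in 1/s."""
    return alpha * K_B * temperature_k / HBAR

def hc2_over_hstar(t):
    """H_c2(T)/H* = sqrt((1 - t) / (t + 0.154)).

    Assumption: t is the reduced temperature T/T_c, restricted to [0, 1].
    """
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    return math.sqrt((1.0 - t) / (t + 0.154))

if __name__ == "__main__":
    # Bounds on alpha quoted in the abstract.
    for alpha in (0.77, 1.16):
        rate = planckian_rate(1.0, alpha)
        print(f"alpha={alpha}: 1/tau at 1 K = {rate:.3e} 1/s")
    for t in (0.0, 0.5, 1.0):
        print(f"t={t}: H_c2/H* = {hc2_over_hstar(t):.3f}")
```

Under this reading of $t$, the expression vanishes at $t=1$ and stays finite at $t=0$, where it equals $\sqrt{1/0.154}$.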
-
Where Does the Performance Improvement Come From? -- A Reproducibility Concern about Image-Text Retrieval
Authors:
Jun Rao,
Fei Wang,
Liang Ding,
Shuhan Qi,
Yibing Zhan,
Weifeng Liu,
Dacheng Tao
Abstract:
This article aims to provide the information retrieval community with some reflections on recent advances in retrieval learning by analyzing the reproducibility of image-text retrieval models. Due to the increase of multimodal data over the last decade, image-text retrieval has steadily become a major research direction in the field of information retrieval. Numerous researchers train and evaluate image-text retrieval algorithms using benchmark datasets such as MS-COCO and Flickr30k. Research in the past has mostly focused on performance, with multiple state-of-the-art methodologies being suggested in a variety of ways. According to their assertions, these techniques provide improved modality interactions and hence more precise multimodal representations. In contrast to previous works, we focus on the reproducibility of the approaches and the examination of the elements that lead to improved performance by pretrained and nonpretrained models in retrieving images and text. To be more specific, we first examine the related reproducibility concerns and explain why our focus is on image-text retrieval tasks. Second, we systematically summarize the current paradigm of image-text retrieval models and the stated contributions of those approaches. Third, we analyze various aspects of the reproduction of pretrained and nonpretrained retrieval models. To complete this, we conducted ablation experiments and obtained some influencing factors that affect retrieval recall more than the improvement claimed in the original paper. Finally, we present some reflections and challenges that the retrieval community should consider in the future. Our source code is publicly available at https://github.com/WangFei-2019/Image-text-Retrieval.
Submitted 27 August, 2022; v1 submitted 8 March, 2022;
originally announced March 2022.
-
A Principled Design of Image Representation: Towards Forensic Tasks
Authors:
Shuren Qi,
Yushu Zhang,
Chao Wang,
Jiantao Zhou,
Xiaochun Cao
Abstract:
Image forensics is a rising topic as the trustworthy multimedia content is critical for modern society. Like other vision-related applications, forensic analysis relies heavily on the proper image representation. Despite the importance, current theoretical understanding for such representation remains limited, with varying degrees of neglect for its key role. For this gap, we attempt to investigate the forensic-oriented image representation as a distinct problem, from the perspectives of theory, implementation, and application. Our work starts from the abstraction of basic principles that the representation for forensics should satisfy, especially revealing the criticality of robustness, interpretability, and coverage. At the theoretical level, we propose a new representation framework for forensics, called Dense Invariant Representation (DIR), which is characterized by stable description with mathematical guarantees. At the implementation level, the discrete calculation problems of DIR are discussed, and the corresponding accurate and fast solutions are designed with generic nature and constant complexity. We demonstrate the above arguments on the dense-domain pattern detection and matching experiments, providing comparison results with state-of-the-art descriptors. Also, at the application level, the proposed DIR is initially explored in passive and active forensics, namely copy-move forgery detection and perceptual hashing, exhibiting the benefits in fulfilling the requirements of such forensic tasks.
Submitted 6 October, 2022; v1 submitted 2 March, 2022;
originally announced March 2022.
-
Improving the Level of Autism Discrimination through GraphRNN Link Prediction
Authors:
Haonan Sun,
Qiang He,
Shouliang Qi,
Yudong Yao,
Yueyang Teng
Abstract:
Datasets are key to deep learning in autism research. However, because the samples in current datasets such as ABIDE (Autism Brain Imaging Data Exchange) are few and heterogeneous, recognition is not effective enough. Previous studies mostly focused on optimizing feature selection methods and data reinforcement to improve accuracy. This paper builds on the latter technique: it learns the edge distribution of real brain networks through GraphRNN and generates synthetic data that benefits the discriminant model. The experimental results show that combining original and synthetic data greatly improves the discrimination of the neural network. For instance, the most significant effect is obtained with the 50-layer ResNet, and the best generation model is GraphRNN, which improves the accuracy by 32.51% compared with the reference experiment without generated-data reinforcement. The generated data comes from the learned edge-connection distribution of the functional connectivity of autism patients and typical controls, yet it has a better effect than the original data, which has constructive significance for further understanding of the disease mechanism and development.
Submitted 19 February, 2022;
originally announced February 2022.
-
Accelerated adiabatic passage in cavity magnomechanics
Authors:
Shi-fan Qi,
Jun Jing
Abstract:
Cavity magnomechanics provides a readily controllable hybrid system, consisting of a cavity mode, a magnon mode, and a phonon mode, for quantum state manipulation. To implement a fast-and-robust state transfer between the hybrid photon-magnon mode and the phonon mode, we propose two accelerated adiabatic-passage protocols, based respectively on the counterdiabatic Hamiltonian for transitionless quantum driving and the Lewis-Riesenfeld invariant for inverse engineering. Both the counterdiabatic Hamiltonian and the Lewis-Riesenfeld invariant generally apply to continuous-variable systems with arbitrary target states. It is interesting to find that our counterdiabatic Hamiltonian can be constructed in terms of the creation and annihilation operators rather than the system eigenstates and their time derivatives. Our protocol can be optimized with respect to the stability against the systematic errors of coupling strength and frequency detuning. It contributes to a quantum memory for photonic and magnonic quantum information. We also discuss the effects of dissipation and the counter-rotating interactions.
Submitted 29 January, 2022;
originally announced January 2022.
-
RA V-Net: Deep learning network for automated liver segmentation
Authors:
Zhiqi Lee,
Sumin Qi,
Chongchong Fan,
Ziwei Xie
Abstract:
Accurate segmentation of the liver is a prerequisite for the diagnosis of disease. Automated segmentation is an important application of computer-aided detection and diagnosis of liver disease. In recent years, automated processing of medical images has achieved breakthroughs. However, the low contrast of abdominal CT images and the complexity of liver morphology make accurate automatic segmentation challenging. In this paper, we propose RA V-Net, an improved automatic medical image segmentation model based on U-Net. It has the following three main innovations. First, a CofRes Module (Composite Original Feature Residual Module) is proposed, whose more complex convolution layers and skip connections give it a higher level of image feature extraction capability and prevent gradient disappearance or explosion. Second, an AR Module (Attention Recovery Module) is proposed to reduce the computational effort of the model. In addition, the spatial features between the data pixels of the encoding and decoding modules are sensed by adjusting the channels and LSTM convolution, so that the image features are effectively retained. Third, a CA Module (Channel Attention Module) is introduced, which is used to extract relevant channels with dependencies and strengthen them by matrix dot product, while weakening irrelevant channels without dependencies, thereby achieving channel attention. The attention mechanism provided by LSTM convolution and the CA Module is a strong guarantee for the performance of the neural network. U-Net achieves accuracy 0.9862, precision 0.9118, DSC 0.8547, and JSC 0.82, whereas RA V-Net achieves accuracy 0.9968, precision 0.9597, DSC 0.9654, and JSC 0.9414. The most representative metric for the segmentation effect is DSC, which improves by 0.1107 over U-Net, and JSC improves by 0.1214.
Submitted 15 December, 2021; v1 submitted 15 December, 2021;
originally announced December 2021.
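
For readers unfamiliar with the two overlap metrics quoted in the abstract above, the sketch below computes the Dice similarity coefficient (DSC) and the Jaccard similarity coefficient (JSC) from their standard definitions, and re-derives the reported gains over U-Net by subtraction. The toy masks and function names are hypothetical; this is not the authors' evaluation code.

```python
def dice(pred, truth):
    """DSC = 2|A and B| / (|A| + |B|) for flat binary masks (lists of 0/1)."""
    inter = sum(p and t for p, t in zip(pred, truth))
    total = sum(pred) + sum(truth)
    return 2.0 * inter / total if total else 1.0

def jaccard(pred, truth):
    """JSC = |A and B| / |A or B| for flat binary masks (lists of 0/1)."""
    inter = sum(p and t for p, t in zip(pred, truth))
    union = sum(p or t for p, t in zip(pred, truth))
    return inter / union if union else 1.0

if __name__ == "__main__":
    # Hypothetical 6-pixel masks: 2 pixels overlap, 4 are in the union.
    pred = [1, 1, 1, 0, 0, 0]
    truth = [0, 1, 1, 1, 0, 0]
    print(f"DSC={dice(pred, truth):.4f}, JSC={jaccard(pred, truth):.4f}")

    # Gains reported in the abstract, recomputed from the quoted scores.
    print(f"DSC gain: {0.9654 - 0.8547:.4f}")
    print(f"JSC gain: {0.9414 - 0.82:.4f}")
```

Both metrics equal 1 for a perfect match and 0 for disjoint masks; DSC weights the overlap more generously than JSC, so it is never smaller than JSC for the same pair of masks.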
-
High-Temperature Anomalous Metal States in Iron-Based Interface Superconductors
Authors:
Yanan Li,
Haiwen Liu,
Haoran Ji,
Chengcheng Ji,
Shichao Qi,
Xiaotong Jiao,
Wenfeng Dong,
Yi Sun,
Wenhao Zhang,
Zihan Cui,
Minghu Pan,
Nitin Samarth,
Lili Wang,
X. C. Xie,
Qi-Kun Xue,
Yi Liu,
Jian Wang
Abstract:
The nature of the anomalous metal state has been a major puzzle in condensed matter physics for more than three decades. Here, we report systematic investigation and modulation of the anomalous metal states in high-temperature interface superconductor FeSe films on SrTiO3 substrate. Remarkably, under zero magnetic field, the anomalous metal state persists up to 20 K in pristine FeSe films, an exceptionally high temperature standing out from previous observations. In stark contrast, for the FeSe films with nano-hole arrays, the characteristic temperature of the anomalous metal state is considerably reduced. We demonstrate that the observed anomalous metal states originate from the quantum tunneling of vortices adjusted by the Ohmic dissipation. Our work offers a perspective for understanding the origin and modulation of the anomalous metal states in two-dimensional bosonic systems.
Submitted 4 June, 2024; v1 submitted 30 November, 2021;
originally announced November 2021.
-
An Entropy Weighted Nonnegative Matrix Factorization Algorithm for Feature Representation
Authors:
Jiao Wei,
Can Tong,
Bingxue Wu,
Qiang He,
Shouliang Qi,
Yudong Yao,
Yueyang Teng
Abstract:
Nonnegative matrix factorization (NMF) has been widely used to learn low-dimensional representations of data. However, NMF pays the same attention to all attributes of a data point, which inevitably leads to inaccurate representation. For example, in a human-face data set, if an image contains a hat on the head, the hat should be removed or the importance of its corresponding attributes should be decreased during matrix factorizing. This paper proposes a new type of NMF called entropy weighted NMF (EWNMF), which uses an optimizable weight for each attribute of each data point to emphasize their importance. This process is achieved by adding an entropy regularizer to the cost function and then using the Lagrange multiplier method to solve the problem. Experimental results with several data sets demonstrate the feasibility and effectiveness of the proposed method. We make our code available at https://github.com/Poisson-EM/Entropy-weighted-NMF.
Submitted 27 November, 2021;
originally announced November 2021.
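
The abstract above says only that the weights are obtained by adding an entropy regularizer and applying the Lagrange multiplier method. As a generic illustration of that mechanism, and not the EWNMF update rules themselves, the sketch below assumes a toy objective $\sum_i w_i e_i + \gamma \sum_i w_i \ln w_i$ with the constraint $\sum_i w_i = 1$, for which setting the Lagrangian's derivative to zero gives $w_i \propto \exp(-e_i/\gamma)$. The objective, the constraint, the parameter `gamma`, and the function name are all assumptions.

```python
import math

def entropy_weights(errors, gamma=1.0):
    """Closed-form minimizer of sum(w*e) + gamma*sum(w*ln w) with sum(w) = 1.

    The solution is w_i proportional to exp(-e_i / gamma): attributes with
    larger reconstruction error receive smaller weight.
    """
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    m = min(errors)  # shift by the minimum for numerical stability
    unnorm = [math.exp(-(e - m) / gamma) for e in errors]
    total = sum(unnorm)
    return [u / total for u in unnorm]

if __name__ == "__main__":
    errors = [0.1, 1.0, 5.0]  # hypothetical per-attribute errors
    for gamma in (0.5, 5.0):
        print(f"gamma={gamma}: weights={entropy_weights(errors, gamma)}")
```

In this toy setting a small `gamma` concentrates weight on the best-fit attribute, while a large `gamma` pushes the weights toward uniform.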
-
EMDS-7: Environmental Microorganism Image Dataset Seventh Version for Multiple Object Detection Evaluation
Authors:
Hechen Yang,
Chen Li,
Xin Zhao,
Bencheng Cai,
Jiawei Zhang,
Pingli Ma,
Peng Zhao,
Ao Chen,
Tao Jiang,
Hongzan Sun,
Yueyang Teng,
Shouliang Qi,
Tao Jiang,
Marcin Grzegorzek
Abstract:
The Environmental Microorganism Image Dataset Seventh Version (EMDS-7) is a microscopic image data set that includes the original Environmental Microorganism (EM) images and the corresponding object labeling files in ".XML" format. The EMDS-7 data set consists of 41 types of EMs, with a total of 2365 images and 13216 labeled objects. The EMDS-7 database mainly focuses on object detection. To demonstrate the effectiveness of EMDS-7, we select the most commonly used deep learning methods (Faster-RCNN, YOLOv3, YOLOv4, SSD and RetinaNet) and evaluation indices for testing and evaluation. EMDS-7 is freely published for non-commercial purposes at: https://figshare.com/articles/dataset/EMDS-7_DataSet/16869571
Submitted 28 October, 2021; v1 submitted 10 October, 2021;
originally announced October 2021.
-
Generation of Bell and GHZ states from a hybrid qubit-photon-magnon system
Authors:
Shi-fan Qi,
Jun Jing
Abstract:
We propose a level-resolved protocol in a hybrid architecture for connecting a superconducting qubit and a magnon mode contained within a microwave cavity (resonator) to generate the local and global entangled states, enabling a wide range of applications in quantum communication, quantum metrology, and quantum information processing. Exploiting the high degree of controllability in such a hybrid qubit-photon-magnon system, we derive effective Hamiltonians at the second- or the third-order resonant points by virtue of the strong counter-rotating interactions between the resonator and the qubit and between the resonator and the magnon. Consequently, we can efficiently generate the Bell states of the photon-magnon and the qubit-magnon subsystems and the Greenberger-Horne-Zeilinger state of the whole hybrid system. We also check the robustness of our protocol against environmental noise using the Lindblad master equation. Our work makes this highly controllable hybrid platform a high-fidelity candidate for the realization of maximally entangled multipartite states.
Submitted 13 October, 2021;
originally announced October 2021.
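For reference, the textbook forms of the entangled states named in the abstract above are given below. The subscripts $q$, $p$, and $m$ (qubit, photon, magnon) and the use of $|0\rangle$ and $|1\rangle$ as the two relevant levels of each subsystem are notational assumptions made here; the specific levels and relative phases produced by the paper's protocol may differ.

```latex
\begin{align}
  |\Phi^{\pm}\rangle_{pm} &= \frac{1}{\sqrt{2}}
    \left( |0\rangle_p |0\rangle_m \pm |1\rangle_p |1\rangle_m \right), \\
  |\Psi^{\pm}\rangle_{pm} &= \frac{1}{\sqrt{2}}
    \left( |0\rangle_p |1\rangle_m \pm |1\rangle_p |0\rangle_m \right), \\
  |\mathrm{GHZ}\rangle_{qpm} &= \frac{1}{\sqrt{2}}
    \left( |0\rangle_q |0\rangle_p |0\rangle_m
         + |1\rangle_q |1\rangle_p |1\rangle_m \right).
\end{align}
```

The qubit-magnon Bell states have the same form as the first two lines with the subscript $p$ replaced by $q$.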